Using R to locate spatial data points inside map polygons


I was looking into a paper called “Determinants of woody cover in African savannas” by Sankaran et al. (2005). The paper looks at the large scale environmental factors that affect percentage woodland cover in African savanna landscapes. One figure in particular that got me interested was this one:

Mean annual Precipitation vs. woody cover

I haven’t fully worked out the implications of this figure yet, but what stands out to me the most is that many plots in high rainfall areas with low woody cover are classed as ‘arid fertile savanna’ by White’s veg. classification. Secondly moist infertile savanna seems to straddle the saturation point of the MAP limited woody cover.

It shows that savannas with Mean Annual Precipitation values less than ~650 mm have their upper woody cover potential limited by precipitation, but above that threshold an increase in MAP doesn’t increase the maximum potential woody cover. It also shows that lots of sites have woody cover below their MAP limited maximum, pointing to lots of other environmental factors, like fire, herbivory, soil characteristics.

Looking into this graph more, I wanted to see whether there was any biogeographic patterns that could be drawn from the data. Were all the sites with particularly low actual woody cover from a particular woodland cover biome, for example.

To do this I compared the data from Sankaran et al. (2005), which is publicly available as supplementary information, to White’s seminal vegetation classification map of 1983, which I accessed as as shapefiles from here.

I did this analysis in R, so all the code below is for R.

You can find the code and data that I used by cloning this github repo.

First, I loaded the packages and the data:

# Set working directory to the location of the source file ----

# Packages ----

# Import data ----

## sankaran data
cover <- read.csv("data/sankaran_2005_data.csv")

## white 1983 veg data
white_veg <- readOGR(dsn="data/whitesveg",
	layer="Whites vegetation")

## Country outline
countries <- readOGR(dsn="data/africa",

First I can have a go at plotting White’s map:

# Plot White's veg map data ----
## Fortify country outline for ggplot
countries_fort <- fortify(countries, region = "COUNTRY")

## Exploring whiteveg

## Fortify white shape file for ggplot2
white_veg_fort <- fortify(white_veg, region = "DESCRIPTIO")

## Create colour palette for ggplot2
palette_veg_type_19 <- c("#FF4A46","#008941","#006FA6","#A30059","#FFDBE5",

## ggplot
ggplot() +
	geom_polygon(aes(x = long, y = lat, group = group, fill = id),
			data = white_veg_fort) +
	geom_polygon(aes(x = long, y = lat, group = group, fill = NA),
			colour = "black",
			data = countries_fort) +
	theme_classic()  +
	scale_fill_manual(values = palette_veg_type_19) +
	labs(fill = "Biome") +
	xlab("Longitude") +
	ylab("Latitude") +

Then I had to convert the data from Sankaran et al. into a SpatialPoints object so I can use it in future analyses.

# Create a data frame with only the latitude and longitude data 
cover_coords <- cover %>%
	select(lon_dec_deg, lat_dec_deg)  # Important for later on to have lon then lat as columns

# Convert to SpatialPoints object	
cover_spoints <- SpatialPoints(cover_coords,proj4string=CRS(proj4string(white_veg)))

The bit of this project that took me some time to work out was how to compute whether a data point from Sankaran et al. fell into a polygon of a certain type in White’s map. I ended up using over() from the sp package. Then I can add that data back into the original Sankaran dataset

# Add vegetation class column by referencing White map
cover_veg_class <- cover %>%
	mutate(veg_class = over(cover_spoints, white_veg)$DESCRIPTIO)

Finally, I can use cover_veg_class to create a ggplot of MAP vs woody cover, with the points coloured according to which of White’s vegetation classes the point is in:

# Plot with vegetation classification from White et al. 1983 ----
veg_class_plot <- ggplot(cover_veg_class, aes(x = map_mm,
			y = woody_cover_per,
			colour = veg_class)) +
	geom_point(size = 4) +
	guides(colour=guide_legend(title="Vegetation Class")) +
	xlab("MAP (mm)") +
	ylab("Woody Cover (%)")
MAP vs. woody cover with coloured points by vegetation type