In this recipe, we will learn how to plot country-wise data on a world map.
We will use a few additional packages for this recipe. We need the maps package for the actual drawing of the maps, the WDI package to get the World Bank data by country, and the RColorBrewer package for color schemes. So, let's make sure these packages are installed and loaded:
install.packages("maps")
library(maps)
install.packages("WDI")
library(WDI)
install.packages("RColorBrewer")
library(RColorBrewer)
We can pull in many different datasets using the World Bank API provided by the WDI package. In this example, let's plot some CO2 emissions data:
colours = brewer.pal(7,"PuRd")
wgdp<-WDIsearch("gdp")
w<-WDI(country="all", indicator=wgdp[4,1], start=2005, end=2005)
# Rename the United States row so that it matches the name used in the map
w[63,1] <- "USA"

x<-map(plot=FALSE)
x$measure<-array(NA,dim=length(x$names))
# Copy each country's value to every map region whose name contains the country name
for(i in 1:length(w$country)) {
  for(j in 1:length(x$names)) {
    if(grepl(w$country[i],x$names[j]))
      x$measure[j]<-w[i,3]
  }
}

# Color intervals from the minimum to (just above) the maximum value
sd <- data.frame(col=colours,
                 values=seq(min(x$measure[!is.na(x$measure)]),
                            max(x$measure[!is.na(x$measure)])*1.0001,
                            length.out=7))

# One fill color per region; white where there is no data
sc<-array("#FFFFFF",dim=length(x$names))
for (i in 1:length(x$measure))
  if(!is.na(x$measure[i]))
    sc[i]=as.character(sd$col[findInterval(x$measure[i], sd$values)])

#2-column layout with color scale to the right of the map
layout(matrix(data=c(2,1), nrow=1, ncol=2),
       widths=c(8,1), heights=c(8,1))

# Color Scale first
breaks<-sd$values
par(mar = c(20,1,20,7),oma=c(0.2,0.2,0.2,0.2),mex=0.5)
image(x=1, y=0:length(breaks),z=t(matrix(breaks))*1.001,
      col=colours[1:(length(breaks)-1)],axes=FALSE,
      breaks=breaks,xlab="",ylab="",xaxt="n")
axis(side=4,at=0:(length(breaks)-1),
     labels=round(breaks),col="white",las=1)
abline(h=c(1:length(breaks)),col="white",lwd=2,xpd=FALSE)

#Map
# If you get a figure margins error while running this code, enlarge the
# plot device or adjust the margins so that the graph and scale fit within
# the device.
map(col=sc,fill=TRUE,lty="blank")
map(add=TRUE,col="gray",fill=FALSE)
title("CO2 emissions (kg per 2000 US$ of GDP)")
We used the maps package in combination with the World Bank data from the WDI package to plot CO2 emissions data (kg per 2000 US$ of GDP) for various countries across the world.
First, we chose an RColorBrewer color scheme and saved it as a vector called colours. We then pulled a list of GDP-related variables using the WDIsearch() function. If you type in wgdp at the R prompt and hit the Enter key, you will see a list of codes and descriptions of each of these variables. For the preceding example, we chose the fourth variable (wgdp[4,1]), which gives CO2 emissions (kg per 2000 US$ of GDP), and passed it to the WDI() function to get data for all countries for the year 2005 by setting the country argument to "all" and both start and end to 2005.
Next, we created a map object, x, simply by calling the map() function and setting plot to FALSE so that the map is not drawn yet. We did this so that we can map the data we pulled from WDI to the country polygons contained in the map.
First, we added a new array called measure to x, with NA as the default value and a length matching the number of region names in x. If you type in x$names at the R prompt and hit Enter, you will see the whole list of names. Similarly, w$country contains the names of the countries for which the WDI package has data. Note that the map object has a lot more names because it contains region information in finer detail than just countries. So, we must first match the names of countries in the two datasets.
For the example, we used a simple search function, grepl(), which looks for each WDI country name within the region names of the map object x and assigns the corresponding CO2 emissions value from w to x$measure. This is a very approximate solution and misses countries whose names differ between the two datasets. For example, the United States is named USA in the map object but not in the WDI data. To match all the countries exactly, we need to manually check the important ones we are interested in. In the example, the United States was corrected manually by renaming its row in w to "USA".
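To see why the substring search is only approximate, the following minimal sketch runs the same kind of grepl() test on small made-up vectors. The values in data_names and region_names are illustrative stand-ins, not the actual contents of w$country or x$names, and match_names() is a hypothetical helper written for this sketch.

```r
# Toy stand-ins for w$country and x$names (assumed values, for illustration only)
data_names   <- c("Niger", "United States", "France")
region_names <- c("Niger", "Nigeria", "USA:Alaska", "France:Corsica")

# Hypothetical helper: for each region, list the data names found inside its name.
# fixed = TRUE treats each name as literal text rather than a regular expression.
match_names <- function(data_names, region_names) {
  hits <- lapply(region_names, function(r) {
    found <- vapply(data_names, grepl, logical(1), x = r, fixed = TRUE)
    data_names[found]
  })
  names(hits) <- region_names
  hits
}

before <- match_names(data_names, region_names)
# "Nigeria" wrongly picks up "Niger", and "USA:Alaska" matches nothing

# Manual correction, in the spirit of the recipe's fix for the United States
data_names[data_names == "United States"] <- "USA"
after <- match_names(data_names, region_names)
# "USA:Alaska" now matches "USA"; the "Niger"/"Nigeria" false match remains
```

The sketch shows both failure modes: a name that differs between the datasets is missed until it is corrected by hand, and a name that is a substring of another country's name produces a false match.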
Next, we created a data frame called sd to define a color scheme with intervals based on a sequence from the minimum to the maximum values in x$measure. We used sd to assign a color to each of the values in x$measure by creating a vector called sc. First, we created sc with default values of white so that any missing values are depicted without any color. Then, we used the findInterval() function to assign a color to each value of x$measure.
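The binning step can be tried on its own with a handful of numbers. The sketch below is a simplified illustration under assumptions: the measure values are invented, four hard-coded hex colors stand in for a brewer.pal() palette, and it uses one more break than colors, which differs slightly from the recipe's seven breaks for seven colors.

```r
# Hypothetical measurements; NA marks a region with no data
measure <- c(0.2, 1.5, NA, 3.0, 0.9)

# Four hex colors standing in for a brewer.pal() palette (assumed values)
palette_cols <- c("#F1EEF6", "#D7B5D8", "#DF65B0", "#CE1256")

# One more break than colors; the top break is nudged slightly above the
# maximum so that the largest value still falls inside the last interval
breaks <- seq(min(measure, na.rm = TRUE),
              max(measure, na.rm = TRUE) * 1.0001,
              length.out = length(palette_cols) + 1)

# Interval index for each value (NA stays NA)
bin <- findInterval(measure, breaks)

# Start with white everywhere, then color only the regions that have data
fill <- rep("#FFFFFF", length(measure))
has_data <- !is.na(measure)
fill[has_data] <- palette_cols[bin[has_data]]
```

Here the region with no data keeps the white default, and the largest value receives the darkest color because of the small nudge applied to the top break.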
Finally, we have all the ingredients to make the map. We first used the layout() function to create a 1 x 2 layout, just as we did for heat maps in the previous chapter.
We need to plot the color scale first here because, if we plot the map first, the scale cannot be plotted on the same layout and results in a new plot with just the scale. We reversed the plotting order by setting the data argument in layout() to c(2,1) instead of c(1,2), so that the first plot drawn goes into the right-hand panel.
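The effect of the c(2,1) ordering can be checked with two placeholder plots. This is a minimal sketch, not the recipe's figure: it draws on a null PDF device, uses widths of c(3,1) rather than c(8,1) to keep the narrow panel comfortably larger than its margins, and records par("fig") after each plot to see which side of the device was used.

```r
# Draw on a null PDF device so that no file is written
pdf(file = NULL)

# Panel 1 is the narrow right-hand column, panel 2 the wide left-hand column
panel_order <- matrix(c(2, 1), nrow = 1, ncol = 2)
n_panels <- layout(panel_order, widths = c(3, 1))

# First plot call fills panel 1 (right): a stand-in for the color scale
par(mar = c(2, 1, 2, 2))
plot(1:3, axes = FALSE, xlab = "", ylab = "")
first_fig <- par("fig")   # figure region as c(x1, x2, y1, y2), device fractions

# Second plot call fills panel 2 (left): a stand-in for the map
par(mar = c(4, 4, 2, 1))
plot(1:10, main = "Main panel")
second_fig <- par("fig")

dev.off()
```

The first plot lands in the right-hand quarter of the device and the second in the left-hand three quarters, which is the order the recipe relies on to draw the scale before the map.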
The color scale is drawn in exactly the same way as in the previous chapter for heat maps, using the image() function. To draw the map itself, we used the map() function. We set the col argument to the sc vector that contains colors corresponding to each polygon on the map. We set fill to TRUE and lty to "blank" so that we get the polygons filled with the specified colors and no borders around them. Instead, we added gray borders by calling the map() function again with add set to TRUE, col set to gray, and fill set to FALSE. Finally, we added a plot title using the title() function.
The example shows just one variable for one year visualized on a map. The WDI package gives 73 different metrics related to GDP alone (as can be seen in the wgdp variable). See the help section for the WDI package for more details about other data available (?WDI and ?WDIsearch). If you have any other data by country from another source, you can use that with the map() function in the example as long as the country names can be matched to the names of regions in the map object.
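As one possible way to attach data from another source, the sketch below matches on the country part of each region name instead of searching for substrings. Everything in it is assumed for illustration: my_data is a made-up data frame, and region_names is a small stand-in for x$names that follows the "Country:subregion" pattern used for sub-regions in the world map.

```r
# Made-up figures from a hypothetical source other than the World Bank
my_data <- data.frame(country = c("USA", "France", "Japan"),
                      value   = c(10, 20, 30),
                      stringsAsFactors = FALSE)

# Stand-in for x$names (assumed values)
region_names <- c("USA:Alaska", "USA:Hawaii", "France", "France:Corsica", "Brazil")

# Keep only the country part of each region name
region_country <- sub(":.*$", "", region_names)

# Exact lookup: regions whose country is absent from my_data get NA
measure <- my_data$value[match(region_country, my_data$country)]
```

The resulting measure vector has one entry per region and can play the role of x$measure in the recipe; regions left as NA would keep the white default color.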