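If you want to use ggplot2, the main thing you need to do is map your state-abbreviation column to lowercase full state names (you can use state.name for this, but make sure to apply tolower() to it to get the right format).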
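A minimal sketch of that mapping, assuming your data has a two-letter abbreviation column. The data frame `my_data`, the column name `State`, and the values below are hypothetical placeholders, not taken from your dataset. It relies on the built-in vectors state.abb and state.name, which list the 50 states in the same order:

```r
# Hypothetical input: the column names and values are placeholders only.
my_data <- data.frame(State = c("CA", "NY", "TX"),
                      Percent.Turnout = c(10, 20, 30),
                      stringsAsFactors = FALSE)

# state.abb and state.name are parallel built-in vectors, so match()
# gives the position of each abbreviation, and that position indexes
# the full name. tolower() puts it in the format used by the map data.
my_data$region <- tolower(state.name[match(my_data$State, state.abb)])
```

Abbreviations that are not in state.abb (DC, for example) will not match, so they would need to be handled separately.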
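From there, it is just a matter of joining your dataset to the states' geospatial information and plotting the data. The following code segment takes you through it step by step: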
# First, we need the ggplot2 library:
> library(ggplot2)
# We load the geospatial data for the states
# (map_data() needs the maps package to be installed; it also
# offers other maps, if you are interested in taking a look).
> states <- map_data("state")
# Here I'm creating a sample dataset like yours.
# The dataset will have 2 columns: The region (or state)
# and a number that will represent the value that you
# want to plot (here the value is just the numerical order of the states).
> sim_data <- data.frame(region=unique(states$region), Percent.Turnout=match(unique(states$region), unique(states$region)))
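# Then we merge our dataset with the geospatial data. merge() may reorder rows,
# so we re-sort by the `order` column to keep the polygon vertices in sequence:
> sim_data_geo <- merge(states, sim_data, by="region")
> sim_data_geo <- sim_data_geo[order(sim_data_geo$order), ]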
# The following should give us the plot without the numbers
# (note that qplot() is deprecated as of ggplot2 3.4.0):
> qplot(long, lat, data=sim_data_geo, geom="polygon", fill=Percent.Turnout, group=group)
Here is the output of the code segment above:
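Now, you said that you would also like to add the Percent.Turnout values to the map. For that, we need to find the center points of the various states. You could calculate these from the geospatial data we retrieved above (in the states data frame), but the results would not look very impressive. Thankfully, R already has the state centers computed for us, and we can use them as follows: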
# We'll use the state.center list to tell us where exactly
# the center of the state is.
> snames <- data.frame(region=tolower(state.name), long=state.center$x, lat=state.center$y)
# Then again, we need to join our original dataset
# to get the value that should be printed at the center.
> snames <- merge(snames, sim_data, by="region")
# And finally, to put everything together:
> ggplot(sim_data_geo, aes(long, lat)) + geom_polygon(aes(group=group, fill=Percent.Turnout)) + geom_text(data=snames, aes(long, lat, label=Percent.Turnout))
Here is the output of the last statement above:
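For comparison, the alternative mentioned earlier (computing the centers from the states data frame yourself) could be sketched roughly as below. This is only one possible approach, and an assumption on my part: it takes the midpoint of each state's bounding box, which can land in an odd spot for irregularly shaped states. The name `centers` is hypothetical.

```r
library(ggplot2)

# Same geospatial data as above (requires the maps package).
states <- map_data("state")

# Midpoint of the longitude and latitude range of each region,
# i.e. the center of the state's bounding box.
centers <- aggregate(cbind(long, lat) ~ region, data = states,
                     FUN = function(v) mean(range(v)))
```

The resulting data frame has the same region, long and lat columns as snames, so it could be merged with sim_data and passed to geom_text() in the same way.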