Map any region in the world with R – Part III: Programming with ggplot2


Posted on April 18, 2023 by R with White Dwarf in R bloggers

[This article was first published on R with White Dwarf, and kindly contributed to R-bloggers].

You can find all the posts in this series under the tag maps-app (including the Spanish versions).

You can also find the current state of the project in my GitHub repo mapic.

Scope of this post

We are creating maps that show how data change over a span of time for different countries, pointing at all kinds of cities. That basically means that we need to be able to map any region of the world with R. Today there are all kinds of packages and techniques to do that. I will share the strategy I used with the ggplot2 and maps packages, with the support of OpenStreetMap to obtain the coordinates of cities, and finally making it interactive with shiny.

This series of posts shares my path towards the creation of the Shiny app. It is a live project, and I decided to share my experiences along the creation process. The posts are not only about the Shiny app but also about the package I created behind it, including topics such as crafting functions, creating the maps, classes of objects, etc., as well as any interesting issues that appear along the way. It is my way of contributing to the R community while keeping the project documented for myself.

This post is about Creating functions for ggplot.

I hope you all enjoy it. Feel free to leave any kind of comment and/or question at the end.

Background and preliminaries

In the first post we wrote a function to create the basic map. Since then I have modified the function slightly, but the concept is the same. You can see below the most up-to-date version and compare it with the previous version if you wish.

my_country_prev [...]

[...] 300'),
        guide = guide_legend(label.position = 'bottom', label.vjust = 0, nrow = 1)),
    geom_point(data = filter(filt, n == 1), aes(x, y),
               color = map_colors$dots_orgs, shape = 19, size = 2.5),
    theme(legend.position = 'bottom')
  )
  return(map_points)
}

As you can see, the function also requires our object map_colors, which we created before. Another way of passing values from a list is to define these values directly within the function arguments, as we did here for column_names. We could pass the arguments directly when calling the function, or define them beforehand and pass the resulting object. Let’s use the second approach.

col_names = list(lat = "lat", lon = "lon", cities = "City", start_year = "Registration_year")

If you look at the data frame that we created containing the data, these are simply the names of the columns as we specified them.
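To make the idea of a list of column names concrete, here is a minimal sketch. The helper get_coords and the toy data are hypothetical and only illustrate the pattern; they are not part of the package. The default value of column_names mirrors the names used in this post, and the caller can override it with a list of their own.

```r
# Hypothetical helper, for illustration only.
# The default list mirrors the column names used in this post.
get_coords <- function(dat,
                       column_names = list(lat = "lat", lon = "lon",
                                           cities = "City",
                                           start_year = "Registration_year")) {
  # Columns are looked up through the list, never through hard-coded names
  data.frame(
    city = dat[[column_names$cities]],
    x    = dat[[column_names$lon]],
    y    = dat[[column_names$lat]],
    year = dat[[column_names$start_year]]
  )
}

# Toy data whose columns are named differently
toy <- data.frame(Ciudad    = c("Puebla", "Oaxaca"),
                  latitude  = c(19.04, 17.07),
                  longitude = c(-98.20, -96.72),
                  anio      = c(2019, 2021))

# Define the names beforehand, then pass the object to the function
my_names <- list(lat = "latitude", lon = "longitude",
                 cities = "Ciudad", start_year = "anio")
res <- get_coords(toy, column_names = my_names)
```

With this pattern the body of the function never changes when the input data use other column names; only the list does.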

Now, about the function itself: it starts, as expected, by loading the libraries and then does a bit of error handling to ensure that the strictly required fields are actually present in the data frame. There I also handle the optional end_year, which is used when a franchise closed and we want to map it only for the period during which it was present.
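A check of this kind could look like the sketch below. The function check_columns, the set of required fields and the error messages are assumptions made for illustration, not the code of the package. It stops with an informative message when a required field is missing and reports whether an optional end_year column is available.

```r
# Hypothetical validator, for illustration only
check_columns <- function(dat, column_names) {
  required <- c("lat", "lon", "cities", "start_year")

  # 1. Every required field must be named in the list
  missing_fields <- setdiff(required, names(column_names))
  if (length(missing_fields) > 0) {
    stop("column_names lacks: ", paste(missing_fields, collapse = ", "))
  }

  # 2. Every column the list points to must exist in the data frame
  wanted <- unlist(column_names[required])
  missing_cols <- setdiff(wanted, names(dat))
  if (length(missing_cols) > 0) {
    stop("Columns not found in data: ", paste(missing_cols, collapse = ", "))
  }

  # 3. end_year is optional: TRUE only if it is named and present
  end_col <- column_names[["end_year"]]
  !is.null(end_col) && end_col %in% names(dat)
}

toy <- data.frame(lat = 19.04, lon = -98.20, City = "Puebla",
                  Registration_year = 2019)
col_names <- list(lat = "lat", lon = "lon", cities = "City",
                  start_year = "Registration_year")

has_end <- check_columns(toy, col_names)

# A data frame without the city and year columns triggers the error
msg <- tryCatch(check_columns(toy[, c("lat", "lon")], col_names),
                error = function(e) conditionMessage(e))
```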

Then we define the “Dots base size”. We experimented with many sizes, both for the dots and for the final map, and these are the ones that look best. Still, I’m allowing this value to be changed through the parameter dot_size in the function definition, although I wouldn’t recommend changing it. You can also play with the internal values and see for yourself. Since the idea here is to create functions for the “standards” of the maps, allowing minimal changes, we are not too strict about how big the dots should be, yet we keep a certain degree of control.

Then we do a little bit of data manipulation before being able to use the data. This includes standardizing the names of the cities (to some degree), filtering out the data that do not match the selected year, using only the median value of the latitude and longitude for each city, and defining the sizes of the dots according to the number of franchises. The last one is tricky, and I haven’t decided yet how much freedom should be left there. Maybe there should be a separate function to define all that. Our maps were created to handle data containing from a few hundred rows to a couple of thousand, hence the values presented here. But if you want to show just a few organizations (as is the case in this example), the map looks quite deserted; on the other hand, if you need to map values of thousands per city, the maps look overloaded. For the present post I’m keeping it as is, with a note for future consideration. We also added one extra geom_point to override the alpha value for cities with only 1 franchise and make that dot solid. This also works well visually.
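The manipulation steps described above can be sketched with dplyr as follows. This is a simplified illustration with toy data and a hypothetical function summarise_cities, not the body of make_dots: the standardization of the names is deliberately naive, and the size classes for the dots are left out.

```r
library(dplyr)

# Toy data: the same city is written in three different ways
toy <- data.frame(
  City = c("puebla", "Puebla ", "PUEBLA", "Oaxaca", "Oaxaca"),
  lat  = c(19.04, 19.05, 19.03, 17.07, 17.06),
  lon  = c(-98.20, -98.21, -98.19, -96.72, -96.73),
  Registration_year = c(2018, 2020, 2023, 2019, 2021)
)

# Naive standardization: trim, lower case, capitalize the first letter
clean_city <- function(s) {
  s <- tolower(trimws(s))
  paste0(toupper(substr(s, 1, 1)), substr(s, 2, nchar(s)))
}

# Hypothetical helper, for illustration only
summarise_cities <- function(dat, year) {
  dat %>%
    filter(Registration_year <= year) %>%   # keep what existed by that year
    mutate(City = clean_city(City)) %>%     # merge spelling variants
    group_by(City) %>%
    summarise(x = median(lon),              # one coordinate per city
              y = median(lat),
              n = n(),                      # number of rows per city
              .groups = "drop")
}

filt <- summarise_cities(toy, year = 2022)
```

The result has one row per city, with the columns x, y and n that the dots layer needs.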

In any case, the function above shows how we can manipulate the data inside a function and return only what we need, so that it can be added to an existing ggplot. We can now add the dots as we would normally do in ggplot style.
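The mechanism that makes this possible is that ggplot2 accepts a list of components on the right-hand side of +. Below is a minimal sketch with a hypothetical function add_dots and made-up colors and sizes; it is not the make_dots function of this post.

```r
library(ggplot2)

# Hypothetical function, for illustration only: it returns a list of
# ggplot2 components that can be added to any existing plot with `+`
add_dots <- function(dat, dot_color = "darkred", dot_size = 2.5) {
  list(
    # Semi-transparent dots, sized by the number of rows per city
    geom_point(data = dat, aes(x, y, size = n),
               color = dot_color, alpha = 0.5),
    # Solid dots on top for the cities with a single entry
    geom_point(data = dat[dat$n == 1, ], aes(x, y),
               color = dot_color, shape = 19, size = dot_size),
    theme(legend.position = "bottom")
  )
}

cities <- data.frame(x = c(-98.2, -96.7, -99.1),
                     y = c(19.0, 17.1, 19.4),
                     n = c(1, 4, 12))

# The list is added exactly like a single layer
p <- ggplot() + coord_fixed() + add_dots(cities)
```

Returning a list keeps the data preparation and the styling inside the function, while the caller still composes the plot with the usual + syntax.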

my_country_prev("Mexico", map_colors,
                x_limits = c(-118, -86), y_limits = c(14, 34),
                show_coords = TRUE) +
  make_dots(datmx, year = 2022, map_colors, column_names = col_names) +
  scale_x_continuous(n.breaks = 20) +
  ggtitle("A map of Mexico")

Adding labels for the map

Moving forward, we want to add some labels to the maps to know what we are seeing. Here I created one function to show which year is being mapped, and a second one to show the totals. Although we can achieve that easily in different ways, I managed to make it complicated, keeping in mind that we want to map any region in the world.

my_print_years
