This is part 1 of the 2 part course from CDRC on the Access to Healthy Assets & Hazards (AHAH) dataset and creating Multi Dimensional Indices. The video in this part introduces the AHAH data set, and the practical session shows you how to work with the AHAH dataset in R. If you are completely new to RStudio, please check out our Short Course on Using R as a GIS.
After completing this material, you will:
Our first step is to download the AHAH dataset:
Open a web browser and go to https://data.cdrc.ac.uk
Register if you need to, or if you are already registered, make sure you are logged in.
Search for AHAH.
Open the Access to Healthy Assets & Hazards (AHAH) page.
Save the ahahv2domainsindex.csv
file to your working directory.
Open the file in Excel - what data do we have?
Check out the Metadata download if you need to.
Start a new Script in RStudio.
Set your working directory.
Use this code to read in the file:
ahah <- read.csv("ahahv2domainsindex.csv")
head()
to check what the data are:head(ahah)
lsoa11 r_rank h_rank g_rank e_rank r_exp h_exp g_exp e_exp
1 E01000001 40807 1745 26888 41705 77.27418 0.9695099 23.244684 99.01208
2 E01000002 40019 4168 27094 41685 67.39581 2.3873021 23.554669 98.22003
3 E01000003 40443 1829 12699 41715 72.18210 1.0172268 8.216157 99.41857
4 E01000005 41289 1525 8496 41570 86.42179 0.8450043 5.160087 94.12960
5 E01000006 37079 12284 18048 40870 48.21182 7.8956811 12.804410 78.28246
6 E01000007 38514 5043 6274 40790 55.64411 2.9215662 3.694852 77.00951
ahah r_ahah d_ahah r_dec h_dec g_dec e_dec
1 50.12511 41713 10 10 1 7 10
2 47.88945 41647 10 10 1 7 10
3 45.20851 41419 10 10 1 4 10
4 46.63912 41564 10 10 1 3 10
5 36.79859 39464 10 9 3 5 10
6 34.81751 38622 10 10 2 2 10
Is this the data we expect to see?
Use View()
to look at the data.
Use str()
to see whether they are character or numeric variables.
str(ahah)
'data.frame': 41729 obs. of 16 variables:
$ lsoa11: chr "E01000001" "E01000002" "E01000003" "E01000005" ...
$ r_rank: int 40807 40019 40443 41289 37079 38514 29858 41529 39878 39875 ...
$ h_rank: int 1745 4168 1829 1525 12284 5043 11250 6793 936 11731 ...
$ g_rank: int 26888 27094 12699 8496 18048 6274 888 6339 4658 1443 ...
$ e_rank: int 41705 41685 41715 41570 40870 40790 40852 40576 40843 40614 ...
$ r_exp : num 77.3 67.4 72.2 86.4 48.2 ...
$ h_exp : num 0.97 2.387 1.017 0.845 7.896 ...
$ g_exp : num 23.24 23.55 8.22 5.16 12.8 ...
$ e_exp : num 99 98.2 99.4 94.1 78.3 ...
$ ahah : num 50.1 47.9 45.2 46.6 36.8 ...
$ r_ahah: int 41713 41647 41419 41564 39464 38622 34155 41216 39446 39594 ...
$ d_ahah: int 10 10 10 10 10 10 9 10 10 10 ...
$ r_dec : int 10 10 10 10 9 10 8 10 10 10 ...
$ h_dec : int 1 1 1 1 3 2 3 2 1 3 ...
$ g_dec : int 7 7 4 3 5 2 1 2 2 1 ...
$ e_dec : int 10 10 10 10 10 10 10 10 10 10 ...
There will be character chr
, integer int
and numeric num
values in this data frame. Make sure you can identify which is which, and that you know what the differences are.
hist()
).To create any maps, we need some spatial data.
BoundaryData.zip
to download the files.Extract the files, and move all the files starting with the name england_lsoa_2011
to your working folder.
We will also need some spatial libraries:
#load libraries
library(sf)
library(tmap)
Read in the spatial data we downloaded from Edina:
#read in shapefile
LSOA <- st_read("england_lsoa_2011.shp")
head()
, class()
and str()
.Let’s do a quick map:
qtm(LSOA)
Next step is to join the attribute data (ahah
) to the spatial data (LSOA
).
Use head()
to check which columns we are using for the join:
#check which columns we are joining
head(ahah)
head(LSOA)
#join attribute data to LSOA
LSOA <- merge(LSOA, ahah, by.x="code", by.y="lsoa11")
#check output
head(LSOA)
Simple feature collection with 6 features and 18 fields
geometry type: POLYGON
dimension: XY
bbox: xmin: 334715 ymin: 385417 xmax: 339020.8 ymax: 390548
projected CRS: OSGB 1936 / British National Grid
code label name r_rank h_rank g_rank
1 E01006512 E08000012E02001377E01006512 Liverpool 031A 31239 3055 31351
2 E01006513 E08000012E02006932E01006513 Liverpool 060A 41456 2213 37636
3 E01006514 E08000012E02001383E01006514 Liverpool 037A 37258 4510 39228
4 E01006515 E08000012E02001383E01006515 Liverpool 037B 34254 6931 29900
5 E01006518 E08000012E02001390E01006518 Liverpool 044A 25433 7533 37244
6 E01006519 E08000012E02001402E01006519 Liverpool 056A 24866 23594 14398
e_rank r_exp h_exp g_exp e_exp ahah r_ahah d_ahah r_dec
1 35965 30.88848 1.725173 31.123073 43.74480 26.87038 32211 8 8
2 35878 90.68599 1.236631 50.819353 43.42989 46.54297 41551 10 10
3 35143 49.01800 2.594648 60.485044 40.92866 38.25659 39958 10 9
4 35065 38.22712 4.118458 28.254809 40.67851 27.81972 33425 9 9
5 34392 21.16668 4.513573 48.953916 38.62623 28.31510 34019 9 7
6 28722 20.40515 18.783273 9.576826 26.16356 18.73220 16649 4 6
h_dec g_dec e_dec geometry
1 1 8 9 POLYGON ((336203 390010, 33...
2 1 10 9 POLYGON ((335402.8 390317.5...
3 2 10 9 POLYGON ((335651.3 389926.8...
4 2 8 9 POLYGON ((335186 389604, 33...
5 2 9 9 POLYGON ((335537.2 389034.5...
6 6 4 7 POLYGON ((338014.6 386447.2...
Finally, we can plot the maps quickly using qtm()
from the tmap
library:
#ahah index
qtm(LSOA, "ahah")
#ahah deciles
qtm(LSOA, "d_ahah")
This works well, and we can choose which variable to show. However we don’t get many options with this. We can use a different function tm_shape()
, which will give us more options.
tm_shape(LSOA) +
tm_polygons("ahah")
tm_shape(LSOA) +
tm_polygons("ahah", title = "AHAH Index", palette = "Greens", style = "jenks") +
tm_layout(legend.title.size = 0.8)
This allows us to change the title, colours and legend title size. Try substituting in Blues
and adjusting the title.
*If you are not sure about classification, look at Part 2: Mapping Spatial Data of the Short Course on Using R as a GIS.
We have the data for AHAH, but we also have the domain data as well. Can you plot a map for one of the domains?
By now you should have a R script with all the steps to:
Try creating a map for a different area of the UK, using the script you have.
Routino is one of the key pieces of software used to create the data used in the AHAH index. You can see how the routing is done using the example tool on the website.
Example of routing in Routino
Working with loops is key for the creation of this dataset.
Loops are a great computer programming concept we can use. Working from the example above, and the loop code from the Short Course on Using R as a GIS, can you write a loop to create a map for each domain of the AHAH index?
Mini R Course loop code:
#set which variables will be mapped
mapvariables <- c("AllUsualResidents", "Age00to04", "Age05to07")
#loop through for each map
for (i in 1:length(mapvariables)) {
#setup map
m <- tm_shape(LSOA) +
#set variable, colours and classes
tm_polygons(mapvariables[i], palette = "Greens", style = "equal") +
#set scale bar
tm_scale_bar(width = 0.22, position = c(0.05, 0.18)) +
#set compass
tm_compass(position = c(0.3, 0.07)) +
#set layout
tm_layout(frame = F, title = "Liverpool", title.size = 2,
title.position = c(0.7, "top"))
#save map
tmap_save(m, filename = paste0("map-",mapvariables[i],".png"))
#end loop
}
This practical was written using R 3.5.1 (2018-07-02) and RStudio 1.1.463, and tested on R 4.0.2 (2020-06-22) and RStudio Cloud (RStudio Server Pro Version 1.3.1056-1) by Dr. Nick Bearman (nick@geospatialtrainingsolutions.co.uk).
This work is licensed under the Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/. The latest version of the workbook is available from https://data.cdrc.ac.uk/dataset/advanced-gis-methods-training-ahah-and-multi-dimensional-indices and https://github.com/nickbearman/cdrc-ahah-mdi-course. This version was created on 03 November 2020.