Welcome

This is part 1 of the 2 part course from CDRC on the Access to Healthy Assets & Hazards (AHAH) dataset and creating Multi Dimensional Indices. The video in this part introduces the AHAH data set, and the practical session shows you how to work with the AHAH dataset in R. If you are completely new to RStudio, please check out our Short Course on Using R as a GIS.

After completing this material, you will:



Part 1: Access to Healthy Assets & Hazards (AHAH)

Downloading AHAH Data

Our first step is to download the AHAH dataset:

  • Open a web browser and go to https://data.cdrc.ac.uk

  • Register if you need to, or if you are already registered, make sure you are logged in.

  • Search for AHAH.

  • Open the Access to Healthy Assets & Hazards (AHAH) page.

  • Scroll down and choose the download option for the AHAH Overall Index and Domains

  • Save the ahahv2domainsindex.csv file to your working directory.

  • Open the file in Excel - what data do we have?

  • Check out the Metadata download if you need to.

Loading AHAH data in R

  • Start a new Script in RStudio.

  • Set your working directory.

  • Use this code to read in the file:

ahah <- read.csv("ahahv2domainsindex.csv")
  • Use head() to check what the data are:
head(ahah)
     lsoa11 r_rank h_rank g_rank e_rank    r_exp     h_exp     g_exp    e_exp
1 E01000001  40807   1745  26888  41705 77.27418 0.9695099 23.244684 99.01208
2 E01000002  40019   4168  27094  41685 67.39581 2.3873021 23.554669 98.22003
3 E01000003  40443   1829  12699  41715 72.18210 1.0172268  8.216157 99.41857
4 E01000005  41289   1525   8496  41570 86.42179 0.8450043  5.160087 94.12960
5 E01000006  37079  12284  18048  40870 48.21182 7.8956811 12.804410 78.28246
6 E01000007  38514   5043   6274  40790 55.64411 2.9215662  3.694852 77.00951
      ahah r_ahah d_ahah r_dec h_dec g_dec e_dec
1 50.12511  41713     10    10     1     7    10
2 47.88945  41647     10    10     1     7    10
3 45.20851  41419     10    10     1     4    10
4 46.63912  41564     10    10     1     3    10
5 36.79859  39464     10     9     3     5    10
6 34.81751  38622     10    10     2     2    10
  • Is this the data we expect to see?

  • Use View() to look at the data.

  • Use str() to see whether they are character or numeric variables.

str(ahah)
'data.frame':   41729 obs. of  16 variables:
 $ lsoa11: chr  "E01000001" "E01000002" "E01000003" "E01000005" ...
 $ r_rank: int  40807 40019 40443 41289 37079 38514 29858 41529 39878 39875 ...
 $ h_rank: int  1745 4168 1829 1525 12284 5043 11250 6793 936 11731 ...
 $ g_rank: int  26888 27094 12699 8496 18048 6274 888 6339 4658 1443 ...
 $ e_rank: int  41705 41685 41715 41570 40870 40790 40852 40576 40843 40614 ...
 $ r_exp : num  77.3 67.4 72.2 86.4 48.2 ...
 $ h_exp : num  0.97 2.387 1.017 0.845 7.896 ...
 $ g_exp : num  23.24 23.55 8.22 5.16 12.8 ...
 $ e_exp : num  99 98.2 99.4 94.1 78.3 ...
 $ ahah  : num  50.1 47.9 45.2 46.6 36.8 ...
 $ r_ahah: int  41713 41647 41419 41564 39464 38622 34155 41216 39446 39594 ...
 $ d_ahah: int  10 10 10 10 10 10 9 10 10 10 ...
 $ r_dec : int  10 10 10 10 9 10 8 10 10 10 ...
 $ h_dec : int  1 1 1 1 3 2 3 2 1 3 ...
 $ g_dec : int  7 7 4 3 5 2 1 2 2 1 ...
 $ e_dec : int  10 10 10 10 10 10 10 10 10 10 ...
  • Is this what you expect?

There will be character chr, integer int and numeric num values in this data frame. Make sure you can identify which is which, and that you know what the differences are.

  • How are the data distributed? (try hist()).

Loading Spatial Data

To create any maps, we need some spatial data.

  • Go to http://census.edina.ac.uk/ and select Boundary Data Selector.
  • Then set Country to England, Geography to Statistical Building Block, dates to 2011 and later, and click Find.
  • Select English Lower Layer Super Output Areas, 2011 and click List Areas.
  • Select Liverpool from the list and click Extract Boundary Data.
  • After a 5 to 20 second wait, click BoundaryData.zip to download the files.

Extract the files, and move all the files starting with the name england_lsoa_2011 to your working folder.

We will also need some spatial libraries:

#load libraries
library(sf)
library(tmap)

Read in the spatial data we downloaded from Edina:

#read in shapefile
LSOA <- st_read("england_lsoa_2011.shp")
  • Has the data been read in correctly? Try head(), class() and str().

Let’s do a quick map:

qtm(LSOA)

  • Is the data correct? Does it look the data we were expecting to see?

Joining Attribute and Spatial Data

Next step is to join the attribute data (ahah) to the spatial data (LSOA).

Use head() to check which columns we are using for the join:

#check which columns we are joining
head(ahah)
head(LSOA)
#join attribute data to LSOA
LSOA <- merge(LSOA, ahah, by.x="code", by.y="lsoa11")

#check output
head(LSOA)
Simple feature collection with 6 features and 18 fields
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: 334715 ymin: 385417 xmax: 339020.8 ymax: 390548
projected CRS:  OSGB 1936 / British National Grid
       code                       label           name r_rank h_rank g_rank
1 E01006512 E08000012E02001377E01006512 Liverpool 031A  31239   3055  31351
2 E01006513 E08000012E02006932E01006513 Liverpool 060A  41456   2213  37636
3 E01006514 E08000012E02001383E01006514 Liverpool 037A  37258   4510  39228
4 E01006515 E08000012E02001383E01006515 Liverpool 037B  34254   6931  29900
5 E01006518 E08000012E02001390E01006518 Liverpool 044A  25433   7533  37244
6 E01006519 E08000012E02001402E01006519 Liverpool 056A  24866  23594  14398
  e_rank    r_exp     h_exp     g_exp    e_exp     ahah r_ahah d_ahah r_dec
1  35965 30.88848  1.725173 31.123073 43.74480 26.87038  32211      8     8
2  35878 90.68599  1.236631 50.819353 43.42989 46.54297  41551     10    10
3  35143 49.01800  2.594648 60.485044 40.92866 38.25659  39958     10     9
4  35065 38.22712  4.118458 28.254809 40.67851 27.81972  33425      9     9
5  34392 21.16668  4.513573 48.953916 38.62623 28.31510  34019      9     7
6  28722 20.40515 18.783273  9.576826 26.16356 18.73220  16649      4     6
  h_dec g_dec e_dec                       geometry
1     1     8     9 POLYGON ((336203 390010, 33...
2     1    10     9 POLYGON ((335402.8 390317.5...
3     2    10     9 POLYGON ((335651.3 389926.8...
4     2     8     9 POLYGON ((335186 389604, 33...
5     2     9     9 POLYGON ((335537.2 389034.5...
6     6     4     7 POLYGON ((338014.6 386447.2...

Finally, we can plot the maps quickly using qtm() from the tmap library:

#ahah index
qtm(LSOA, "ahah")

#ahah deciles
qtm(LSOA, "d_ahah")

This works well, and we can choose which variable to show. However we don’t get many options with this. We can use a different function tm_shape(), which will give us more options.

tm_shape(LSOA) +
tm_polygons("ahah")
tm_shape(LSOA) +
tm_polygons("ahah", title = "AHAH Index", palette = "Greens", style = "jenks") +
tm_layout(legend.title.size = 0.8)

This allows us to change the title, colours and legend title size. Try substituting in Blues and adjusting the title.

  • Why are the different colours important?
  • Which classification method should we use?

*If you are not sure about classification, look at Part 2: Mapping Spatial Data of the Short Course on Using R as a GIS.

We have the data for AHAH, but we also have the domain data as well. Can you plot a map for one of the domains?

Exercise

By now you should have a R script with all the steps to:

  • read in the AHAH data
  • read in the spatial data (for whatever area you want to show)
  • join the two data sets
  • plot a map

Try creating a map for a different area of the UK, using the script you have.

Additonal Exercises

Routing in Routino

Routino is one of the key pieces of software used to create the data used in the AHAH index. You can see how the routing is done using the example tool on the website.

  • Go to http://www.routino.org/
  • Click Calculate Route (UK) on the top left
  • Click on the Leaflet Version map
  • Under 1 under Waypoints, type in a location or postcode and press enter
  • A list may appear beneath - select the location from the list
  • Repeat the process for 2 (destination)
  • Choose an appropriate transport type (Motorcar is default)
  • Click on Shortest Route
  • It will show Router Running. It may take 10 seconds or so.
  • It will then show the route (you may need to adjust the map).

Example of routing in Routino

Working with loops is key for the creation of this dataset.

Loops

Loops are a great computer programming concept we can use. Working from the example above, and the loop code from the Short Course on Using R as a GIS, can you write a loop to create a map for each domain of the AHAH index?

Mini R Course loop code:

#set which variables will be mapped
  mapvariables <- c("AllUsualResidents", "Age00to04", "Age05to07")

#loop through for each map
  for (i in 1:length(mapvariables)) {
  #setup map
    m <- tm_shape(LSOA) +
      #set variable, colours and classes
      tm_polygons(mapvariables[i], palette = "Greens", style = "equal") +
      #set scale bar
      tm_scale_bar(width = 0.22, position = c(0.05, 0.18)) +
      #set compass
      tm_compass(position = c(0.3, 0.07)) + 
      #set layout
      tm_layout(frame = F, title = "Liverpool", title.size = 2, 
                title.position = c(0.7, "top"))
    #save map
    tmap_save(m, filename = paste0("map-",mapvariables[i],".png"))
  #end loop
  }

This practical was written using R 3.5.1 (2018-07-02) and RStudio 1.1.463, and tested on R 4.0.2 (2020-06-22) and RStudio Cloud (RStudio Server Pro Version 1.3.1056-1) by Dr. Nick Bearman ().

This work is licensed under the Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/. The latest version of the workbook is available from https://data.cdrc.ac.uk/dataset/advanced-gis-methods-training-ahah-and-multi-dimensional-indices and https://github.com/nickbearman/cdrc-ahah-mdi-course. This version was created on 03 November 2020.