Part 2: Retail Catchment Areas

Welcome Back!

This is part 2 of the 2 part course from CDRC on using the UK Retail Centres dataset to create retail catchments. This practical session shows you how to estimate retail catchments using a gravity or spatial interaction model (SIM), more specifically a probabilistic SIM called the Huff model. Make sure you have completed Part 1 of this Training Course before beginning this section.

After completing this material, you will:

- Understand how the Huff model uses distance, attractiveness and competition to estimate retail catchments
- Be able to run the Huff model in R using the huff-tools.R functions and extract huff probabilities for each LSOA
- Be able to map the results and extract Huff catchments for a set of Retail Centres

The following video will provide an introduction to Retail Catchment Areas.



Setup

First we want to set up the libraries that we are going to be using for this practical - you should have these installed from the previous practical:

library(sf)
library(dplyr)
library(tmap)
tmap_mode("view") ## Interactive Mapping

We also need some additional packages for computation of the Huff Model, so go ahead and install the following packages if you don’t have them already:

#install.packages(c("rgdal", "rgeos", "igraph", "FNN"))
library(rgdal)
library(rgeos)
library(igraph)
library(FNN)

Make sure you have set your working directory to wherever you have stored the CDRC-Retail-Catchment-Training folder we have provided:

#setwd("CDRC-Retail-Catchment-Training")

Finally, we need to load some functions required for the Huff model. The functions have been packaged in a .R file called huff-tools.R, which you can find in the ‘Scripts’ folder provided for this course. The file contains 9 functions, two of which we will be using in this practical. To read the file and save the functions to your environment, run the following line of code:

## Save huff-tools.R functions to memory
source("Scripts/huff-tools.R")

You will notice you now have 9 new functions in your environment - the primary ones we will be using are huff_basic() and select_by_probs(). For more information on these functions visit the huff-tools.R vignette.

1. Data (& Preprocessing)

Again, we are going to be working with the Liverpool City Region Retail Centres dataset. Specifically we are going to work with the set of Retail Centres we created the hierarchy for in the last practical. So go ahead and read those back into your environment:

## Read in the LCR Retail Centres (with Hierarchy from Part 1)
rc <- st_read("Data/Part2_Retail_Centres.gpkg")
## Reading layer `Part2_Retail_Centres' from data source `C:\Users\nick\Dropbox\Work\2021-27-cdrc-les-dolega-retail-catchments-training\CDRC-Retail-Catchment-Training\Data\Part2_Retail_Centres.gpkg' using driver `GPKG'
## Simple feature collection with 20 features and 6 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 321296 ymin: 381776.6 xmax: 351923.6 ymax: 417406.3
## projected CRS:  OSGB 1936 / British National Grid

If you don’t have this file, or you forgot to write it out at the end of Part 1, go back and complete Section 2 of Part 1 to obtain the hierarchy, and then skip to the end of that practical to write the file out (st_write).
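For reference, the write-out step at the end of Part 1 would look something like the line below (assuming the object holding your Retail Centres with the hierarchy column is called rc; adjust the object name to whatever you used):

## Write out the Part 1 Retail Centres (with hierarchy) so they can be read back in here
# st_write(rc, "Data/Part2_Retail_Centres.gpkg")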

We will also need an empty set of LSOAs (Lower Super Output Areas) for the Liverpool City Region. For those unfamiliar with LSOAs, they are a set of geographical areas developed following the 2011 Census, each containing around 1,500 people on average. The LSOAs you need for this practical can be found in the ‘Data’ folder, so go ahead and read these in:

## Read in the Liverpool City Region LSOAs
lsoa <- st_read("Data/LCR_LSOA.gpkg")
## Reading layer `LCR_lsoa' from data source `C:\Users\nick\Dropbox\Work\2021-27-cdrc-les-dolega-retail-catchments-training\CDRC-Retail-Catchment-Training\Data\LCR_LSOA.gpkg' using driver `GPKG'
## Simple feature collection with 989 features and 1 field
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 318351.7 ymin: 378277.6 xmax: 361798.8 ymax: 422878.7
## projected CRS:  Transverse_Mercator

A quick plot will show you roughly what the LSOAs look like:

## Plot the LSOAs for Liverpool City Region 
tm_shape(lsoa) +
  tm_fill(alpha = 0.75) +
  tm_borders(col = "black", lwd = 0.25, alpha = 0.5)

The final dataset we need for this practical is a file called ‘Distances.csv’, which you can also find in the ‘Data’ folder. The data is non-spatial (.csv), so we can just use the read.csv() function to read it into memory:

## Read in Distances.csv
distances <- read.csv("Data/Distances.csv")

Let’s take a look at distances:

## Use head() to print the first few rows of any data frame or tabular dataset
head(distances)
##         rcID  lsoa11cd distance
## 1 RC_EW_2713 E01006412    17012
## 2 RC_EW_2713 E01006413    17162
## 3 RC_EW_2713 E01006414    16856
## 4 RC_EW_2713 E01006415    16782
## 5 RC_EW_2713 E01006416    18431
## 6 RC_EW_2713 E01006417    16598

So the data contains three columns:

- rcID - the ID of the Retail Centre
- lsoa11cd - the 2011 Census code of the LSOA
- distance - the distance between that Retail Centre and that LSOA

Notice how the first few rows have the same rcID value, as each row contains a distance to a different LSOA. These distances have been precomputed for you and packaged in the Distances.csv file; however, if you are interested in how to calculate your own set of point-to-point distances, please visit the documentation for the route_matrix() function from the hereR package (https://www.rdocumentation.org/packages/hereR/versions/0.3.0/topics/route_matrix).
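If you only need a rough approximation, you can also compute straight-line (Euclidean) distances yourself using sf. The sketch below is purely illustrative and is not how Distances.csv was produced (those are precomputed network distances); it assumes the rc and lsoa objects loaded above, that both layers have valid CRS definitions, and that geometric LSOA centroids are an acceptable stand-in for LSOA locations:

## ILLUSTRATIVE SKETCH: straight-line rc-to-LSOA distances (not the network distances in Distances.csv)
# lsoa_bng <- st_transform(lsoa, st_crs(rc))           ## put both layers in the same CRS
# lsoa_cents <- st_centroid(lsoa_bng)                  ## geometric centroid of each LSOA
# d <- st_distance(rc, lsoa_cents)                     ## distance matrix: rows = Retail Centres, columns = LSOAs
# euclid_distances <- data.frame(
#   rcID = rep(rc$rcID, times = nrow(lsoa_cents)),     ## the matrix is flattened column by column
#   lsoa11cd = rep(lsoa_cents$lsoa11cd, each = nrow(rc)),
#   distance = as.numeric(d))                          ## distances in metres (British National Grid)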

The final step we want to take is to join these distances to the Retail Centre data (rc) we have in our environment. I am going to use pipes in the next step to perform an inner join between the Retail Centre data (rc) and the object containing the Retail Centre to LSOA distances (distances). An inner_join keeps only the rows that have a match in both objects, joining on the column(s) the two share. In this instance the inner_join will be performed using the rcID column, as both the rc and distances objects have that column in common. Following this, you will notice I convert the data to a data frame and select only the columns I am interested in keeping:

## Joining rc and distances together, and then tidy up the object (data.frame, selecting columns)
huff_input <- rc %>%
  inner_join(distances) %>%
  as.data.frame() %>%
  select(rcID, rcName,n.comp.units, RetailPark, hierarchy, lsoa11cd, distance)

Note: if you would rather not use pipes, you can deconstruct the above code as below and obtain the same output:

## Joining rc and distances together, and then tidy up the object (data.frame, selecting columns) NO PIPES
# huff_input <- inner_join(rc, distances)
# huff_input <- as.data.frame(huff_input)
# huff_input <- select(huff_input, rcID, rcName, n.comp.units, RetailPark, hierarchy, lsoa11cd, distance)

Now that we have joined the Retail Centre data and distances together, let’s inspect the output:

## use head() to print the first few rows of any data frame or tabular dataset
head(huff_input)
##         rcID         rcName n.comp.units RetailPark hierarchy  lsoa11cd
## 1 RC_EW_2751 Liverpool City          257          N   primary E01006412
## 2 RC_EW_2751 Liverpool City          257          N   primary E01006413
## 3 RC_EW_2751 Liverpool City          257          N   primary E01006414
## 4 RC_EW_2751 Liverpool City          257          N   primary E01006415
## 5 RC_EW_2751 Liverpool City          257          N   primary E01006416
## 6 RC_EW_2751 Liverpool City          257          N   primary E01006417
##   distance
## 1    10563
## 2    10713
## 3    10407
## 4    12997
## 5    14708
## 6    12813

So the dataset contains:

- rcID and rcName - the ID and name of each Retail Centre
- n.comp.units - the number of comparison retail units in the Retail Centre
- RetailPark - whether or not the Retail Centre is a Retail Park (Y/N)
- hierarchy - the hierarchical position (primary/secondary/tertiary) derived in Part 1
- lsoa11cd - the LSOA code
- distance - the distance between that Retail Centre and that LSOA

Ok, you’re all set!

2. Setting up the Huff Model

The Huff model was first proposed by Huff (1964) and is one of the underpinning theoretical models in Retail Geography; it is calibrated using three main variables - distance, attractiveness and competition. The Huff model was used by Dolega et al. (2016) to delineate catchments for a set of Retail Centres for the UK, and the approach we use in this next section uses the same set of tools the authors used in that paper.
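For reference, in its standard form the Huff probability that a consumer at origin i patronises destination j is:

$$P_{ij} = \frac{A_j^{\alpha} / d_{ij}^{\beta}}{\sum_{k=1}^{n} \left( A_k^{\alpha} / d_{ik}^{\beta} \right)}$$

where A_j is the attractiveness of centre j, d_ij is the distance between origin i and centre j, alpha is the exponent applied to attractiveness, beta is the distance decay exponent, and the denominator sums over all n competing centres. This is consistent with the huff, sum_huff and huff_probability columns you will see in the model output later on.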

For the Huff model to run (using the huff_basic() function), we need as input:

- The names of the destination locations (the Retail Centres)
- An attractiveness score for each destination
- The names of the origin locations (the LSOAs)
- The pairwise distances between origins and destinations
- An alpha exponent, applied to the attractiveness scores
- A beta exponent, controlling distance decay

We have the names for destination locations (Retail Centre IDs) and the names for the origin locations (lsoa11cd), as well as the pairwise distances between the origins and destinations.

But we do not have:

- Attractiveness scores for each Retail Centre (Section 2a)
- A beta exponent (Section 2b)
- An alpha exponent (Section 2c)

2a. Attractiveness Scores

In this practical we will use the total number of comparison units as the measure of attractiveness. This is not ideal, as attractiveness would usually be a pre-computed value based on a number of other parameters; however, for the purposes of this practical the total number of comparison retailers is arguably an appropriate substitute.

However, we need to make one alteration to these scores: some of the secondary centres are large Retail Parks, which typically have a similar number of comparison units to other, smaller shopping centres, yet are usually much more attractive than that unit count suggests. We therefore need a slightly different attractiveness score for the Retail Parks.

In this next line of code we use mutate() and ifelse() statements again to create a new column, attr_score, where we set the attractiveness score of Retail Centres that are not Retail Parks to the number of comparison units, and double the number of comparison units to get a more representative attractiveness score for the Retail Parks. For more information on the structure of ifelse() statements in base R, visit: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/ifelse.

## Create attractiveness score column and modify based on whether Retail Centre is in a Retail Park or not
huff_input <- huff_input %>%
  mutate(attr_score = ifelse(RetailPark == "N", n.comp.units,
                             ifelse(RetailPark == "Y", n.comp.units * 2, 0)))
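If you prefer, an equivalent (optional) alternative uses case_when(), which we will meet again in Section 3b. This is just an alternative sketch, not part of the original script; note the multiplication by 1 so that both outcomes have the same type:

## OPTIONAL alternative using case_when()
# huff_input <- huff_input %>%
#   mutate(attr_score = case_when(RetailPark == "Y" ~ n.comp.units * 2,
#                                 TRUE ~ n.comp.units * 1))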

2b. Beta Exponent

The beta parameter (or distance decay exponent) is a vital component of the Huff model. It accounts for the tendency of Retail Centres at the top of the hierarchy (‘primary’ centres) to have a lower distance decay - i.e. their attractiveness still declines with distance, but at a slower rate than that of the smaller centres (‘secondary’ and ‘tertiary’). Typically this value would be estimated by a statistical model.

For now, we are going to use the default value (2), but bear in mind that this can be tweaked to modify the distance decay component, and even be used to apply different beta values depending on hierarchy.

Let’s go ahead and set the beta value as the default:

## Create column called beta, where the value = 2 for each Retail Centre
huff_input$beta <- 2

2c. Alpha Exponent

In this practical we are going to use the default alpha value of 1, which controls the exponent applied to the attractiveness score. So go ahead and set this:

## Create column called alpha, where the value = 1 for each Retail Centre
huff_input$alpha <- 1

3. Running the Huff Model - Calculating Huff Probabilities

Let’s take a look at our dataset, to make sure we have all the necessary inputs for the huff_basic() function:

## head() displays the first few rows of any data frame or tabular object
head(huff_input)
##         rcID         rcName n.comp.units RetailPark hierarchy  lsoa11cd
## 1 RC_EW_2751 Liverpool City          257          N   primary E01006412
## 2 RC_EW_2751 Liverpool City          257          N   primary E01006413
## 3 RC_EW_2751 Liverpool City          257          N   primary E01006414
## 4 RC_EW_2751 Liverpool City          257          N   primary E01006415
## 5 RC_EW_2751 Liverpool City          257          N   primary E01006416
## 6 RC_EW_2751 Liverpool City          257          N   primary E01006417
##   distance attr_score beta alpha
## 1    10563        257    2     1
## 2    10713        257    2     1
## 3    10407        257    2     1
## 4    12997        257    2     1
## 5    14708        257    2     1
## 6    12813        257    2     1

Let’s extract only the columns we need; in this case these are:

- rcName - the Retail Centre names (the destinations)
- attr_score - the attractiveness scores
- lsoa11cd - the LSOA codes (the origins)
- distance - the origin-destination distances
- alpha and beta - the exponents we set above

Let’s grab these columns:

## Select the columns we need for the Huff Model
huff_default <- huff_input %>%
  select(rcName, attr_score, lsoa11cd, distance, alpha, beta)

3a. Huff Model (Default Parameters)

## Fit the huff model, setting alpha to 1 and beta to 2 - these are the default parameters
huff_probs <- huff_basic(destinations_name = huff_default$rcName,
                         destinations_attractiveness = huff_default$attr_score,
                         origins_name = huff_default$lsoa11cd,
                         distance = huff_default$distance,
                         alpha = huff_default$alpha,
                         beta = huff_default$beta)

That’s it - you’ve successfully run the Huff Model! Let’s take a look at the output:

## head() displays the first few rows of any data frame or tabular object
head(huff_probs)
##   origins_name destinations_name distance         huff     sum_huff
## 1    E01006412    Liverpool City    10563 2.303343e-06 8.692141e-06
## 2    E01006413    Liverpool City    10713 2.239293e-06 8.859414e-06
## 3    E01006414    Liverpool City    10407 2.372914e-06 1.002442e-05
## 4    E01006415    Liverpool City    12997 1.521412e-06 5.951145e-06
## 5    E01006416    Liverpool City    14708 1.188026e-06 5.181425e-06
## 6    E01006417    Liverpool City    12813 1.565422e-06 6.138811e-06
##   huff_probability
## 1        0.2649914
## 2        0.2527586
## 3        0.2367133
## 4        0.2556503
## 5        0.2292855
## 6        0.2550041

Ok, so we have lots of the original information (origins_name, destinations_name and distance), but also some new columns. The column we are interested in is the huff_probability column, which contains a patronage probability for every combination of Retail Centre and LSOA in the dataset.

We want to extract the highest huff probability for each LSOA, in effect allocating that LSOA to that Retail Centre’s catchment (Note: each LSOA has a probability for every Retail Centre, as in the real world catchments can overlap substantially). To do this, we use the select_by_probs() function loaded into your environment from the huff-tools.R file. The function takes as input the output of the Huff model and a value (x) that selects the top x huff probabilities for each LSOA. In this case we set x to 1, to extract the single highest huff probability for each LSOA.

## Extract the highest huff probability in each LSOA
top_probs <- select_by_probs(huff_probs, 1)
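If you are curious what this is doing, a rough dplyr equivalent (a sketch for intuition only, not the actual huff-tools.R implementation; the columns kept may differ) is to group the Huff output by origin and keep the row with the largest probability:

## SKETCH: roughly what select_by_probs(huff_probs, 1) gives you, in dplyr terms
# huff_probs %>%
#   group_by(origins_name) %>%
#   slice_max(huff_probability, n = 1) %>%
#   ungroup() %>%
#   arrange(desc(huff_probability))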

Let’s take a look:

## head displays the top few rows of a dataframe or any tabular object
head(top_probs)
##   origins_name        destinations_name distance huff_probability
## 1    E01007198               West Kirby      107        0.9992933
## 2    E01007128               Birkenhead      150        0.9973848
## 3    E01033094                Southport      709        0.9970634
## 4    E01006743 New Mersey Shopping Park      242        0.9951654
## 5    E01007217                 Wallasey      156        0.9947344
## 6    E01006968                Southport     1002        0.9939574

So, the top_probs object contains the highest huff probabilities for each LSOA, sorted (in descending order) by huff probability. Notice how each huff probability also corresponds to a destinations_name or Retail Centre - we can therefore interpret this probability to be the probability someone from that LSOA (e.g. E01007198) will patronise that Retail Centre (e.g. West Kirby).

However, we also want to remove any LSOAs with low huff probabilities, as otherwise there will be some unusual results, such as LSOAs in South Wirral being assigned to the Liverpool City Retail Centre. We can do this with the filter() function. Below we create two objects, one keeping only huff probabilities of 60% or more (top_probs_60) and one keeping probabilities of 40% or more (top_probs_40), to illustrate the differences between the two:

## Remove LSOAs with huff probs < 60%
top_probs_60 <- filter(top_probs, huff_probability >= 0.6)

top_probs_40 <- filter(top_probs, huff_probability >= 0.4)

3b. Huff Model (specification of alpha and beta)

We can also apply different beta values to control the strength of distance decay at different levels of the hierarchy. The values can be altered depending on the requirements, estimation technique and so on. The values we use in this next section were found to produce the most appropriate catchment areas at the national level. These may, of course, vary at the city level for a number of reasons: a) the number of competitors within a single city is smaller, b) modelling a single city violates the boundary-free assumption of the model, and c) the retail hierarchy within a single city may differ from the national one.

So let’s assign some beta values to each of the Retail Centre hierarchies:

## Create beta parameters, using mutate() and case_when()
huff_input <- huff_input %>%
  dplyr::mutate(beta = dplyr::case_when(hierarchy == "primary" ~ 1.9,
                                          hierarchy == "secondary" ~ 2.0,
                                            hierarchy == "tertiary" ~ 2.1))

If you don’t want to use pipes, you can run this line of code instead:

## Create beta parameters, using mutate() and case_when() NO PIPES
# huff_input <- mutate(huff_input, beta = case_when(hierarchy == "primary" ~ 1.9,
#                                           hierarchy == "secondary" ~ 2.0,
#                                             hierarchy == "tertiary" ~ 2.1) )

Check that the beta components have been assigned:

## use head() to print the first few rows of any data frame or tabular dataset
head(huff_input)
##         rcID         rcName n.comp.units RetailPark hierarchy  lsoa11cd
## 1 RC_EW_2751 Liverpool City          257          N   primary E01006412
## 2 RC_EW_2751 Liverpool City          257          N   primary E01006413
## 3 RC_EW_2751 Liverpool City          257          N   primary E01006414
## 4 RC_EW_2751 Liverpool City          257          N   primary E01006415
## 5 RC_EW_2751 Liverpool City          257          N   primary E01006416
## 6 RC_EW_2751 Liverpool City          257          N   primary E01006417
##   distance attr_score beta alpha
## 1    10563        257  1.9     1
## 2    10713        257  1.9     1
## 3    10407        257  1.9     1
## 4    12997        257  1.9     1
## 5    14708        257  1.9     1
## 6    12813        257  1.9     1

Now that we have specified the beta values we want to use, we can run the huff model - still using the default alpha parameter - but this time with the beta values that reflect the hierarchical position of the Retail Centre.

## Fit the huff model, setting alpha to 1 and beta to the beta column, containing the values we set earlier
hierarchy_huff_probs <- huff_basic(destinations_name = huff_input$rcName,
                         destinations_attractiveness = huff_input$attr_score,
                         origins_name = huff_input$lsoa11cd,
                         distance = huff_input$distance,
                         alpha = huff_input$alpha,
                         beta = huff_input$beta)

## Extract the highest huff probability in each LSOA
hierarchy_top_probs <- select_by_probs(hierarchy_huff_probs, 1)

## Remove those with < 60% huff probability
hierarchy_top_probs_60 <- filter(hierarchy_top_probs, huff_probability >= 0.6)
## Remove those with < 40% huff probability
hierarchy_top_probs_40 <- filter(hierarchy_top_probs, huff_probability >= 0.4)

Inspect the output:

# head() displays the first few rows of a data frame or tabular object
head(hierarchy_top_probs_60)
##   origins_name destinations_name distance huff_probability
## 1    E01007198        West Kirby      107        0.9986213
## 2    E01033094         Southport      709        0.9982509
## 3    E01006968         Southport     1002        0.9965177
## 4    E01007128        Birkenhead      150        0.9960710
## 5    E01033097         Southport     1099        0.9960153
## 6    E01033756    Liverpool City      551        0.9957745

4. Mapping the Output

Now that we have our results for the two huff models, we can convert these to a spatial format. First, let’s tidy up the top_probs_60 and top_probs_40 objects by changing some of the column names and dropping the distance column:

## Tidy up top_probs 60 & 40
top_probs_60 <- top_probs_60 %>%
  rename(lsoa11cd = origins_name, rcName = destinations_name) %>%
  select(lsoa11cd, rcName, huff_probability)
top_probs_40 <- top_probs_40 %>%
  rename(lsoa11cd = origins_name, rcName = destinations_name) %>%
  select(lsoa11cd, rcName, huff_probability)

And the same for the Huff model accounting for hierarchy:

## Tidy up the output of the huff model using different beta values for hierarchy PIPED
hierarchy_top_probs_60 <- hierarchy_top_probs_60%>%
  rename(lsoa11cd = origins_name, rcName = destinations_name) %>%
  select(lsoa11cd, rcName, huff_probability)
hierarchy_top_probs_40 <- hierarchy_top_probs_40%>%
  rename(lsoa11cd = origins_name, rcName = destinations_name) %>%
  select(lsoa11cd, rcName, huff_probability)

Now we are ready to convert the results to a spatial format. We are going to do this by joining the top_probs/hierarchy_top_probs objects onto the empty lsoa sf object we read in at the start of the practical. The lsoa object contains a column with the LSOA codes (lsoa11cd), as do the top_probs/hierarchy_top_probs objects, so we can perform a simple merge to join the two together; any LSOAs removed by the probability filter above will simply not be matched:

## Merge the results of the huff model with default parameters onto the LSOAs - 40 & 60%
lcr_huff_60 <- merge(lsoa, top_probs_60, by = "lsoa11cd")
lcr_huff_40 <- merge(lsoa, top_probs_40, by = "lsoa11cd")

## Merge on the results of the huff model accounting for hierarchy with beta values onto the LSOAs - 40 & 60%
lcr_hierarchy_huff_60 <- merge(lsoa, hierarchy_top_probs_60, by = "lsoa11cd")
lcr_hierarchy_huff_40 <- merge(lsoa, hierarchy_top_probs_40, by = "lsoa11cd")

Now we have two sf objects of LSOAs for Liverpool City Region, containing the huff probabilities to the Retail Centre most likely patronised from each LSOA, the first based on a model using default parameters and the second using differing beta values to account for hierarchy. Let’s take a look:

## Huff probabilities (default parameters)
head(lcr_huff_60)
## Simple feature collection with 6 features and 3 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 342962 ymin: 390236.8 xmax: 346055.5 ymax: 392660.3
## projected CRS:  Transverse_Mercator
##    lsoa11cd rcName huff_probability                       geometry
## 1 E01006443 Huyton        0.8489124 MULTIPOLYGON (((343879.5 39...
## 2 E01006451 Huyton        0.6385644 MULTIPOLYGON (((343057.1 39...
## 3 E01006452 Huyton        0.6155942 MULTIPOLYGON (((343538.8 39...
## 4 E01006465 Huyton        0.6366054 MULTIPOLYGON (((344845.2 39...
## 5 E01006474 Huyton        0.7501580 MULTIPOLYGON (((343694.4 39...
## 6 E01006477 Huyton        0.6198132 MULTIPOLYGON (((345436 3907...
## Huff probabilities (beta values set to reflect hierarchy)
head(lcr_hierarchy_huff_60)
## Simple feature collection with 6 features and 3 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 335045.4 ymin: 389126 xmax: 345603.9 ymax: 391977.7
## projected CRS:  Transverse_Mercator
##    lsoa11cd         rcName huff_probability                       geometry
## 1 E01006443         Huyton        0.6891300 MULTIPOLYGON (((343879.5 39...
## 2 E01006481         Huyton        0.7544168 MULTIPOLYGON (((344889.1 39...
## 3 E01006482         Huyton        0.7485383 MULTIPOLYGON (((344373 3915...
## 4 E01006512 Liverpool City        0.9477617 MULTIPOLYGON (((336103.4 38...
## 5 E01006513 Liverpool City        0.9797344 MULTIPOLYGON (((335173.8 38...
## 6 E01006514 Liverpool City        0.9674695 MULTIPOLYGON (((335495.7 38...

We can map the output, using the rcName column in the tm_fill() argument to show which LSOAs have been allocated to which Retail Centres - first let’s do it for the Huff model using the default parameters.

NOTE: As we are keeping only huff probabilities of 60% or more, some LSOAs will be missing from the map; these are the LSOAs with huff probabilities below 60%.
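If you want a quick check of how many LSOAs fall below the 60% threshold (and will therefore be missing from the map), you can compare the row counts of the two objects:

## Quick check: number of LSOAs dropped by the 60% threshold
# nrow(lsoa) - nrow(lcr_huff_60)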

## Map the allocation of LSOAs to Retail Centres - notice the additional arguments in tm_layout() to move the legend outside the map frame, and tm_borders() to show LSOA boundaries clearly
## Plot the centroids over these too
tm_shape(lcr_huff_60) +
  tm_fill(col = "rcName", alpha = 0.4) +
  tm_borders(col = "black", lwd = 0.25, alpha = 0.2) +
  tm_layout(legend.outside = TRUE, legend.outside.position = "right") +
  tm_shape(rc) +
  tm_dots(col = "orange", size = 0.05) +
  tm_text("rcName")

What is evident from this map is that the larger and more attractive Retail Centres (e.g. Liverpool City, Southport) have a greater number of LSOAs assigned, whereas smaller ‘tertiary’ centres like Bootle and Allerton Road have much smaller huff catchments.

We can now also visualise the results of the huff model that used different beta values for the hierarchies:

## Map the allocation of LSOAs to Retail Centres - notice the additional arguments in tm_layout() to move the legend outside the map frame, and tm_borders() to show LSOA boundaries clearly
tm_shape(lcr_hierarchy_huff_60) +
  tm_fill(col = "rcName", alpha = 0.4) +
  tm_borders(col = "black", lwd = 0.25, alpha = 0.2) +
  tm_layout(legend.outside = TRUE, legend.outside.position = "right") +
  tm_shape(rc) +
  tm_dots(col = "orange", size = 0.05) +
  tm_text("rcName")

4.1 Plot 40% maps (optional exercise)

Using the huff model outputs where we select probabilities over 40% (lcr_huff_40, lcr_hierarchy_huff_40), produce maps to compare the allocation of LSOAs to retail centres when huff probabilities over 40% are retained, in comparison to those over 60%.

5. Extracting Huff Catchments

The final part of this practical uses the huff probabilities and the allocation of LSOAs to Retail Centres to generate Huff catchments for each Retail Centre. We are now just going to work with the results of the Huff model that used the default alpha and beta values (lcr_huff_60), but feel free to run through this next section with the outputs of the Huff model that used the hierarchy-specific beta values, to see the differences.

In this first part, we show how this can be done for one Retail Centre - Huyton. We extract the LSOAs for the Huyton Retail Centre easily using the filter function, and then map the extent of the catchment identified by the Huff model.

## Extract LSOAs and catchment for the Huyton Retail Centre
huyton <- filter(lcr_huff_60, rcName == "Huyton")

And extract the corresponding Retail Centre (centroid) too:

## Extract the Huyton Retail Centre Centroid
huyton_rc <- filter(rc, rcName == "Huyton")

Now we can map these LSOAs (tm_fill) and the Huyton Retail Centre Centroid (tm_dots):

## Map the Huyton Retail Centre and Huff Catchment
tm_shape(huyton) +
  tm_fill(col = "orange", alpha = 0.5) +
  tm_borders(col = "black", lwd = 0.25, alpha = 0.2) +
  tm_shape(huyton_rc) +
  tm_dots(col = "orange", size = 0.25) +
  tm_text("rcName", size = 0.75)

But what if we wanted to extract the catchments for every Retail Centre? This can easily be done in R - in a standard GIS package the equivalent operation would be a dissolve, where you specify the Retail Centre IDs and it returns a dissolved set of LSOAs for each unique value in the Retail Centre ID column.

The equivalent in R is to use the group_by() and summarise() functions together, chained here with a pipe. We specify that we want to group by Retail Centre name (rcName) and then add the summarise() command, which produces one row per Retail Centre and returns a set of dissolved polygons. We have also added an additional argument to summarise() which counts the number of LSOAs in each catchment (n.lsoa).

Let’s implement this below:

## Extract Huff catchments for each Retail Centre
catchments <- lcr_huff_60 %>%
  group_by(rcName) %>%
  summarise(n.lsoa = n())
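As with the earlier steps, the pipe is just a convenience; a non-piped equivalent (for reference) would be:

## Extract Huff catchments for each Retail Centre NO PIPES
# catchments <- summarise(group_by(lcr_huff_60, rcName), n.lsoa = n())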

What does it look like?

## head() displays the first few rows of any dataframe or tabular object
head(catchments)
## Simple feature collection with 6 features and 2 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: 327048.5 ymin: 385942 xmax: 344662.8 ymax: 410033.4
## projected CRS:  Transverse_Mercator
## # A tibble: 6 x 3
##   rcName              n.lsoa                                            geometry
##   <chr>                <int>                                       <POLYGON [m]>
## 1 Aintree Retail and~      5 ((336680 398495.3, 336680.6 398497.5, 336700.2 398~
## 2 Allerton Road            6 ((339206.3 387899.2, 339198.4 387907.6, 339183 387~
## 3 Belle Vale Shoppin~      6 ((343710.8 389237.7, 343710.9 389237.7, 343810.7 3~
## 4 Birkenhead              21 ((330505 386555, 330500.9 386554.2, 330500.6 38655~
## 5 Bootle                  11 ((335037.4 396037.7, 335037.7 396037.8, 335044.6 3~
## 6 Formby                  13 ((328941.2 406434.6, 328940.1 406434.3, 328939 406~

So we have one row for each Retail Centre, with a dissolved set of LSOA polygons that form the Huff catchment for each centre. Let’s map these to see what they look like:

## Map the Huff Catchments for the Retail Centres and overlay Retail Centres
tm_shape(catchments) +
  tm_fill(col = "rcName", alpha = 0.4) +
  tm_borders(col = "black", lwd = 0.25, alpha = 0.2) +
  tm_shape(rc) +
  tm_dots(size = 0.015, alpha = 0.1, col = "orange") +
  tm_text("rcName", size = 0.75)+
  tm_layout(legend.outside = TRUE, legend.outside.position = "right")

That’s it - you now have Huff catchments for the Liverpool City Region Retail Centres. Be sure to write them out:

## Write out your catchments
# st_write(catchments, "Data/LCR_RC_Huff_Catchments.gpkg")

5.1 Extracting other catchments (optional exercise)

If you are interested, you can repeat this section to extract catchments for the huff model that account for hierarchy (lcr_hierarchy_huff_60), and also for the huff model outputs where we kept probabilities over 40% (lcr_huff_40, lcr_hierarchy_huff_40).

Summary

That’s it! That’s Part 2 of this practical completed; by now you should have a good understanding of:

- How the Huff model estimates patronage probabilities from distance, attractiveness and the competition between centres
- How to set up the inputs for huff_basic(), including attractiveness scores and the alpha and beta exponents
- How to run the Huff model, extract the top huff probability for each LSOA and map the resulting allocations
- How to dissolve the allocated LSOAs into Huff catchments for each Retail Centre


This practical was conceptualised and written using R and RStudio by Patrick Ballantyne and Les Dolega. This version was created on 14 May 2021.

The latest version of the workbook is available from https://data.cdrc.ac.uk/dataset/advanced-gis-methods-training-retail-centres-and-catchment-areas and https://github.com/patrickballantyne/CDRC-Retail-Catchment-Training.

Thanks to Dr. Nick Bearman for his assistance and advice in putting these materials together.

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/deed.en.