Neighbourhoods are more than just where people live – they are spaces of movement, interaction and access. Traditional geodemographic classifications, such as the UK Output Area Classification (OAC), provide valuable but static portraits of residents at night-time addresses. They cannot show how neighbourhoods actually function during the day or how people connect to them.
The GCI dataset provides the first geodemographic classification of neighbourhood interactivity. Uniquely, it integrates dynamic mobility, connectivity and accessibility indicators with conventional Census-based socio-economic measures. This creates a richer picture of how places in Greater London operate over the 24-hour day.
The classification is built at Output Area (OA) level for Greater London using:
- Census 2021 variables on age, household composition, tenure, occupation and health.
- Connectivity and footfall metrics derived from anonymised, GPS-based origin–destination flows and diurnal activity patterns recorded between 2016 and 2019 (approximately 380,000 mobile users and 21 million journeys).
- Service accessibility measures based on multimodal travel times to jobs, schools, hospitals and supermarkets (OpenStreetMap + GTFS).
Using these data, OAs were segmented into seven distinct clusters of neighbourhood types. Each captures a unique combination of socio-economic composition, daytime/night-time activity, transport connectivity and service access. Full pen portraits are available in the technical report, and interactive maps of the clusters are available through Mapmaker.
This classification extends what is possible using Census data alone. By linking georeferenced, anonymised mobility records to neighbourhood characteristics, the GCI captures how neighbourhoods function and who they serve, rather than relying solely on where residents sleep at night.
This independent, ethically approved GeoDS Research Ready Data product demonstrates a practical framework for integrating dynamic indicators into geodemographic classification. It enables planners, researchers and policy-makers to explore accessibility gaps, daytime population flows and the night-time economy at a much finer scale than before – all while maintaining the highest standards of data protection and ethical research practice.
Content
The data are provided in CSV format at OA level for Greater London. Additional resources, including a detailed glossary of variables, cluster descriptions and a technical report, are also available for download.
The classification applies K-Means clustering to decile-scaled input variables (52 indicators across two domains) to produce stable and interpretable cluster assignments. All processing steps and methodological details are documented in the accompanying technical report.
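As an illustration of the general technique only, the sketch below decile-scales two toy indicators and groups the rows with a basic K-Means (Lloyd's algorithm) written in standard-library Python. It is not the GCI pipeline: the indicator names, the synthetic values, the random initialisation and the iteration limit are all assumptions made for the example, and the technical report remains the reference for the actual method. In practice a library implementation such as scikit-learn's `KMeans` would normally be used.

```python
import random
from bisect import bisect_right


def decile_scale(values):
    """Map each value to a decile rank from 1 (lowest) to 10 (highest)."""
    ordered = sorted(values)
    n = len(ordered)
    # Nine cut points at roughly the 10th, 20th, ..., 90th percentiles.
    cuts = [ordered[min(n - 1, (n * k) // 10)] for k in range(1, 10)]
    # Heavily tied values can make the deciles unequal in size.
    return [bisect_right(cuts, v) + 1 for v in values]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iterations=50, seed=0):
    """Basic Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumption: centroids start from k randomly sampled rows.
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each row joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(row, centroids[c]))
            for row in rows
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Synthetic stand-in data: hypothetical indicator names, not GCI variables.
rng = random.Random(42)
raw = {
    "share_aged_65_plus": [rng.random() for _ in range(200)],
    "daytime_footfall": [rng.lognormvariate(0, 1) for _ in range(200)],
}

# Decile-scale each indicator, then build one row per area.
scaled_columns = [decile_scale(col) for col in raw.values()]
rows = [list(r) for r in zip(*scaled_columns)]

# Seven clusters, matching the number reported for the classification.
labels, centroids = kmeans(rows, k=7)
```

Decile scaling puts every indicator on the same 1 to 10 range, so a heavily skewed variable (such as the toy footfall column here) does not dominate the distance calculation used by K-Means.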
Quality, Representation and Bias
The GCI includes variables based on anonymised GPS-derived movement records collected from a sample of London residents between 2016 and 2019. While the sample is generally representative, it covers only a proportion of total movements and excludes some groups (e.g., those without smartphones). Cross-app and out-of-London movements are not observed, which may lead to underestimation of some flows. Nonetheless, the main patterns of movement and connectivity across London are well captured, making the dataset suitable for developing a robust classification. Further methodological details and validation results are provided in the accompanying technical report.