London aggregated in-app Origin-Destination Flows

The aggregated origin–destination (OD) flow dataset provides a detailed view of human mobility patterns across Greater London in 2019. It is derived from individual-level mobile phone location data and captures movements between Middle layer Super Output Areas (MSOAs, 2011 boundaries) by hour of the day. Each flow represents the number of individuals moving from one MSOA to another within a two-hour threshold, allowing researchers to explore both the volume and direction of interactions between neighbourhoods. Individual trajectories are aggregated to MSOA level and low-count flows are removed for privacy and disclosure control, making the dataset suitable for research and policy applications.

This temporal and spatial granularity enables the study of commuting patterns, local activity, and broader urban connectivity, providing insights into how the functional roles of areas shift throughout the day. The dataset is particularly valuable for applications in transport planning, urban analytics, geodemographics and social research, as it moves beyond static residential or workplace data to capture the dynamic interactions that shape city life.

Content

The dataset is provided in CSV format and contains aggregated origin–destination flows between MSOAs in Greater London. Each row represents the sum of flows from an origin MSOA to a destination MSOA during a specified hour of the day. The dataset includes the following fields: origin MSOA, destination MSOA, hour of day, and the total number of flows. Flows of fewer than ten have been removed to ensure disclosure control.
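As a rough illustration of the row structure described above, the following sketch parses a small in-memory CSV and checks each row against the stated constraints. The column names (`origin_msoa`, `destination_msoa`, `hour`, `flows`), the 0 to 23 hour range, and the sample values are assumptions made for illustration; the published file may use a different header and encoding.

```python
import csv
import io

# Hypothetical header and made-up values; the published CSV may differ.
SAMPLE_CSV = """origin_msoa,destination_msoa,hour,flows
E02000001,E02000977,8,42
E02000977,E02000001,17,37
E02000001,E02000001,12,15
"""

def read_od_flows(handle):
    """Parse OD rows into dicts with an integer hour and flow count."""
    rows = []
    for rec in csv.DictReader(handle):
        rows.append({
            "origin": rec["origin_msoa"],
            "destination": rec["destination_msoa"],
            "hour": int(rec["hour"]),
            "flows": int(rec["flows"]),
        })
    return rows

def check_rows(rows, min_count=10):
    """Return rows below the disclosure threshold or with an hour outside 0-23.

    The 0-23 hour range is an assumption, not stated in the source.
    """
    return [r for r in rows
            if r["flows"] < min_count or not 0 <= r["hour"] <= 23]

rows = read_od_flows(io.StringIO(SAMPLE_CSV))
bad = check_rows(rows)
print(len(rows), "rows,", len(bad), "violations")
```

To run the same check on the real file, pass an open file handle to `read_od_flows` instead of the `io.StringIO` sample, after adjusting the column names to match the actual header.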

The data are organised to allow analysis of interactions at an hourly level across all MSOAs, enabling comparisons between areas, examination of peak movement times, and exploration of connectivity patterns within the city. The structured format supports straightforward integration with other spatial or demographic datasets for research purposes.
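The kinds of hourly analysis mentioned here can be sketched with the standard library alone. The example below computes total flows per hour, picks the peak hour, and derives a net inflow per area for a given hour. The tuple layout and the area labels `A`, `B`, `C` are placeholders rather than real MSOA codes or values.

```python
from collections import defaultdict

# Illustrative (origin, destination, hour, flows) tuples, not real data.
flows = [
    ("A", "B", 8, 120),
    ("C", "B", 8, 60),
    ("B", "A", 17, 110),
    ("B", "C", 17, 50),
    ("A", "C", 12, 20),
]

def hourly_totals(records):
    """Sum flow counts over all OD pairs for each hour."""
    totals = defaultdict(int)
    for _origin, _dest, hour, count in records:
        totals[hour] += count
    return dict(totals)

def net_inflow(records, hour):
    """Inflow minus outflow per area for one hour, ignoring within-area flows."""
    net = defaultdict(int)
    for origin, dest, h, count in records:
        if h == hour and origin != dest:
            net[dest] += count
            net[origin] -= count
    return dict(net)

totals = hourly_totals(flows)
peak_hour = max(totals, key=totals.get)
morning_net = net_inflow(flows, peak_hour)
print(peak_hour, morning_net)
```

A positive net inflow marks an area that attracts more movement than it generates in that hour, which is one simple way to examine how the role of an area changes through the day. Because flows below ten are suppressed, such totals undercount true movement.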

Quality, Representation and Bias

The dataset provides a consistent spatial representation of mobility patterns across Greater London, but several factors affect its quality and coverage. Location data are derived from a self-selecting sample of mobile app users, so there may be socio-demographic biases in the representation of the population. Certain groups, such as children, the elderly, individuals without smartphones, or those with non-standard work patterns (e.g., shift workers, gig economy workers), may be under-represented. Conversely, areas with high smartphone activity, such as commercial or retail zones, may be over-represented, potentially exaggerating flow volumes.

OD flows reflect sequential movements between detected activity locations but cannot fully distinguish interim stops from final destinations. While the data capture major commuting and activity patterns, they may not fully reflect the complexity of individual routines, such as multiple short visits, dispersed work locations, or irregular mobility.

A two-hour cut-off is applied to define a flow. This balances capturing meaningful movements against excluding unrealistic gaps, but may omit longer legitimate trips or flows from slower-moving populations. Lastly, flows are aggregated to MSOAs to mitigate GPS inaccuracies and preserve privacy, but this introduces the modifiable areal unit problem (MAUP): aggregation may obscure finer-grained spatial patterns, and alternative representations (e.g., hexagonal grids or interaction-based zones) could reveal different connectivity structures.
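The general idea of a time cut-off followed by aggregation and low-count suppression can be sketched as below. This is not the data provider's actual pipeline. It assumes that a flow is formed from two consecutive detected locations of one device when the gap between their timestamps is at most two hours, that the flow is assigned to the hour of the origin observation, and that counts are distinct devices per origin, destination and hour. All names and values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

MAX_GAP = timedelta(hours=2)   # two-hour cut-off described in the text
MIN_COUNT = 10                 # disclosure threshold described in the text

def device_flows(visits):
    """Turn one device's time-sorted (timestamp, area) visits into flows.

    Assumption: consecutive visits form a flow if the gap is at most MAX_GAP,
    and the flow is assigned to the hour of the first visit.
    """
    out = []
    for (t0, origin), (t1, dest) in zip(visits, visits[1:]):
        if t1 - t0 <= MAX_GAP:
            out.append((origin, dest, t0.hour))
    return out

def aggregate(visits_by_device, min_count=MIN_COUNT):
    """Count distinct devices per (origin, destination, hour), then suppress."""
    devices = defaultdict(set)
    for device_id, visits in visits_by_device.items():
        for key in device_flows(sorted(visits)):
            devices[key].add(device_id)
    return {k: len(v) for k, v in devices.items() if len(v) >= min_count}

day = datetime(2019, 3, 4)
visits_by_device = {}
# Twelve devices go A -> B within one hour, then reach C four hours later.
for i in range(12):
    visits_by_device[f"dev{i}"] = [
        (day + timedelta(hours=8), "A"),
        (day + timedelta(hours=9), "B"),
        (day + timedelta(hours=13), "C"),
    ]
# Three devices go A -> C within one hour (below the threshold).
for i in range(12, 15):
    visits_by_device[f"dev{i}"] = [
        (day + timedelta(hours=10), "A"),
        (day + timedelta(hours=11), "C"),
    ]

result = aggregate(visits_by_device)
print(result)
```

In this toy example the B to C movement is dropped by the two-hour cut-off and the A to C movement is dropped by suppression, which shows how both rules can remove legitimate trips from the published counts.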

The considerations outlined above are intended to guide correct interpretation and use of the data; the dataset remains a valuable resource. It preserves the spatial and temporal granularity of origin–destination movements better than standard aggregated products, enabling analysis of hourly flows, connectivity patterns, and urban interactions. While it does not capture the entire population, the dataset provides a privacy-safe approximation of mobility trends and fills a gap in available data products by offering detailed origin–destination flows derived directly from individual-level mobility data. This makes it particularly useful for research, policy analysis, and urban planning applications.

Open

Additional Info

Source: Huq
Author: Mavrogeni, Mikaella
Maintainer: Maurizio Gibin
Last Updated: November 24, 2025, 13:58 (UTC)
Created: November 5, 2025, 17:25 (UTC)
Attribution: The data for this research have been provided by the Geographic Data Service (geods.ac.uk), a Smart Data Research UK Investment: ES/Z504464/1.
Frequency: Snapshot
Granularity: MSOA11CD
Spatial Coverage: Greater London
Temporal Coverage: January 2019 to December 2019