Simulating Urban Flows

Nick Malleson, Andy Evans, Jon Ward & Tomas Crols

Schools of Geography & Mathematics, University of Leeds

These slides:

How many people are there in Trafalgar Square right now?

We need to quantify the ambient population and better understand urban flows:

Crime – how many possible victims?

Pollution – who is being exposed? Where are the hotspots?

Economy – can we attract more people to our city centre?

Health – can we encourage more active travel?

Simulating Urban Flows (surf) - 3-year research project funded by the ESRC


Leeds Institute for Data Analytics (LIDA)

Urban Analytics research stream - very broad!

Focus on population flows and the ambient population

Machine learning approaches

Tracking people (HABITS)

Towards a real-time city simulation ...

Data assimilation

Agent-Based modelling

Modelling Footfall with Machine Learning

Locations of CCTV cameras
Locations of the footfall cameras in Leeds


8 cameras installed between 2007 and 2009

Track movement of people through their field of vision

Provide counts of number of passers-by per hour

Cover a relatively small area of the city centre, so not well suited to capturing wider dynamics.


Analysis of changes in footfall patterns over time

A model of footfall, able to quantify the success of events

Indications of most important drivers

Modelling Footfall with ML

People in the rain

Explanatory factors

Bank Holidays

What type of bank holiday?

School and University Holidays

The weather

Mean temperature, wind speed and rainfall

Day of week

Many others (not tested yet...)

Train prices, car parking availability, business opening times, etc.
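
The explanatory factors listed above could be turned into one numeric row per day before being passed to a model. The following is a minimal sketch only: the function name, field names and weather values are hypothetical, since the slides do not give the project's actual feature set or encoding.

```python
from datetime import date

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def encode_day(day, mean_temp_c, wind_speed, rain_mm, bank_holiday, school_holiday):
    """Return a dict of numeric features for one day (hypothetical schema)."""
    features = {
        "mean_temp_c": float(mean_temp_c),
        "wind_speed": float(wind_speed),
        "rain_mm": float(rain_mm),
        "bank_holiday": int(bank_holiday),
        "school_holiday": int(school_holiday),
    }
    # One-hot encode the day of week; date.weekday() returns 0 for Monday.
    for i, name in enumerate(DAYS):
        features["dow_" + name] = int(day.weekday() == i)
    return features


# Illustrative values only (not real weather data).
row = encode_day(date(2014, 7, 5), mean_temp_c=18.0, wind_speed=4.0,
                 rain_mm=0.0, bank_holiday=False, school_holiday=False)
```

One-hot encoding the day of week avoids implying that Sunday is "larger" than Monday, although tree-based models can often work with an integer code as well.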

Modelling Footfall with ML

Errors from the random forest model

Machine Learning Methods

LIDA Intern - Molly Asher

Ended up supervising me ...

Attempted a number of different methods

Mainly neural networks and random forests

Random forest was most accurate

Modelling Footfall with ML

Feature Importance

Variable                 Relative Importance
Mean daily temperature   1142
Mean daily rainfall      383
After Trinity opened     123
School holiday           115

Modelling Footfall with ML

Predictive Analytics

Event              Date        Real Footfall   Prediction   Difference (%)
Tour de France     05-Jul-14   346,180         217,277      -37
Trinity Opening    21-Mar-13   279,473         187,381      -33
Xmas lights 2013   07-Nov-13   193,441         153,750      -21
Xmas lights 2015   12-Nov-15   175,126         160,105      -9
Light Night        06-Oct-16   225,660         198,025      -12
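
The Difference (%) column appears to be the prediction relative to the real count, i.e. (prediction - real) / real. That formula is an assumption, but it reproduces the tabulated values after rounding. A short sketch of the arithmetic, using the figures from the table:

```python
# (event, real footfall, model prediction), copied from the table above
events = [
    ("Tour de France", 346180, 217277),
    ("Trinity Opening", 279473, 187381),
    ("Xmas lights 2013", 193441, 153750),
    ("Xmas lights 2015", 175126, 160105),
    ("Light Night", 225660, 198025),
]


def percent_difference(real, predicted):
    """Signed difference of the prediction relative to the real count, in percent."""
    return 100.0 * (predicted - real) / real


differences = {name: percent_difference(real, pred) for name, real, pred in events}
for name, diff in differences.items():
    print(f"{name:<18}{diff:6.1f}%")
```

A negative value means the real footfall exceeded the prediction, which is one way to read how many extra people an event drew compared with what the model expected for that day.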

Modelling Footfall with ML


Still needs some refinement, but potentially a useful tool

There are caveats, and it's not so useful for more nuanced analysis

Easy to overfit - e.g. train with 365 days

Not generalisable (after Trinity opened)

What about bank holidays that fall during other holidays?

How should roads be re-configured to encourage pedestrians?

Where are most visitors coming from?

How have patterns of use in the city changed?

For this we need more detail about individual movements...


Improved policy to mitigate pollutant and inactivity related health burdens through new big data

Aim: Take new 'Track and Trace' (T&T) data generated from mobile phones to support new policies to:

Reduce the disease burden of pollution

Encourage active travel

Led by the Institute for Transport Studies, in collaboration with Newcastle City Council and funded by the ESRC

A more nuanced measure of population flows?

GoSmarter logo

Go Smarter

Smart-phone app built in collaboration with Newcastle City Council

Tracks people's journeys

Detects when the user is moving and estimates mode of travel

Rewards for using active / sustainable modes of travel

Aim: Demonstrate how the linking of high-resolution location data and other databases / models can support better policy making

Source: Park, Yoo Min, and Mei-Po Kwan (2017). Individual Exposure Estimates May Be Erroneous When Spatiotemporal Variability of Air Pollution and Human Mobility Are Ignored. Health & Place 43: 85–94.

Disease Burden of Pollution

Collaborating with the Newcastle Urban Observatory who are sensing the urban environment

Aim: use T&T data to model urban flows and identify the most serious pollution hotspots.

Source: Park, Yoo Min, and Mei-Po Kwan (2017). Individual Exposure Estimates May Be Erroneous When Spatiotemporal Variability of Air Pollution and Human Mobility Are Ignored. Health & Place 43: 85–94.

Data Caveats

T&T data are

High resolution (spatio-temporal)



How representative of the wider population?

Abundant enough?

Three pillars for modelling & forecasting in urban areas

Big Data - high-resolution information about urban dynamics

Smart Cities - responsive urban infrastructure & policy making

ABM - bring them together?

ABM Problems

1. Computationally Expensive

Not amenable to machine-led calibration

2. Data hungry

Need fine-grained information about individual actions and behaviours

3. Divergent

Usually models represent complex systems

Projections / forecasts quickly diverge from reality

3. Divergence

Complex systems

One-shot calibration

Nonlinear models predict the near future well, but diverge over time.

The process of calibration
Typical model development process

3. Divergence

Drawback with the 'typical' model development process

Waterfall-style approach is common

Calibrate until fitness is reasonable, then make predictions

But we can do better:

Better computers

More (streaming) data

Methodological gap

Dynamic Data Assimilation

Used in meteorology and hydrology to constrain models closer to reality.

Try to improve estimates of the true system state by combining:

Noisy, real-world observations

Model estimates of the system state

Should be more accurate than data / observations in isolation.
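
For a single quantity with Gaussian errors, combining a model estimate with an observation reduces to a variance-weighted average (the scalar Kalman update). This is a minimal sketch with made-up numbers; the function and variable names are illustrative and do not come from the project code.

```python
def combine(model_mean, model_var, obs, obs_var):
    """Scalar Kalman update: blend a model estimate with a noisy observation."""
    gain = model_var / (model_var + obs_var)  # weight given to the observation
    mean = model_mean + gain * (obs - model_mean)
    var = (1.0 - gain) * model_var
    return mean, var


# Hypothetical example: the model says 1000 people (sd 200),
# a camera says 1200 people (sd 100).
mean, var = combine(model_mean=1000.0, model_var=200.0 ** 2,
                    obs=1200.0, obs_var=100.0 ** 2)
```

The combined variance is smaller than both the model variance and the observation variance, which is the sense in which the combined estimate should be more accurate than either source alone.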

Ensemble Kalman Filter - Basic Process

1. Forecast.

Run an ensemble of models (ABMs) forward in time.

Calculate ensemble mean and variance

2. Analysis.

New 'real' data are available

Integrate these data with the model forecasts to create an estimate of the model parameter(s)

Impact of new observations depends on their accuracy

3. Repeat
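
The forecast / analysis cycle above can be sketched for a one-dimensional state. This is an illustrative stochastic EnKF (with perturbed observations), not the project's implementation: the random-walk `toy_step` model, the ensemble size and all noise levels are assumptions chosen for the example. Only the state is updated here; parameters can be estimated in the same way by appending them to the state vector.

```python
import random
import statistics

OBS_SD = 10.0  # assumed observation noise (standard deviation)


def toy_step(x, rng):
    """Hypothetical stand-in for one ABM step: a noisy random walk."""
    return x + rng.gauss(0.0, 5.0)


def forecast(ensemble, step_model, rng):
    """1. Forecast: run every ensemble member forward in time."""
    return [step_model(x, rng) for x in ensemble]


def analysis(ensemble, obs, obs_sd, rng):
    """2. Analysis: nudge each member towards a perturbed observation."""
    p = statistics.variance(ensemble)  # forecast (ensemble) variance
    r = obs_sd ** 2                    # observation variance
    if p + r == 0.0:
        return list(ensemble)          # nothing to update
    gain = p / (p + r)                 # accurate observations give a gain near 1
    return [x + gain * (obs + rng.gauss(0.0, obs_sd) - x) for x in ensemble]


rng = random.Random(42)
truth = 100.0
ensemble = [rng.gauss(80.0, 20.0) for _ in range(50)]
history = []
for _ in range(20):                    # 3. Repeat
    truth = toy_step(truth, rng)
    ensemble = forecast(ensemble, toy_step, rng)
    forecast_mean = statistics.mean(ensemble)
    obs = truth + rng.gauss(0.0, OBS_SD)
    ensemble = analysis(ensemble, obs, OBS_SD, rng)
    history.append((truth, forecast_mean, statistics.mean(ensemble)))
```

The gain is the ratio of forecast variance to total variance, so the impact of a new observation shrinks as its noise grows.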

Ensemble Kalman Filter - Basic Process

Diagram of DDA assimilating data

Experiment with an EnKF

Very simple ABM

People walking along a street

Every hour, x people begin at point A

CCTV Cameras at either end count footfall

Some people can leave before they reach the end (bleedout rate)

Aim: Estimate the number of people who will pass camera B

Diagram of the model environment
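
A model of this kind can be sketched in a few lines. The version below is a simplified assumption rather than the model used in the experiment: the street is a fixed number of steps long, each person leaves with probability `bleedout_rate` at every step, and everyone who reaches camera B is counted in the hour they set off.

```python
import random


def walk_reaches_end(bleedout_rate, street_length, rng):
    """Walk one person along the street; return True if they reach camera B."""
    for _ in range(street_length):
        if rng.random() < bleedout_rate:
            return False  # the person left the street early
    return True


def simulate_street(arrivals_per_hour, bleedout_rate, street_length, hours, rng):
    """Return hourly counts at camera A (start) and camera B (end)."""
    counts_a, counts_b = [], []
    for _ in range(hours):
        counts_a.append(arrivals_per_hour)
        reached = sum(
            walk_reaches_end(bleedout_rate, street_length, rng)
            for _ in range(arrivals_per_hour)
        )
        counts_b.append(reached)
    return counts_a, counts_b


# Illustrative parameter values only.
rng = random.Random(1)
counts_a, counts_b = simulate_street(
    arrivals_per_hour=100, bleedout_rate=0.02, street_length=10, hours=24, rng=rng
)
```

Under these assumptions the expected fraction of people reaching camera B is (1 - bleedout_rate) raised to the power street_length, so the counts at B carry information about the bleedout rate.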

Hypothetical 'Truth' Data

Use the model to first generate a hypothetical reality

Results - counts at camera A and B

(Preliminary) Experimental Results

Forecast and analysis are barely distinguishable

Virtual observations are closer to 'truth' than the analysis :-(

This is probably due to the degree of randomness in the model

EnKF estimates the model parameter (bleedout rate) accurately :-)

Simulating Urban Flows (surf)

Aim: Create an agent-based model capable of representing the human flows in a real city.

Calibrated using streaming data dynamically

Hoping for European Research Council funding to continue the work


Some work from the Simulating Urban Flows project

Machine learning approaches

Tracking people (HABITS)

Dynamically calibrated ABMs

Towards a real-time city simulation ...
