An understanding of how people move around cities is vital for building reliable estimates of the population at risk from phenomena that vary in space and time. For example, to reliably assess the disease burden of air pollution or the risk of crime victimisation, it is important to understand when and where the activities of individuals intersect with areas of potential harm. However, modelling dynamic populations is extremely difficult because most well-established data sources contain very little information about non-residential activities. Fortunately, emerging 'big' data sources, such as those arising from the use of mobile telephones, social media, or loyalty cards, hold the promise of providing more reliable information about non-residential daily activities. The challenge, therefore, is to create a high-resolution model of urban flows that can take advantage of good-quality residential and non-residential data sources.
This presentation will discuss an ongoing research project called Simulating Urban Flows (surf) that uses agent-based modelling to bring together varied data sources and create an accurate, high-resolution model of individual-level daily urban flows. The aim is a comprehensive model that can simulate the movements of individual synthetic people and subsequently build up accurate, dynamic footfall estimates. These estimates will be compared to air pollution hotspots and to crime patterns to better understand how mobile populations are affected by these problems.
Ambient populations - why they are important
Implications for crime and pollution
How to measure ambient populations
A better way (?) to model the ambient population: agent-based modelling
Data assimilation for ABM
Conclusions
Education
BSc: Computer Science
MSc: GeoInformatics
PhD: (Computational) Human Geography
Now: Associate Professor in Geographical Information Science
Agent-Based Modelling
Spatial analysis and simulation
Crime patterns / environmental criminology
Big data (particularly for social simulation)
3-year research fellowship, funded by the ESRC (UK)
Build an agent-based simulation of daily urban dynamics
Calibrated using a combination of traditional sources (e.g. census) with dynamic, crowd-sourced data
Explosion in data volume.
'Datafication' of hitherto private thoughts/actions.
Transformational impact on social sciences.
Smart cities & a new generation of models to understand cities
A greater role for (e.g.) ABM
Surprisingly poor data to quantify mobile populations
Difficulties in designing policy
E.g. urban renewal / regeneration
Intellectual interest
E.g. Equality, accessibility, mobility
What is the most appropriate denominator for crime rate calculations?
Residential population is the most common
But not always appropriate
Daily flows of people significantly impact crime
Difficult to quantify hotspot severity without good population at risk estimates
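To illustrate why the choice of denominator matters, the following is a minimal sketch that computes a crime rate per 1,000 people using either the residential or the ambient population. The area names and all counts are hypothetical values invented for illustration; they are not data from the project.

```python
def rate_per_1000(crimes, population):
    """Crime rate per 1,000 people for a given population-at-risk."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * crimes / population


# Hypothetical areas: a busy centre with few residents, and a quiet suburb.
areas = {
    "city_centre": {"crimes": 120, "residents": 2000, "ambient": 30000},
    "suburb": {"crimes": 40, "residents": 8000, "ambient": 6000},
}

# Compute both versions of the rate for every area.
rates = {
    name: {
        "residential": rate_per_1000(a["crimes"], a["residents"]),
        "ambient": rate_per_1000(a["crimes"], a["ambient"]),
    }
    for name, a in areas.items()
}

for name, r in rates.items():
    print(f"{name}: residential={r['residential']:.1f}, ambient={r['ambient']:.1f}")
```

With these made-up numbers the city centre looks far riskier than the suburb under the residential denominator, but the ordering reverses once daytime visitors are counted as part of the population at risk.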
Largest cause of preventable deaths (WHO)
Improved models of pollution generation and dispersal
Including individual polluters (Nyhan et al., 2016)
But still relatively weak estimates of the population-at-risk
Travel diaries (& GPS tracking)
Calculate personal exposure (Yoo et al., 2015)
Travel surveys
Coupled activity model and dispersion model (Beckx et al., 2009; Dhondt et al., 2012; Setton et al., 2011)
Bulk mobile phone activity
. . .
Improved pollution models
Nyhan et al. (2016): large taxi data set to estimate street segment pollution (highly detailed: factored in acceleration!)
Consistent finding: Residential models underestimate exposure
Opportunity to use simulation to:
Combine data sources
Scale-up smaller surveys
The ambient population is important
But how to quantify it?
And how to better understand urban flows?
How to quantify the ambient population and urban flows?
Large population coverage
Private, unknown methodology, privacy concerns, coarse resolution (?)
How to quantify the ambient population and urban flows?
Smart-phone apps that capture movement / location are becoming ubiquitous
Great potential for understanding (some) urban dynamics
Skewness
Prolific users distort patterns
Representation
Online & public ≠ offline & private
Spatial accuracy
Bias
Participation inequality and the digital divide
Complicated!!
Messy, and "too big for Excel"
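As a rough illustration of the skewness problem, the sketch below compares footfall estimated from raw message counts with footfall estimated from unique users per location. The `posts` list, the user identifiers, and the location names are all hypothetical; counting unique users is only one simple way to dampen the influence of prolific users, not necessarily the approach used in this project.

```python
from collections import Counter


def footfall_by_posts(posts):
    """Count every message, so prolific users dominate."""
    return Counter(location for _user, location in posts)


def footfall_by_unique_users(posts):
    """Count each (user, location) pair once, whatever the message volume."""
    return Counter(location for _user, location in set(posts))


# Hypothetical data: one user posts 50 times from the park,
# while three different users each post once from the station.
posts = [("u1", "park")] * 50 + [
    ("u2", "station"),
    ("u3", "station"),
    ("u4", "station"),
]

raw = footfall_by_posts(posts)
unique = footfall_by_unique_users(posts)
print("raw:", dict(raw))
print("unique users:", dict(unique))
```

In this toy case the raw counts suggest the park is much busier than the station, while the unique-user counts suggest the opposite.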
Volume
Potential for large sub-samples
Velocity
Streaming / regularly updated
Potential for dynamic models
Model the individual components that drive system behaviour directly
Autonomous, interacting 'agents'
Can model emergence, non-linearity, and other features of complex systems
Basic execution process
t=0 (initialisation) : create a population of agents and their environment
Each agent has variables that represent their state and rules to control their behaviour
t+1 : Each agent executes its behavioural rules and updates its state
This can involve moving, interacting with other agents, performing an action etc.
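The execution process above can be sketched in a few lines of Python. This is a deliberately minimal example under stated assumptions: the `Agent` class, the random-walk rule, the grid environment, and all parameter values are hypothetical and stand in for the much richer behavioural rules a real urban model would need.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    """State of one synthetic person: a position on a grid."""
    x: int
    y: int

    def step(self, width, height, rng):
        """Behavioural rule (assumed): stay put or move to a neighbouring cell."""
        dx, dy = rng.choice([(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)])
        self.x = (self.x + dx) % width   # wrap around the grid edges
        self.y = (self.y + dy) % height


def run_model(n_agents=50, width=10, height=10, n_steps=20, seed=0):
    rng = random.Random(seed)
    # t=0: create the population of agents and their environment.
    agents = [Agent(rng.randrange(width), rng.randrange(height))
              for _ in range(n_agents)]
    footfall = [[0] * width for _ in range(height)]
    # t+1: each agent executes its rule and updates its state.
    for _ in range(n_steps):
        rng.shuffle(agents)  # random activation order
        for agent in agents:
            agent.step(width, height, rng)
            footfall[agent.y][agent.x] += 1
    return agents, footfall


agents, footfall = run_model()
total_visits = sum(sum(row) for row in footfall)
print("total cell visits:", total_visits)
```

The `footfall` grid accumulates one count per agent per time step, so it acts as a simple ambient-population estimate for each cell over the simulated period.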
Birks et al. (2012)
Randomly generated abstract environments
Theoretical 'switches'
Rational choice perspective
Routine activity theory
Geometric theory of crime
Validation against stylized facts:
Spatial crime concentration
Repeat victimisation
Journey to crime curve
Modelling complexity, non-linearity, emergence
Natural description of a system
Bridge between verbal theories and mathematical models
Produces a history of the evolution of the system
Stochasticity
Computationally expensive (not amenable to optimisation)
Complicated agent decisions
Lots of decisions!
Multiple model runs (robustness)
Modelling "soft" human factors
Need detailed, high-resolution, individual-level data
3-year research fellowship, funded by ESRC (UK)
Build an agent-based simulation of daily urban dynamics
Calibrated using a combination of traditional sources (e.g. census) with dynamic, crowd-sourced data
New insights into urban mobility patterns and footfall estimates.
Simulate daily urban dynamics
Shopping, commuting, education, etc.
Better understand daily urban mobility patterns
(Exploring the use of Improbable's SpatialOS)
http://surf.leeds.ac.uk/announce/2015/12/10/ImprobableSim.html
Incorporate data into models dynamically (cf. meteorology models)
Preliminary example using an Ensemble Kalman Filter (EnKF)
Novel for ABM
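For readers unfamiliar with the technique, the following is a minimal sketch of the analysis (update) step of a stochastic Ensemble Kalman Filter with perturbed observations, written with NumPy. It is not the project's implementation: the function name `enkf_update`, the one-dimensional "footfall at a single sensor" state, and all numbers are assumptions chosen purely for illustration. In an ABM setting each ensemble member would be the state of one model run.

```python
import numpy as np


def enkf_update(ensemble, observation, H, obs_cov, rng):
    """One stochastic EnKF analysis step.

    ensemble:    (n_state, n_members) array, one column per model run
    observation: (n_obs,) observed values
    H:           (n_obs, n_state) observation operator
    obs_cov:     (n_obs, n_obs) observation error covariance
    """
    n_members = ensemble.shape[1]
    mean = ensemble.mean(axis=1, keepdims=True)
    anomalies = ensemble - mean
    # Sample covariance of the forecast ensemble.
    P = anomalies @ anomalies.T / (n_members - 1)
    # Kalman gain.
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + obs_cov)
    # Perturb the observation independently for each ensemble member.
    perturbed = rng.multivariate_normal(observation, obs_cov, size=n_members).T
    # Nudge each member towards its perturbed observation.
    return ensemble + K @ (perturbed - H @ ensemble)


rng = np.random.default_rng(42)
# Hypothetical prior: 200 model runs predict footfall of roughly 100 people.
prior = rng.normal(100.0, 20.0, size=(1, 200))
H = np.array([[1.0]])             # the sensor observes the state directly
R = np.array([[1.0]])             # assumed observation error variance
observed = np.array([150.0])      # hypothetical sensor count

posterior = enkf_update(prior, observed, H, R, rng)
print("prior mean:", prior.mean(), "posterior mean:", posterior.mean())
```

Because the assumed observation error is small relative to the ensemble spread, the updated ensemble moves most of the way towards the observed count and its spread shrinks.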
Leeds Institute for Data Analytics (LIDA)
Consumer Data Research Centre (CDRC)
Multi-million £ investments from Leeds and UK research councils
Collaborative space for big data analytics
Attract expertise from medicine/health, computer science, geography, mathematics, business ...