Abstract
City digital twins (CDTs), as digital replicas of urban systems and development processes, have been heralded as the next-generation technology for urban planning and management. Arguably, the concept of CDTs is not new. Prior to CDTs, applied urban modelling has played a pivotal role in supporting city and infrastructure planning since the 1960s. Examining CDTs in relation to conventional urban models can thus offer valuable insights into their nature, potential, and challenges. Such a comparative, reflective exercise, however, remains rare. This commentary aims to share insights and reflections from a dedicated applied urban modelling (AUM) community. It is argued that to substantiate the power of CDTs, a theory-driven modelling strategy is essential for both practical policy analysis and knowledge discovery. Modellers must think beyond the technical perspective and explore novel uses of CDTs beyond optimisation. A blind pursuit of data without building on and expanding existing domain knowledge remains an existential risk for CDTs.
Keywords
Introduction
City digital twins (CDTs), as digital replicas of urban systems and development processes, are being heralded with some reservations as the next-generation technology for urban planning and management. Arguably, the concept of CDTs is not new but has evolved alongside many other developments in urban modelling over many years (Wan et al., 2023: p. 5). Prior to CDTs, land-use and transport interaction (LUTI) models, introduced by pioneers such as Lowry (1964), Echenique et al. (1969), Batty (1976), and Wegener (1994), have been playing a crucial role in supporting city and infrastructure planning since the 1960s. The rise of microsimulation (Moeckel, 2017; Waddell, 2002; Zhu and Ferreira, 2014) and agent-based models (Heppenstall et al., 2011; Long and Zhang, 2015) has further broadened the scope of urban systems modelling. These advances have not only enhanced the capabilities of traditional models but also established applied urban modelling as a prominent research domain within the broader field of the science of cities (Batty, 2013). This continuous evolution reflects the growing sophistication and integration of various modelling techniques, paving the way for CDTs to become a vital tool in understanding and managing urban environments.
Examining CDTs in relation to conventional urban models can thus offer valuable insights into their nature, potential, and challenges. Such a comparative, reflective exercise, however, remains rare. The discourse around CDTs is often dominated by emerging publications that emphasise their revolutionary, albeit sometimes overstated and unsubstantiated, capabilities. Notably, this journal has been a leading academic platform for critical discussions of CDTs (Batty, 2018, 2019a, 2019b; Fotheringham, 2023; Malleson et al., 2024). Batty’s (2024) recent review highlighted several key conceptual and practical issues related to CDTs, offering critical and constructive perspectives grounded in the extensive history of urban modelling.
This commentary aims to further expand the CDT discussion by sharing insights and reflections from a dedicated applied urban modelling community, which has convened at the annual Applied Urban Modelling (AUM) symposia at the University of Cambridge since 2011. Key arguments presented in this paper are gathered and expanded from interviews and conversations with long-standing AUM scientific committee members and the convenors (Marcial Echenique, Michael Batty, Michael Wegener, and Ying Jin).
Role of theories in CDT development
CDTs are often framed as a generic, system/scale-agnostic technology underpinned by big data and machine learning (ML) techniques. Such system/scale-agnostic features are characterised by the use of generic statistical methods to generate predictions from a large number of predictors and parameters. This contrasts with more conventional urban modelling methodologies based on theories rooted in causal reasoning and domain knowledge. The question thus arises: ‘Would the rise of ML-based CDTs lead to a paradigm shift in urban modelling?’
Probably not. For conventional urban models, particularly those featured at AUM symposia with a long-standing track record (Anas and Liu, 2007; Batty, 2021; de La Barra et al., 1984; Echenique et al., 1969, 2013; Hunt and Abraham, 2005; Jin et al., 2013; Simmonds, 1999; Waddell, 2002; Wegener, 1982, 1994), theory-based model design and calibration are essential prerequisites for their application in practical policy analysis. This necessity arises because policy appraisal models must demonstrate proven external validity and generality, which require a clear and explicable causality. Identifying such causality involves a structural understanding of the intricate relationships among numerous factors and actors within urban systems. This understanding is often developed progressively, scrutinised, curated, and utilised by relevant stakeholders.
This domain-specific and collectively retained knowledge base is vital as it not only legitimises and lends credibility to these models as decision-support tools but also ensures the accountability of both modellers and model users through the regulation of data collection and use, the design and calibration of models, and the interpretation of model results. This rigorous approach helps maintain the integrity and reliability of models, making them trustworthy for informing policy decisions.
In contrast, models based solely on generic data fitting techniques tend to utilise data in a passive manner without a substantive theoretical framework for identifying possible biases and errors in training data. Deficiencies in the training data could propagate through the fitting process, particularly in the case of overfitting, leading to problematic predictions (Cawley and Talbot, 2010). Despite the outstanding internal validity of such models (based on validations using out-of-sample yet post hoc data), their ability to predict the effects of new shocks or interventions – highly context-sensitive events that may not be captured in any post hoc data – is often rather limited (Coveney et al., 2016). Even if such effects are partially captured in post hoc data, simplistic statistical extrapolations that do not explicate the underlying causal mechanisms remain inadequate for practical policy use.
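The gap between held-out accuracy and predictive power under a new intervention can be illustrated with a toy sketch. Everything in it is a hypothetical assumption made for illustration and is not drawn from any model discussed in this commentary: the exponential cost-demand relationship in `true_trips`, the synthetic data, the observed cost range of 1 to 3, and the shock at cost 8. A straight-line fit stands in for a theory-free data-fitting technique, and a log-linear fit stands in for a model whose functional form is supplied by theory.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def true_trips(cost):
    # Hypothetical structural relationship: trips decay exponentially with cost.
    return 100.0 * math.exp(-0.5 * cost)


random.seed(0)
# Observed costs only span 1.0 to 3.0; there is no record of a higher cost.
costs = [random.uniform(1.0, 3.0) for _ in range(200)]
trips = [true_trips(c) * math.exp(random.gauss(0.0, 0.02)) for c in costs]
train_x, train_y = costs[:150], trips[:150]
test_x, test_y = costs[150:], trips[150:]

# Theory-free fit: a straight line through the observed points.
a_lin, b_lin = fit_line(train_x, train_y)
# Theory-informed fit: assume the exponential form and estimate it in logs.
a_log, b_log = fit_line(train_x, [math.log(y) for y in train_y])


def predict_linear(cost):
    return a_lin + b_lin * cost


def predict_theory(cost):
    return math.exp(a_log + b_log * cost)


def mean_abs_error(predict, xs, ys):
    return sum(abs(predict(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Held-out points come from the same cost range as the training points.
holdout_error_linear = mean_abs_error(predict_linear, test_x, test_y)
holdout_error_theory = mean_abs_error(predict_theory, test_x, test_y)

shock_cost = 8.0  # a new intervention far outside the observed range
print(f"held-out MAE: linear={holdout_error_linear:.2f}, "
      f"theory={holdout_error_theory:.2f}")
print(f"at cost {shock_cost}: true={true_trips(shock_cost):.2f}, "
      f"linear={predict_linear(shock_cost):.2f}, "
      f"theory={predict_theory(shock_cost):.2f}")
```

In this sketch the straight line performs acceptably on held-out points drawn from the observed cost range, yet it implies a negative number of trips at the shock cost, whereas the fit that encodes the assumed functional form stays close to the true value. The example favours the theory-informed fit by construction, since the assumed form is the true one; its purpose is only to show why held-out validation on post hoc data says little about behaviour under unseen interventions.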
Batty (2024) criticised the dominant emphasis in current CDT applications on replicating the physical aspects of cities, neglecting the underlying political and socio-economic processes. This critique highlights the important role of theories in CDT development. While generic statistical methods are theoretically sound within the field of statistics, they do not constitute
To effectively incorporate socio-economic processes into CDTs, it is necessary to draw upon theories from a broad spectrum of disciplines, including economics, political science, sociology, and geography. The recent advancements in causal ML and explainable artificial intelligence (AI) offer promising new tools for tackling certain policy issues, notably short-term, local-scale traffic management, which may be difficult to address using traditional modelling techniques. However, the effectiveness of ML in these use cases depends significantly on the nature of the policy problem at hand. ML approaches are most useful when the policy problem and solution space resemble a well-definable, closed engineering system. Unfortunately, most societal challenges do not fit this description. Consequently, existing theories and domain-specific knowledge are expected to remain vital. They serve as essential benchmarks for validating these novel ML methods and guiding their practical applications. This theoretical foundation ensures that CDTs are not only technologically advanced but also politically and ethically applicable.
CDTs should inform data collection
Models are only as good as the data they seek to explain. On the one hand, existing measurements of various urban features determine what we can predict and validate. On the other hand, these measurements reflect our evolving yet limited understanding of the structural characteristics of the underlying system. In applied urban modelling, the calibration-validation loop is crucial not only for ensuring model validity but also for refining underlying theories and discovering new knowledge. This process reveals gaps and limitations in existing measurements, devises modelling techniques to mitigate such data limitations, and informs future data collection, which in turn enables better model application. This creates a positive feedback loop between the data and the model. Such a progressive, iterative process of knowledge discovery is a unique strength of all theory-driven approaches.
The CDT’s seemingly inherent preference for real-time, yet often unstructured, sensing data represents a notable deviation from the traditions in applied urban modelling. CDT algorithms that utilise real-time data tend to focus on identifying patterns (feature engineering) and/or predicting future trends, rather than evaluating the data against a theoretical benchmark. Model performance is typically assessed using generic statistical metrics, rather than a substantive check of the underlying causal mechanisms.
DT developers from the field of engineering might argue that increasing the coverage, frequency, and accuracy of instrumented data could help to unpack complex, closed engineering systems. However, existing big data sources for cities, despite being unprecedented in their volume and velocity, may still be ‘tiny’ relative to the sheer complexity of urban systems and processes (Caldarelli et al., 2023; Coveney et al., 2016). Further expanding the scope of data collection into personal domains is subject to significant privacy and ethical concerns. The effectiveness of a purely data-driven approach in CDT development might thus be overstated. Addressing data limitations for CDT development requires more than merely collecting more data; it necessitates progressively improving our understanding of the underlying mechanisms. Given the substantial knowledge gap in understanding how cities function (Batty, 2024) and rising concerns about data privacy and AI ethics, it is questionable to assume that CDTs can and should pursue the same level of analytical and automation capability seen in engineering and manufacturing sectors, where theory and models are largely focussed on closed, physical systems. Enhancing a positive feedback loop between data and CDT models, particularly by informing data collection through theory-driven model calibration and validation, seems a promising way to advance CDT models.
Addressing participation and experimentation in CDT models
Forecasting has been the primary use of conventional urban models, often through ‘what-if’ scenario analysis rather than producing single predictions of the future. These models have also been employed to generate optimal solutions based on user-defined objective functions and constraints. However, addressing societal problems such as the housing crisis and climate change requires more than just predictions and optimisations as these issues are inherently ‘wicked’ problems that tend to become more severe as they are probed (Rittel and Webber, 1973).
In the particular context of urban policy modelling, this means that developing a consensual policy problem definition and identifying a definitive solution space are virtually impossible. These problems are inherently politicised, and their resolution often lies outside the domain of technical modelling efforts. Moreover, politicised views (e.g. the role of public housing in addressing the housing crisis) and assumptions (e.g. the future distribution of growth between locations and population groups) are frequently embedded in the models designed to tackle these issues. Consequently, the ‘optimal’ solutions produced by such models are influenced by these views and assumptions. The issue is further exacerbated by the fact that the conditions under which model results are interpreted are often not made explicit to the public. Thus, pursuing a singular and apolitical optimisation through modelling is an inherently flawed approach to resolving societal challenges.
Reflections from the AUM community suggest that the resolve, intelligence, and actions required to implement policy changes have always originated from an informed public-at-large. A model itself is not an agent. Instead, active and innovative stakeholder engagement has been a hallmark of effective modelling projects in the AUM community. For instance, a recent modelling study for Greater Cambridge played a crucial role in connecting and coordinating distinct and often conflicting local development aspirations among a wide range of stakeholders. This was accomplished by identifying shared interests, risks, visions, and priorities through quantified impact analysis (CPIER, 2018; Nochta et al., 2021). In many ways, the Greater Cambridge model did not aim to provide specific policy solutions but served as a vehicle for expressing and mediating human agency. This approach has facilitated a collaborative environment where diverse perspectives could be integrated into a coherent policy framework, highlighting the model’s role as a tool for dialogue and consensus-building rather than a provider of definitive answers.
The implication for CDTs is that the search for ‘big theories’ (Coveney et al., 2016) should extend beyond the technical domain to include theories of knowledge, knowledge management, and participation in city planning and management (Wilson et al., 2019). Stakeholders and actors should not be viewed merely as agents to be consulted and represented in the modelling process, nor as subjects to be managed or optimised through automated control, as some CDT suppliers suggest. Instead, CDT development should prioritise meaningful engagement and participation, which can reshape stakeholders’ perceptions and inform their decisions on policy options (Dembski et al., 2020; Thompson, 2023).
Compared with conventional stakeholder engagement methods, CDT models seem to offer a unique advantage by enabling an experimental approach in policymaking. This potential is particularly relevant for exploring policy interventions that are considered radical and controversial but are essential for driving key institutional and behavioural changes, such as pedestrianisation and emissions and congestion charges. For these initiatives, ex ante prediction of behavioural responses and of the wider impacts on distinct stakeholder groups is often technically demanding, politically unpalatable, and financially unviable.
CDTs can facilitate a more interactive, participatory approach through relatively frequent ‘trial and error’ processes. For example, in closing a school street to improve local air quality, it is critical for policymakers to assess the feasibility and magnitude of mode shift, as well as the distributional effects across population groups and locations. The behavioural elasticities underlying such responses are typically highly context-sensitive and variable, yet they are required as key inputs in conventional model-based impact assessments. A CDT can, in theory, estimate such elasticities at an appropriate spatio-temporal granularity using real-time data and support the fine-tuning of interventions in an experimental manner. Many cities, having invested in digital infrastructure and upskilling, particularly since the COVID-19 pandemic, may now possess the technical capabilities needed for such experiments. This digital readiness enhances the potential for CDTs to support innovative policy solutions by enabling dynamic and context-specific adjustments based on real-time feedback.
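As a minimal sketch of what estimating such an elasticity from trial data might involve, the snippet below computes a midpoint (arc) elasticity of car trips with respect to generalised journey cost, separately for two population groups. The group names, trip counts, and cost values are hypothetical numbers invented for illustration, and the midpoint formula is one common convention rather than a method prescribed in this commentary.

```python
def arc_elasticity(q_before, q_after, c_before, c_after):
    """Midpoint (arc) elasticity of quantity q with respect to cost c."""
    pct_q = (q_after - q_before) / ((q_after + q_before) / 2.0)
    pct_c = (c_after - c_before) / ((c_after + c_before) / 2.0)
    if pct_c == 0:
        raise ValueError("cost did not change; elasticity is undefined")
    return pct_q / pct_c


# Hypothetical before/after observations for a school street trial:
# daily car trips to the school and mean generalised car journey cost
# (in minutes), recorded separately for two population groups.
observations = {
    "households_within_1km": {"trips": (120, 90), "cost": (10.0, 14.0)},
    "households_beyond_1km": {"trips": (80, 76), "cost": (10.0, 14.0)},
}

elasticities = {
    group: arc_elasticity(*obs["trips"], *obs["cost"])
    for group, obs in observations.items()
}

for group, value in elasticities.items():
    print(f"{group}: {value:.3f}")
```

A single before/after comparison of this kind does not by itself isolate the causal effect of the intervention from other concurrent changes, which is one reason why repeated trials and a theoretical benchmark remain important when interpreting such estimates.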
Concluding remarks
Applied urban models have played a pivotal role in supporting city planning and development. As a culmination of past modelling aspirations for cities, CDTs can and should draw insights from conventional urban modelling efforts. This perspective from the AUM community re-emphasises the importance of a theory-driven modelling strategy for practical policy analysis and knowledge discovery. Purposeful theory building and testing through CDTs can inform data collection, thereby improving the fidelity and value of CDT models.
CDTs offer a unique advantage in facilitating participation and experimentation in city planning and management. They should enhance, rather than replace, human agency. A blind pursuit of data, without building on and expanding existing domain knowledge, poses an existential risk to CDT development.
Footnotes
Acknowledgement
Professor Michael Wegener, a pioneer of urban modelling and our co-author, was involved in this Commentary but passed away before its publication. We would like to dedicate this work to his memory.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
