Abstract
Contemporary urban environments are increasingly shaped by intertwined layers of infrastructure, data, and regulation that produce spatial patterns difficult to interpret through single-modality analyses or purely predictive models. While recent AI-driven urban studies have advanced large-scale measurement and forecasting, they often struggle to (i) integrate heterogeneous modalities such as imagery, geospatial networks, archives, and planning documents, (ii) relate spatial change to regulatory and institutional timelines, and (iii) generate traceable outputs that support architectural and planning reasoning. This study addresses these limitations by proposing an interpretable, multimodal framework for urban pattern interpretation.
The paper introduces City Decoder, a spatiotemporal pattern-recognition framework designed to decode urban environments through evidence-linked interpretation rather than prediction or optimisation. Methodologically, the research conducts a structured comparative review of representative trajectories from major global city laboratories—including Beijing City Lab, MIT Senseable City Lab, ETH Future Cities Laboratory, and Stanford Urban Informatics Lab—coding their dominant data modalities, analytical tasks, and output types. Based on this synthesis, City Decoder is specified as an operational pipeline that aligns time-series urban imagery, vector geospatial data, and regulatory texts within a shared spatial–temporal reference. The framework performs recurrence detection, cross-layer coupling analysis, and discontinuity identification between policy intent and material outcomes.
The framework is designed to produce a Pattern Atlas, relational mappings, and diagnostic reports intended to support architectural and urban analysis, policy evaluation, and scenario-based design inquiry. The primary contribution is a transferable methodological specification for interpretive, traceable pattern recognition in urban research—one that repositions multimodal AI from a predictive tool toward an instrument of spatial understanding.
Introduction
Recent advances in artificial intelligence have significantly transformed the ways urban environments can be measured, classified, and modelled at scale. In contemporary urban research, machine learning techniques are increasingly employed to extract patterns from satellite and street imagery, detect land-use change, analyse mobility networks, and support large-scale environmental monitoring. While these approaches have produced valuable insights, they are largely oriented toward prediction, optimisation, or automated classification. As a result, they often provide limited support for interpretive urban reasoning—that is, the ability to relate observed spatial patterns to the infrastructural, regulatory, and institutional conditions that produce and transform urban space.
This limitation becomes particularly evident in complex urban territories shaped by multiple overlapping systems. Architectural and urban design discourse has traditionally approached the city through distinct analytical lenses, such as morphology, infrastructure, governance, and environment—each examined as a separate layer. Yet many of the most consequential urban transformations occur precisely where these layers intersect: logistical corridors, peri-urban thresholds, infrastructural landscapes, and residual urban fragments. These in-between territories are often defined by unstable relations between regulatory frameworks, ecological processes, and everyday practices, producing spatial conditions that resist single-form or single-logic explanations. Engaging with such environments requires methodological approaches that operate across multiple spatial scales, temporalities, and heterogeneous data modalities.
This paper addresses that need by proposing City Decoder, an interpretable hybrid AI–geospatial framework for multimodal spatiotemporal pattern recognition. Rather than treating the city as a system to be optimised or forecast, the City Decoder is conceived as a framework for evidence-linked decoding. It aims to make spatial patterns readable by relating them to the infrastructural relations, regulatory sequences, and spatial behaviours that co-produce them. In this sense, the framework is intended to support architectural and planning reasoning by generating traceable and interpretable outputs, rather than opaque predictions.
The proposal builds on the premise that contemporary urban environments increasingly operate through forms of “code.” Historically, code referred to codified laws and regulations that inscribed authority into spatial form. In contemporary urban conditions, code also includes software-mediated infrastructures and data-driven governance systems that shape circulation, access, visibility, and control. Across urban theory and science-and-technology studies, scholars have argued that code is not an external layer applied to space but a material force in spatial production. Within architectural discourse, related traditions have approached the city as a legible system, a text, or a patterned language—suggesting that urban coherence emerges through recurring relationships rather than isolated forms. City Decoder draws from these conceptual positions but shifts the emphasis from metaphor to method: it treats “code” as a relational condition to be decoded through multiple forms of evidence distributed across space and time.
Methodologically, the paper proceeds from the observation that major global city laboratories have developed powerful techniques for extracting urban patterns, yet these approaches frequently remain modality-specific or output-oriented, focusing on prediction, simulation, or visualisation without consistently linking spatial patterns to regulatory and institutional sequences. To clarify this landscape and position the proposed framework, the study conducts a structured comparative review of representative research trajectories from major global city laboratories. Each case is coded according to its dominant data modalities, analytical tasks, and output types. This synthesis reveals a methodological gap: while many approaches excel at measurement and prediction, fewer provide tools for cross-modal alignment, temporal coupling between spatial change and governance, and traceable interpretive outputs.
Against this background, the paper introduces City Decoder as an operational framework. It is specified as a pipeline that aligns heterogeneous datasets—such as time-series urban imagery, vector geospatial networks, and regulatory texts—within a shared spatial–temporal reference. Through this alignment, the framework performs three core analytical operations: (i) recurrence detection across temporal layers, (ii) cross-layer coupling analysis between spatial, infrastructural, and regulatory domains, and (iii) discontinuity identification between policy intent and material outcomes. The framework is designed to produce a Pattern Atlas, relational mappings, and diagnostic reports intended to support architectural analysis, policy evaluation, and scenario-based design inquiry.
This leads to the paper’s central research question: How can hybrid AI–geospatial models assist in decoding coded urban environments by revealing relational patterns across heterogeneous spatiotemporal data in a form that remains interpretable and evidence-linked for architectural and planning reasoning?
The paper makes three contributions. First, it proposes a conceptual reorientation of urban AI from prediction toward interpretive, traceable pattern recognition, grounded in a comparative synthesis of global city-laboratory methodologies. Second, it specifies City Decoder as a transferable framework, defining its input modalities, three analytical operations, and diagnostic output types. Third, it demonstrates the framework’s analytical logic through a structured illustrative application to the Kartal district, Istanbul, showing how cross-layer coupling and discontinuity detection surface relational patterns that no single-modality analysis could produce.
The paper is organised into five sections. Following this introduction, the second section reviews key theoretical debates on code, urban pattern reasoning, and interpretive approaches to spatial analysis. The third section presents the comparative analysis of global city laboratories and extracts a structured typology of their methodological trajectories. The fourth section specifies the City Decoder framework and its analytical pipeline, and includes an illustrative application of the framework’s three analytical operations to the Kartal district, Istanbul, demonstrating the kind of relational patterns and diagnostic outputs the framework produces when applied to real data. The final section concludes by summarising the contributions and discussing implications for urban research and practice. Figure 1 presents an overview of the City Decoder framework.

Figure 1. City Decoder framework.
Literature review
Research on the city as a dynamic and relational system has deep conceptual roots at the intersection of architecture, urban studies, and information technologies. While the recent rise of artificial intelligence and large-scale urban data has intensified this research direction, its theoretical foundations predate contemporary computational approaches. This literature review situates the present study within three interconnected trajectories: (i) the conceptualisation of urban space as a coded system, (ii) the development of computational and data-driven urban analysis, and (iii) pattern-oriented approaches in architectural and urban theory. Together, these strands form the epistemological and methodological basis for a decoding-oriented framework.
Code as a spatial and relational system
The conceptual foundation of this study rests on the notion of code as both metaphor and governing mechanism, an idea that has evolved across cryptography, law, linguistics, and spatial theory. In its technical and historical sense, code refers to a cryptosystem operating on linguistic units, transforming meaning through processes of encryption and translation. Etymologically derived from the Latin codex, meaning a bound collection of laws, the term historically referred to the inscription of authority into material form.
This legal–material genealogy is extended by Mitchell (1995), who famously asserted that “code is law,” arguing that computational systems reorganise social and spatial orders through executable scripts. Hillier and Hanson (1984) similarly describe code as an underlying system of rules that structures spatiotemporal events, suggesting that such rules precede and organise spatial interactions. Kitchin and Dodge (2011) further demonstrate that software, though often invisible, exerts material agency over spatial production by shaping infrastructures, services, and everyday routines. Collectively, these interpretations position code as an operative structure—an invisible architecture that regulates how both information and space are produced and experienced.
Bernstein (1981) extends this logic by defining code as a regulative principle that selects and integrates meanings and their realisations. In this view, code mediates between contexts and the relationships within them, governing both what meanings can coexist and how they are expressed. This relational interpretation situates code as a system of translation linking the social, the semantic, and the spatial.
Within spatial theory, Lefebvre advanced this position by describing the city as a coded yet unfinished text. Rejecting structuralist conceptions of the city as a closed signifying system, he argued that “the city cannot therefore be conceived as a signifying system, determined and closed as a system” (Lefebvre, 1968 [2003]). Instead, urban space is continuously rewritten through social practice: “the city writes itself on its walls and in its streets, but that writing is never completed” (Lefebvre, 1970, 1974 [2003]). In this sense, the city becomes a code in constant revision—a living script that records, erases, and rewrites itself over time.
From spatial code to computational urban analysis
The idea of spatial lawfulness and systemic order was later formalised through analytical and computational approaches. Alexander et al. (1987) argued that cities evolve through interacting layers of spatial rules, emphasising the need to understand the laws that produce urban wholeness. Hillier and Hanson (1984) translated similar principles into a formal methodology through space syntax, which treats urban space as a language whose grammar can be analysed through patterns of connection and integration. By relating spatial configurations to social variables, space syntax introduced one of the earliest computer-aided analytical approaches to urban form.
Building on these developments, urban computing emerged as a data-driven paradigm for analysing cities. Zheng et al. (2014) define urban computing as the acquisition, integration, and analysis of large, heterogeneous datasets generated by multiple urban sources and agents. These approaches successfully merge computation and urban analysis but often frame the city primarily as a measurable and optimisable system, rather than as an interpretive or relational field (Batty, 2013).
The notion of the dynamic city gained further traction through the work of Batty (1994), who used fractal geometry to describe urban form as a real, measurable structure rather than an abstract ideal. Later, Batty (2013) articulated the concept of the computable city, arguing that urban understanding emerges from the relations between places rather than from the intrinsic attributes of individual objects. This relational perspective forms a key foundation of the emerging science of cities (Batty, 2012). However, much of this work remains oriented toward simulation, prediction, and formal modelling, leaving open the question of how computational methods might support interpretive reasoning about urban transformation.
Pattern-based reasoning in architectural theory
Parallel to these computational developments, architectural theory has long approached the city as a coded and patterned environment. Lynch (1960) introduced the concept of urban legibility as a cognitive code through which inhabitants read spatial order. Van Eyck, in the Team 10 Primer (Smithson, 1968), described code as a relational syntax mediating time, movement, and experience. Rossi (1982) conceptualised the urban locus as a coded memory of collective identity. Tschumi (1996) further identified new codes of assemblage emerging from sequences of events, programs, and movements within the contemporary metropolis.
Among these contributions, Christopher Alexander’s work remains central to the idea of pattern-based spatial reasoning. His concept of patterns, understood as recurring social–spatial relationships forming a language of form and structure, provided a systemic approach to design. In A Pattern Language (Alexander et al., 1977), he defined each pattern as a recurring problem accompanied by a solution adaptable to different contexts. However, Alexander later argued that patterns alone could not generate living environments; instead, he proposed a morphogenetic understanding of urban formation that links design processes to evolving spatial structures (Alexander et al., 1987). His question—what kinds of laws, and at what levels, are required to produce urban wholeness—remains central to contemporary systemic approaches to design.
Watanabe (2002) extended these ideas by advocating bottom-up decision-making processes that define relationships between parts while maintaining overall coherence. This perspective aligns with complexity science, which reveals the simple rules underlying apparently chaotic systems. Within this framework, the theories of Alexander’s pattern language, Hillier’s space syntax (Hillier, 1989), and Batty’s computable city can be understood as related attempts to interpret the city as a self-organising and evolving system.
Toward a decoding-oriented framework
Across these trajectories, a shared conceptual thread emerges: the city is repeatedly described as a system structured by rules, relations, and recurring patterns. However, existing computational approaches often emphasise measurement, prediction, or simulation, while architectural theory has traditionally focused on conceptual or formal interpretations of spatial codes. What remains underdeveloped is a methodological framework capable of decoding urban patterns across heterogeneous datasets, linking spatial configurations to regulatory, infrastructural, and institutional processes over time.
Artificial intelligence research defines its central task as recognising patterns within large volumes of data and making them visible or actionable (Batty, 2024a, 2024b). Yet, within urban research, this pattern-recognition capacity is frequently directed toward forecasting or optimisation rather than interpretive understanding. The present study responds to this gap by proposing a decoding-oriented approach: a framework that uses multimodal pattern recognition not to predict future states, but to reveal the relational structures and temporal dynamics underlying urban transformation.
The following section examines contemporary global city laboratories and their methodological trajectories, providing a comparative basis for the specification of the City Decoder framework.
Comparative analysis of global city laboratories
Despite decades of technological progress, artificial intelligence research has largely evolved within linguistic and symbolic domains rather than spatial ones. As Zhao et al. (2025) note, language models have advanced from simple text generators to complex systems capable of reasoning and problem-solving, yet they remain grounded in textual data and generative processes. This dominance of language-based systems has shaped analytical cultures that privilege semantic patterning over spatial interpretation.
Batty (2024a, 2024b) observes that most neural architectures are trained to reproduce patterns within data rather than to explain or construct them, generating outputs that mimic observed conditions without exposing their underlying logic. He later distinguishes between discriminative models, which read existing conditions, and generative models, which create new possibilities under uncertainty (Batty, 2025). Yet both paradigms often operate within isolated modalities, focusing on singular data streams rather than the overlapping layers that characterise urban environments. Cities, however, are constituted through interdependent infrastructural, regulatory, ecological, and behavioural systems that cannot be reduced to a single analytical logic.
To address this gap, several research laboratories have developed data-driven approaches that integrate computation with spatial reasoning. The field of urban computing (Zheng et al., 2014) emerged precisely to bridge computer science and urban studies. More recent initiatives, such as the MIT Senseable City Lab, Beijing City Lab, ETH Future Cities Laboratory, and the Stanford Urban Informatics Lab, have extended this agenda through multimodal urban analytics. As Zou et al. (2025) emphasise, no single dataset can capture the full complexity of territorial systems; meaningful insights require the integration of spatial, environmental, textual, and behavioural data. Yet as Long and Zhang (2024) observe, many of these approaches remain oriented toward prediction, optimisation, or simulation, with limited emphasis on interpretive reasoning.
Selection logic and comparative framework
The following laboratories were selected as representative case studies based on three criteria: (i) institutional influence within computational urban research, (ii) methodological diversity across data modalities and analytical tasks, and (iii) demonstrated application of AI or data-driven methods to spatial and urban questions.
Each laboratory was examined through a structured review of key publications, coding their dominant data modalities, analytical techniques, and primary output types. This comparative approach provides a basis for identifying methodological patterns and gaps across contemporary urban AI research.
Beijing City Lab: Policy-embedded spatial modelling
Research at the Beijing City Lab demonstrates a strong connection between spatial analysis and planning frameworks. Early work by Long et al. (2012) introduced constrained cellular automata and regional sensitivity analysis to translate planning parameters into spatial outcomes. This approach revealed how planning intentions can operate as testable spatial codes.
Lang et al. (2018) expanded this perspective by combining spatial entropy, dissimilarity indices, and structural equation modelling, framing the city as a dynamic equilibrium between order and variation. Subsequent research incorporated deep learning and street-view imagery to operationalise visual cognition (Chen et al., 2023; Ma et al., 2019, 2025), transforming perceptual qualities into computationally readable datasets.
Parallel studies used segmentation and classification to map vacant and abandoned structures, revealing cycles of urban growth and decay (Li et al., 2023; Mao et al., 2022). More recent work integrates environmental sensing with behavioural data to analyse adaptive urban dynamics, including soundscapes and post-pandemic public spaces (Li et al., 2024; Zhang et al., 2023). Across these studies, the laboratory progressively moves from predictive modelling toward multimodal pattern interpretation.
ETH Future Cities Laboratory: Integrated urban intelligence
The ETH Future Cities Laboratory follows a parallel but distinct trajectory, moving from computational visualisation toward integrated urban intelligence. Zeng et al. (2016) employed visual analytics to translate complex transportation data into interpretable visual structures. Sun and Axhausen (2016) advanced this work through probabilistic tensor factorisation, uncovering hidden mobility patterns across large datasets.
Later research integrated generative and evolutionary algorithms into architectural design processes, allowing AI to participate in design decision-making (Koenig et al., n.d., 2020). Chadzynski et al. (2023) synthesised these directions through a dynamic geospatial knowledge graph, combining semantic and spatial reasoning within a continuously updated urban model. Collectively, ETH’s work demonstrates how analytical modelling, visualisation, and design-oriented computation can operate within multi-scalar urban systems.
MIT Senseable City Lab: Multimodal perceptual urbanism
The MIT Senseable City Lab has long pioneered perceptual and multimodal approaches to computational urbanism. Kan et al. (2020) coupled space-time GIS with vehicular modelling to transform emissions and traffic data into interpretable spatiotemporal structures.
More recent research explores human–AI collaboration in planning contexts. Zheng et al. (2025) proposed frameworks in which large language models interact with visual and behavioural data, enabling shared decision-making processes between human designers and computational systems. Guo et al. (2025) further extended this line of inquiry by applying AI-driven image clustering to measure the visual identity of urban spaces, effectively translating perceptual qualities into quantifiable datasets. Together, these works conceptualise the city as a layered informational environment, where environmental, behavioural, and perceptual data require different forms of analysis and interpretation.
Stanford Urban Informatics Lab: Transparent energy modelling
At the Stanford Urban Informatics Lab, research led by Roth et al. (2019, 2020) demonstrates how open data, machine learning, and physics-based simulation can converge to analyse urban energy systems. Their Urban Building Energy Model (UBEM) and its extended version (A-UBEM) generate synthetic hourly energy profiles for over a million buildings in New York City, combining random forest algorithms, convex optimisation, and Monte Carlo validation.
Beyond technical performance, this work emphasises transparency and accessibility. By building replicable and open models, the research highlights the importance of data openness and methodological clarity in urban AI. It suggests that the intelligence of cities depends not only on the sophistication of analytical models but also on the accessibility of the data infrastructures supporting them.
Comparative observations
Across these laboratories, a common trajectory emerges: a shift from isolated data analysis toward more integrated and multimodal forms of urban computation. However, their primary outputs remain oriented toward prediction, simulation, classification, or visualisation. While these approaches provide powerful analytical tools, they rarely offer a unified framework for interpreting spatial patterns across heterogeneous modalities and institutional timelines.
In other words, existing laboratories produce valuable fragments of urban intelligence—visual perception, mobility structures, energy systems, or planning simulations—but these insights often remain methodologically separated. What is still lacking is a framework capable of aligning these diverse modalities into a coherent interpretive system.
Positioning the City Decoder
The City Decoder is proposed in response to this methodological gap. It builds on the interpretive tendencies observed across these laboratories—policy-linked modelling at Beijing, integrated reasoning at ETH, multimodal perception at MIT, and open-data transparency at Stanford—but seeks to articulate them within a single, transferable framework.
Rather than focusing on prediction or generative design, the City Decoder emphasises multimodal alignment, spatiotemporal pattern recognition, and evidence-linked interpretation. Its goal is not to forecast urban futures, but to reveal the layered relations already embedded within urban environments by connecting spatial data, infrastructural systems, and regulatory sequences into a unified analytical field.
In this sense, the City Decoder extends the trajectory of contemporary urban computing by repositioning artificial intelligence as a tool for decoding relational urban patterns, supporting architectural reasoning and policy interpretation rather than purely predictive or generative tasks.
Synthesis
The City Decoder stands at the centre of this research as both a conceptual and methodological proposal: a multimodal spatiotemporal pattern-recognition framework designed to decode the relational logics embedded within complex urban environments. Its objective is to make apparent patterns that emerge from the interaction of spatial, legal, ecological, infrastructural, and socio-economic systems across time. These patterns often remain difficult to interpret through conventional architectural or urban analysis, as they are produced through overlapping processes that extend beyond the limits of single datasets or disciplinary lenses. The City Decoder therefore does not aim to predict or generate future states; instead, it seeks to recognise, align, and interpret the relational structures that shape territorial transformation.
In both architecture and computational research, a model functions as a structured abstraction that organises complexity into an interpretable form. In computational logic, models mediate between data and meaning, transforming information into recognisable structures. In architecture, models act as relational constructs, linking spatial systems, actors, and temporal processes. The City Decoder adopts this dual understanding. It operates as a decoding interface—a methodological environment in which heterogeneous datasets, regulatory frameworks, and spatial traces are aligned and interpreted together. Its value lies not in simulating reality, but in clarifying it: exposing how infrastructural, institutional, and spatial systems intersect across time.
At its core, the Decoder treats the city as a layered and coded environment, composed of multiple interacting systems. Urban space is understood not as a stable object but as the product of continuous negotiation among overlapping temporalities: legal time, when regulations are enacted; infrastructural time, when systems are built or modified; ecological time, when environmental processes adapt; and social time, when patterns of occupation and use emerge (Koselleck, 2002). These temporal layers are read together rather than in isolation, allowing the framework to connect decisions, actions, and spatial outcomes. For example, a regulatory change can be traced through subsequent infrastructural developments and ecological effects, revealing how policy decisions translate into material transformations. Figure 2 illustrates the City Decoder’s layered structure and the relationships between these temporal domains.

Figure 2. City Decoder layered structure.
Operational structure
Methodologically, the City Decoder is specified as a multimodal spatiotemporal pattern-recognition pipeline composed of three main stages: data alignment, relational analysis, and interpretive output generation.
The first stage involves the collection and alignment of heterogeneous datasets within a shared spatial–temporal reference. These inputs are deliberately diverse, reflecting the layered nature of urban environments. Spatial datasets provide the physical substrate, including parcels, infrastructural networks, ecological zones, and administrative boundaries. Raster imagery, such as satellite or aerial data, captures temporal changes in land use, vegetation, and construction. Textual and regulatory sources—planning documents, legal codes, environmental assessments—introduce institutional and normative dimensions. Historical archives reconstruct longer genealogies of territorial transformation, while socio-economic indicators provide additional behavioural and demographic correlations. Through spatial referencing and temporal indexing, these datasets are aligned into a unified analytical frame.
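The alignment step can be illustrated with a minimal sketch in Python. It assumes that every piece of evidence can be reduced to a record carrying a layer label, a spatial unit identifier, and a year. The names `Observation` and `align`, the unit identifiers, and the sample records are hypothetical and are not part of the framework specification.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    layer: str    # assumed layer label, e.g. "spatial", "imagery", "regulatory"
    unit_id: str  # assumed spatial key (parcel, grid cell, or neighbourhood)
    year: int     # temporal index within the shared reference
    value: str    # the evidence itself, or a pointer to its source


def align(observations):
    """Group observations by (unit_id, year) so that layers can be read together."""
    frame = defaultdict(lambda: defaultdict(list))
    for obs in observations:
        frame[(obs.unit_id, obs.year)][obs.layer].append(obs.value)
    # Return plain dicts so that a missing key raises instead of silently appearing.
    return {key: dict(layers) for key, layers in frame.items()}


# Invented sample records, for illustration only.
records = [
    Observation("regulatory", "unit-01", 2012, "transformation law takes effect"),
    Observation("imagery", "unit-01", 2012, "demolition visible"),
    Observation("spatial", "unit-02", 2012, "parcel subdivided"),
]
frame = align(records)
```

In this sketch, `frame[("unit-01", 2012)]` holds both the regulatory and the imagery evidence for the same place and year, which is the condition the later analytical operations rely on. A real implementation would also need to reconcile differing spatial resolutions and date ranges, which this sketch leaves out.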
The second stage performs relational pattern analysis across these aligned layers. Rather than analysing each dataset independently, the framework searches for cross-layer relationships and temporal correspondences. Three primary analytical operations structure this process.
First, recurrence detection identifies spatial configurations or transformations that repeat across time or across different contexts. Second, cross-layer coupling analysis examines how changes in one domain—such as regulatory decisions or infrastructural interventions—correlate with spatial or ecological transformations. Third, discontinuity detection identifies moments where expected relationships break down, revealing gaps between policy intention and material outcomes, or between infrastructural logic and social use.
Through these operations, the Decoder identifies relational patterns that would remain invisible within isolated datasets.
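The three operations can be made concrete with a small, hedged sketch. It assumes that each spatial unit is described by a yearly sequence of categorical states and that regulatory events are given as a list of years. The function names, the five-year lag window, and the toy data are illustrative assumptions, not the framework's specified method.

```python
from collections import Counter


def transitions(series):
    """Return (year, before, after) wherever a yearly state sequence changes."""
    years = sorted(series)
    return [(b, series[a], series[b])
            for a, b in zip(years, years[1:]) if series[a] != series[b]]


def recurrences(units, min_count=2):
    """Recurrence detection: state changes that appear in at least min_count units."""
    counts = Counter()
    for series in units.values():
        # Count each kind of change once per unit.
        counts.update({(before, after) for _, before, after in transitions(series)})
    return {change: n for change, n in counts.items() if n >= min_count}


def couple(event_years, series, max_lag=5):
    """Cross-layer coupling: pair each event with the first change within max_lag years."""
    change_years = [year for year, _, _ in transitions(series)]
    pairs = {}
    for event in event_years:
        following = [y for y in change_years if 0 <= y - event <= max_lag]
        pairs[event] = min(following) if following else None
    return pairs


def discontinuities(pairs):
    """Discontinuity detection: events with no material change inside the window."""
    return [event for event, change in pairs.items() if change is None]


# Toy data, invented for illustration.
units = {
    "A": {2000: "industrial", 2005: "residential", 2010: "residential"},
    "B": {2000: "industrial", 2005: "industrial", 2010: "residential"},
    "C": {2000: "vacant", 2010: "vacant"},
}
repeated = recurrences(units)
pairs = couple([2004, 2012], units["A"])
gaps = discontinuities(pairs)
```

Here the industrial-to-residential change recurs in units A and B, the 2004 event is coupled to the 2005 change in unit A, and the 2012 event is flagged as a discontinuity because no change follows it. Temporal proximity alone does not establish that an event produced a change, so any such pairing remains a prompt for interpretation rather than a causal finding.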
The third stage translates these analytical results into interpretive outputs designed for architectural and planning reasoning. Rather than producing deterministic predictions, the system generates a set of diagnostic representations. These include pattern atlases that document recurring territorial configurations, relational mappings that illustrate interactions between spatial, legal, and infrastructural systems, and diagnostic reports that highlight inconsistencies or tensions within territorial processes. Such discontinuities—where regulations fail to produce intended spatial effects, or where infrastructural systems produce unintended consequences—are treated not as errors, but as critical entry points for interpretation.
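One way to keep such outputs traceable is to store each interpretive claim together with pointers to the evidence that supports it. The sketch below is an assumption about how a diagnostic report entry might be structured; the `Finding` type, its fields, and the sample entries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    kind: str       # assumed categories: "recurrence", "coupling", "discontinuity"
    statement: str  # the interpretive claim
    evidence: list = field(default_factory=list)  # pointers to supporting sources


def render_report(findings):
    """Render findings as plain text, leaving out any claim that lacks evidence."""
    lines = []
    for finding in findings:
        if not finding.evidence:
            continue  # an untraceable claim does not enter the report
        lines.append(f"[{finding.kind}] {finding.statement}")
        lines.extend(f"    evidence: {ref}" for ref in finding.evidence)
    return "\n".join(lines)


# Placeholder entries, invented for illustration.
report = render_report([
    Finding("discontinuity", "Zoning change not followed by built change",
            ["regulatory layer, record R-1", "imagery layer, tile T-7"]),
    Finding("coupling", "Unsupported claim"),
])
```

Only the first entry appears in `report`, because the second carries no evidence. Treating missing evidence as a reason to exclude a claim is one possible design choice; a stricter variant could raise an error instead.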
Illustrative application: Kartal District, Istanbul
The present paper proposes City Decoder as a methodological framework specification rather than a fully implemented computational system. To demonstrate that the framework’s analytical logic is both operable and productive when applied to real data, this section presents a structured illustrative application conducted manually across three heterogeneous data layers for the Kartal district on Istanbul’s Asian shore as presented in Figure 3. The application does not claim to replicate what an implemented City Decoder system would produce at scale; it demonstrates, using traceable and publicly available data, the kind of relational patterns and interpretive outputs the framework is designed to surface. The site was selected because it concentrates, within a single compact district, the very conditions the framework is designed to decode: overlapping regulatory regimes, a building stock of sharply divergent ages and construction types, a documented seismic risk profile, and a persistent gap between governance intention and material outcome.

Figure 3. Kartal district in Istanbul.
Data alignment
The first stage of the City Decoder pipeline requires the collection and alignment of heterogeneous datasets within a shared spatial–temporal reference. For Kartal, three primary layers were assembled and co-registered. The spatial layer draws on the Istanbul Metropolitan Municipality’s open building inventory (Istanbul Metropolitan Municipality [IBB], 2023), comprising approximately 30,000 building records classified by construction era, structural type, floor count, and neighbourhood. The regulatory layer draws on the municipal zoning register and the 1947 industrial zoning designation that formally structured Kartal’s western neighbourhoods, subsequently overlaid by Gecekondu Law No. 775 (1966), the 1980 prevention zone classification, and Urban Transformation Law No. 6306 (2012). The risk layer draws on the seismic scenario developed by Erdik et al. (2003) for the Istanbul Metropolitan Area, adopted as the operative legal basis for transformation decisions, and the building-level damage estimates published by the Istanbul Metropolitan Municipality Earthquake Scenario Analysis (IBB, 2023). These three layers (spatial, regulatory, and risk) were indexed against a shared temporal reference spanning 1945 to 2023, with population census records from TÜİK (1945–2022) providing a fourth, socio-demographic layer that confirms the pace of settlement relative to regulatory action. Figure 4 presents the resulting land use and urban fabric structure of the Kartal district, showing the spatial co-presence of residential, industrial, and transitional zones alongside neighbourhood boundaries and infrastructure networks.

Figure 4. Kartal district: urban fabric and land use structure.
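One way to picture the temporal side of this alignment is to index the regulatory layer against the years at which the other layers are observed. The following is a minimal sketch, not the procedure used in the manual application: the record structure, the function names, and the simplifying assumption that no instrument is repealed once introduced are all illustrative. Only the years and instrument names come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegulatoryEvent:
    year: int
    label: str


# Years and instruments as named in the text; the structure is assumed.
REGULATORY_LAYER = [
    RegulatoryEvent(1947, "Industrial zoning designation"),
    RegulatoryEvent(1966, "Gecekondu Law No. 775"),
    RegulatoryEvent(1980, "Prevention zone classification"),
    RegulatoryEvent(2012, "Urban Transformation Law No. 6306"),
]


def instruments_in_force(events, year):
    """Labels of instruments introduced on or before `year`.

    Simplifying assumption: instruments accumulate and are never repealed.
    """
    ordered = sorted(events, key=lambda event: event.year)
    return [event.label for event in ordered if event.year <= year]


def align_to_timeline(events, observation_years):
    """Map each observation year to the instruments in force at that time."""
    return {year: instruments_in_force(events, year) for year in observation_years}


# 1945 and 2023 bound the shared temporal reference; 1985 is a census year
# cited in the text.
timeline = align_to_timeline(REGULATORY_LAYER, [1945, 1985, 2023])
```

Under these assumptions the 1945 observation carries no regulatory coverage at all, while the 2023 observation carries four overlapping instruments.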
Cross-layer coupling analysis
The second analytical operation examines how changes in one domain correlate with transformations across others. Aligning the regulatory and spatial layers reveals a pattern of coupling that conventional single-modality analysis cannot surface. Kartal’s population expanded from approximately 21,000 in 1945 to over 572,000 by 1985, a 27-fold increase over four decades driven by industrial migration (TÜİK, 1945–2000). This growth occurred overwhelmingly on land that remained formally zoned as industrial or undesignated under the 1947 classification. Of the approximately 30,000 structures recorded in the building inventory for Kartal, 9,211 are coded to construction era 1 (pre-1980) and 14,491 to era 2 (1980–2000), together representing approximately 79 per cent of the building stock. The majority of that stock is reinforced concrete (Betonarme, ~25,500 structures), with a significant minority of unreinforced masonry (Yığma, ~3,887 structures) concentrated in the neighbourhoods that grew earliest and fastest. Cross-layer coupling analysis reveals that the spatial distribution of Yığma structures is not random: it is strongly correlated with the areas of earliest informal settlement, those where rapid population growth preceded both regulatory coverage and structural standards. The regulatory layer confirms that the 1947 industrial zoning designation remained formally unrepealed throughout this settlement process, producing a legal condition in which the buildings that came to house the majority of Kartal’s population were constructed on land whose designated use had never been revised to reflect residential occupation. This is the coupling pattern the City Decoder’s cross-layer analysis is designed to detect: the co-evolution of spatial form and regulatory status across time, and the structural tensions their divergence produces.
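The derived figures in this paragraph can be reproduced from the reported counts, and the coupling itself can be pictured as a correlation between two layers measured over the same spatial units. The sketch below assumes a plain Pearson coefficient as the coupling measure, which the text does not specify. The per-neighbourhood series are HYPOTHETICAL placeholders, not values from the IBB inventory.

```python
from math import sqrt

# Figures reported in the text.
POP_1945, POP_1985 = 21_000, 572_000
ERA_1, ERA_2, TOTAL_BUILDINGS = 9_211, 14_491, 30_000

fold_increase = POP_1985 / POP_1945                # about 27.2
era_1_2_share = (ERA_1 + ERA_2) / TOTAL_BUILDINGS  # about 0.79


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL placeholders: decade of first settlement and share of
# masonry structures for five unnamed neighbourhoods.
settlement_decade = [1950, 1960, 1970, 1980, 1990]
masonry_share = [0.40, 0.30, 0.22, 0.10, 0.05]

coupling = pearson(settlement_decade, masonry_share)
```

With these placeholder values the coefficient is strongly negative, meaning earlier settlement goes with a higher masonry share. A real analysis would need the actual neighbourhood-level series and a measure suited to spatially autocorrelated data.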
Discontinuity detection
The third operation identifies moments where the expected relationship between policy intent and material outcome breaks down. In Kartal, the most analytically significant discontinuity concerns seismic governance. The operative risk evidence base for transformation decisions across Kartal is the 2003 site-dependent deterministic intensity distribution developed by Erdik et al. (2003) for the Istanbul Metropolitan Area, which classifies the district’s western and central neighbourhoods at seismic intensity 7–8 and portions of the coastal edge at 8–9 under an Mw 7.4 scenario. This assessment, now over two decades old, remains the legal basis on which Urban Transformation Law No. 6306 applications are assessed and prioritised. The gap between the 2003 calibration date of this risk instrument and the urban condition it currently governs represents a measurable discontinuity between regulatory intent and material reality: the instrument was designed for a building stock and settlement pattern that has continued to evolve since its calibration, while the governance decisions it authorises remain indexed to a fixed analytical baseline. The spatial distribution of seismic damage estimates across Kartal’s neighbourhoods illustrates the material consequence of this discontinuity directly. Figure 5 maps the expected severely damaged buildings per neighbourhood under the Mw 7.5 scenario (IBB, 2023): Orhantepe records the highest concentration (13–22 severely damaged buildings per unit), followed by Hürriyet and Yunus. These are the neighbourhoods whose building stock is most densely composed of pre-1980 construction, most heavily associated with informal settlement, and most exposed to seismic intensity 8–9. The discontinuity between what was regulated and what was built, sustained over eight decades, has produced a measurable risk gradient whose spatial logic is only legible through the simultaneous alignment of the regulatory, spatial, and risk layers.

Figure 5. Kartal district: seismic damage distribution and building stock vulnerability under the Mw 7.5 scenario.
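The regulatory lag described above reduces to a simple comparison between the year an instrument was calibrated and the year of the condition it governs. The sketch below is illustrative only: the class and function names are hypothetical, the 10-year threshold is an arbitrary assumption, and the resulting lag depends on the reference year chosen (2023, the end of the shared temporal reference, is used here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GovernanceInstrument:
    name: str
    calibrated: int  # year the underlying evidence base was produced


def calibration_lag(instrument, reference_year):
    """Years elapsed between calibration and the reference year."""
    return reference_year - instrument.calibrated


def flag_discontinuity(instrument, reference_year, max_lag_years):
    """Flag an instrument whose evidence base is older than the threshold."""
    lag = calibration_lag(instrument, reference_year)
    return {
        "instrument": instrument.name,
        "lag_years": lag,
        "flagged": lag > max_lag_years,
    }


seismic_basis = GovernanceInstrument("Erdik et al. (2003) scenario", 2003)

# max_lag_years=10 is a HYPOTHETICAL threshold, not taken from the text.
report = flag_discontinuity(seismic_basis, reference_year=2023, max_lag_years=10)
```

A threshold of this kind only flags the age of the instrument. Judging whether the lag matters still requires the cross-layer reading of how the building stock has changed since calibration.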
Pattern Atlas entry
The interpretive output of this application takes the form of a prototype Pattern Atlas entry for Kartal, one of the diagnostic representations the City Decoder is designed to produce. The entry registers four relational findings. First, a recurrence pattern: the spatial concentration of vulnerable building stock in the western neighbourhoods is not a random artefact of urban growth but a recurring outcome of the structural coupling between early informal settlement, regulatory non-coverage, and construction-period constraints. Second, a cross-layer coupling pattern: population density, construction vulnerability, and seismic exposure are co-located precisely in the zones where regulatory intent diverged earliest and most persistently from settlement reality. Third, a discontinuity: the seismic governance instrument calibrated in 2003 continues to authorise transformation decisions for a building stock and risk profile that have evolved across the two decades since its adoption, producing a regulatory lag whose spatial consequences are measurable at the neighbourhood scale. Fourth, a governance collision: the co-presence of the 1947 industrial designation, gecekondu cadastral geometry, and the 2012 transformation law has produced irresolvable property-rights conflicts in the zones of highest vulnerability, stalling precisely the interventions the regulatory framework was designed to enable. Together, these four findings constitute the kind of relational, evidence-linked diagnostic output the framework is designed to generate—not a prediction of what Kartal will become, but an interpretive account of the structural logics that have produced its current condition and that constrain every governance decision taken within it.
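To make the idea of an evidence-linked entry concrete, a Pattern Atlas entry can be pictured as a record in which every finding must name the layers that support it. The sketch below is a hypothetical data structure, not a schema defined by the framework. The assignment of layers to each of the four findings is one reading of the paragraph above and should be treated as an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Finding:
    kind: str
    summary: str
    evidence_layers: tuple


@dataclass
class PatternAtlasEntry:
    site: str
    findings: list = field(default_factory=list)

    def add(self, finding):
        """Append a finding, rejecting any that cites no evidence layer."""
        if not finding.evidence_layers:
            raise ValueError("every finding must cite at least one evidence layer")
        self.findings.append(finding)

    def layers_cited(self):
        """Sorted list of all layers referenced by the entry's findings."""
        return sorted({layer for f in self.findings for layer in f.evidence_layers})


entry = PatternAtlasEntry(site="Kartal, Istanbul")
# Layer assignments below are an ASSUMED reading of the text.
entry.add(Finding("recurrence", "Vulnerable stock concentrated in western neighbourhoods",
                  ("spatial", "regulatory")))
entry.add(Finding("cross-layer coupling", "Density, vulnerability and exposure co-located",
                  ("socio-demographic", "spatial", "risk")))
entry.add(Finding("discontinuity", "2003 seismic instrument still governs decisions",
                  ("regulatory", "risk")))
entry.add(Finding("governance collision", "Overlapping 1947, gecekondu and 2012 regimes",
                  ("regulatory", "spatial")))
```

Rejecting findings without evidence at insertion time is one simple way to enforce the traceability the framework asks of its outputs.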
This illustrative application demonstrates that the operations applied here (data alignment, cross-layer coupling analysis, and discontinuity detection) are not only conceptually specified but analytically productive when applied to traceable, publicly available data. The computational implementation of the framework would extend this logic to larger territorial scales, additional data modalities, and automated pattern detection across comparable urban contexts, producing the kind of systematic and scalable interpretive intelligence the present specification anticipates.
Epistemological position
The Decoder’s epistemological position is defined by recognition rather than prediction. It does not attempt to forecast urban futures or generate optimised forms. Instead, it operates as a diagnostic instrument that reveals the underlying structures through which territories evolve. Its function is to render legible the interactions between space, time, governance, and infrastructure, transforming fragmented datasets into coherent interpretive fields.
By aligning multiple systems within a single analytical framework, the Decoder constructs what can be described as a relational grammar of the territory—a set of recurring configurations through which spatial transformations occur. This grammar is neither fixed nor universal; it is context-dependent and continuously evolving, reflecting the adaptive nature of urban systems. The model, therefore, emphasises situated interpretation rather than universal prediction.
Transferability and scope
The City Decoder is conceived as a transferable framework rather than a site-specific tool. Its structure allows it to operate across different territorial scales, from metropolitan regions to infrastructural corridors, without losing analytical coherence. Its scalability lies not in computational magnitude alone, but in its conceptual capacity to align diverse temporalities and spatial systems within a unified interpretive logic. By working through recognition rather than generalisation, the framework respects local specificity while enabling comparative readings across contexts.
Contribution of the framework
Within the broader landscape of computational urban research, the City Decoder proposes a shift from predictive and generative paradigms toward interpretive, traceable pattern recognition. It transforms artificial intelligence from a tool primarily associated with forecasting or simulation into a method for understanding how spatial transformations occur across layered systems. In this sense, decoding becomes both an analytical operation and a methodological stance: an attempt to make visible the relational structures that organise contemporary urban environments.
Rather than producing singular solutions, the framework offers a structured way of reading complex territories. It aligns data, law, and spatial form into a shared analytical field, enabling architects, planners, and researchers to identify patterns, tensions, and discontinuities within urban systems. The City Decoder thus positions recognition as the foundation for a new mode of spatial reasoning—one that seeks not to predict the future of cities, but to render their existing logics legible.
Conclusion
This paper has proposed the City Decoder as a methodological framework for advancing multimodal pattern recognition within architectural and urban research. Conceived as a spatiotemporal decoding system, the framework responds to the limitations of predominantly predictive and generative AI approaches by repositioning artificial intelligence as an instrument of spatial interpretation rather than production. Through the alignment of heterogeneous datasets (including geospatial layers, regulatory texts, archival records, and socio-behavioural indicators), the City Decoder establishes an interpretive pipeline capable of revealing the relational processes through which urban environments are continuously structured and restructured.
The comparative analysis of global city laboratories demonstrates both the analytical potential and the methodological fragmentation of current computational approaches to urban analysis. While these laboratories have developed advanced techniques for measurement, simulation, and visualisation, their outputs often remain modality-specific or oriented toward prediction. Against this landscape, the City Decoder proposes a unifying framework that foregrounds decoding as both a conceptual and operational act. By aligning spatial, regulatory, infrastructural, and ecological datasets within a shared spatiotemporal reference, the framework enables recurrence detection, cross-layer coupling analysis, and the identification of discontinuities between policy intentions and material outcomes.
The study makes three primary contributions. First, at a theoretical level, it articulates a shift from predictive and generative paradigms toward interpretive, traceable pattern recognition, reframing the role of artificial intelligence within architectural and urban inquiry. Second, at a methodological level, it introduces a transferable multimodal pipeline that specifies inputs, alignment procedures, analytical operations, and interpretive outputs for decoding complex urban systems. Third, at a practical level, it demonstrates the framework’s analytical logic through an illustrative application to the Kartal district, Istanbul, showing how data alignment, cross-layer coupling analysis, and discontinuity detection surface relational patterns—including a 21-year seismic governance lag and a structural coupling between informal settlement history and present-day vulnerability—that no single-modality analysis could produce.
Rather than forecasting urban futures or generating optimised forms, the City Decoder operates as a diagnostic instrument. Its primary function is to reveal the layered logics already embedded within urban territories, tracing how decisions, infrastructures, regulations, and everyday practices interact across time. In doing so, it positions recognition as a critical mode of design intelligence, one that complements existing predictive and generative tools by providing interpretable knowledge about the structures that shape spatial transformation.
Ultimately, the City Decoder reframes artificial intelligence as a medium of spatial understanding. By transforming fragmented datasets into coherent interpretive fields, it offers a methodological foundation for reading cities as coded, relational environments. This approach does not seek to replace existing computational methods, but to extend them toward a more interpretive and architecturally grounded mode of reasoning—one capable of making the hidden structures of urban space visible and actionable.
Footnotes
Acknowledgements
The author acknowledges the organisers of The 2025 International Conference on China Urban Development: Governing Urban Development in a Changing World for the invitation to present and for the subsequent invitation to publish in this journal. Special thanks are due to Fangzhu Zhang for her guidance during the conference and her continued support throughout the publication process. The author also extends sincere gratitude to Michele Bonino and Lai Yuan for their supervision of the ongoing PhD research and their invaluable support, and to Edoardo Bruno for joining the conference and co-presenting the collaborative work. The author further thanks Buğra Bozkurt, Lendi Osmani, Mehmet Derin İncekaş, Joseph Junior Obeng, Saurajeeta Bose Paul, and Giorgia Greco for their encouragement and support throughout the publication process. Finally, heartfelt thanks go to the author’s family, whose unwavering love and support have been a constant source of strength throughout this journey.
Ethical considerations
This research did not involve human participants, clinical trials, animals, or sensitive personal data. The study is based solely on publicly accessible sources, archival materials, geospatial datasets, and academic literature. Therefore, no institutional ethical approval was required.
This study complies with all ethical standards relevant to architectural and urban research and to computational analysis. No activities were conducted that required Institutional Review Board (IRB) or ethics committee approval.
Funding
The author disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: all work was conducted independently by the author as part of ongoing doctoral research activities at Politecnico di Torino and Tsinghua University.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
All data used in this research are publicly available, non-sensitive, and derived from open-access academic literature, publicly accessible geospatial datasets, and policy documents. The illustrative application in Section 4 draws on the Istanbul Metropolitan Municipality open building inventory (IBB, 2023), the seismic risk assessment by Erdik et al. (2003), municipal zoning records, and population census data (TÜİK, 1945–2022), all of which are publicly available institutional sources. No proprietary or restricted datasets were used. All referenced materials are cited in the bibliography.
