Abstract
This study examines the limitations of traditional population synthesis models, which often neglect workplace location in population forecasting and generate independent projections across forecast years. To enhance forecasting accuracy, an extended population synthesis model is introduced, integrating the job-housing origin-destination (OD) matrix and individuals’ willingness to change jobs and residences. The model incorporates these factors to produce a more dynamic and realistic representation of population distribution and mobility trends. Developed using the existing job-housing OD matrix and transition willingness data, the model initially synthesizes population data through the Iterative Proportional Updating (IPU) algorithm. It then applies the OD matrix as a constraint, employing probability sampling without replacement to assign workplaces to workers, ensuring consistency with actual traffic analysis zone (TAZ) statistics. An attribute database of residential and workplace locations is established. Using survey data on job and residential mobility preferences, characteristic parameters are calibrated, and a database of transition tendencies is created. Leveraging outputs from the enhanced population synthesis model and transition propensity database, a population transition model is constructed, generating annual population projections based on stock and flow perspectives. Empirical analysis demonstrates that the model effectively tracks changes in job and residential locations, maintaining spatial-temporal continuity and providing a robust foundation for studying commuting and travel behavior.
Keywords
Introduction
Classic population synthesis models have played a crucial role in simulating and analyzing population characteristics for applications such as urban planning, transportation modeling, and policy-making. Despite their contributions, these models have notable limitations, particularly in representing spatial and temporal dynamics. Specifically, they often fail to account for individuals’ work characteristics and lack continuity across different forecast years, which impedes their utility in accurately capturing real-world population behaviors (Farooq et al., 2013).
A significant shortcoming of classic population synthesis models is their failure to incorporate the work characteristics of individuals. These models typically generate synthetic populations based on household demographics without considering employment factors, such as place of work, industry type, or commuting patterns (Prédhumeau and Manley, 2023). This omission leads to an incomplete understanding of the interaction between residential and employment locations, which is critical for urban planning and transportation modeling (Thondoo et al., 2020). For instance, accurate modeling of commuting patterns requires individuals to have assigned work locations that reflect actual urban and regional employment dynamics. Without these linkages, planning decisions, such as public transit infrastructure development, are likely to be based on unrealistic assumptions regarding the spatial relationship between residential and work areas, leading to suboptimal outcomes (Duranton and Guerra, 2016; Hörcher and Tirachini, 2021). The lack of employment data thus undermines the model’s capacity to accurately reflect land use and labor market dynamics, ultimately reducing the validity of its predictions (Elsby and Gottfries, 2022).
Another limitation of classic population synthesis models is their lack of temporal consistency. The synthetic populations produced for different forecast years are essentially independent snapshots, meaning individuals are not connected across time regarding their demographic evolution or spatial movement (Chapuis et al., 2022). This “snapshot” nature disregards the evolution of households and individuals over time, leading to inconsistencies in longitudinal analyses. For example, understanding the effects of economic policies or urban development projects requires tracking individuals and households over time to determine the outcomes of those interventions (Kühnel et al., 2024). Classic models’ inability to accurately capture dynamic behaviors, such as aging, migration, job transitions, or residential relocation, prevents a comprehensive analysis of long-term trends and policy impacts. Consequently, the insights derived from these models may lack the robustness required for effective decision-making in urban planning and policy development.
To address these limitations, recent advancements in population synthesis have focused on incorporating both spatial (e.g., place of work) and temporal linkages. Enhanced models, such as those employing microsimulation and agent-based modeling approaches, strive to create a more holistic representation of individuals, linking residential and workplace characteristics while maintaining continuity over time (Chingcuanco and Miller, 2012). For instance, agent-based models have been shown to more accurately represent population dynamics by simulating individual-level interactions and decision-making processes, allowing for a more realistic depiction of commuting behaviors and residential choices (Bastarianto et al., 2023).
Moreover, tools such as PopGen, PopulationSim, MATSim, EMME, and TransCAD have emerged as practical solutions to address some of the limitations of classic population synthesis (Tianran et al., 2023). PopGen is an open-source population synthesis tool that generates synthetic populations based on observed demographic data and provides flexibility for incorporating spatial and temporal factors. PopulationSim, on the other hand, is designed to create consistent and reproducible synthetic populations for different forecast years. MATSim is an agent-based transport simulation framework that allows for detailed modeling of individual travel behavior, providing insights into the interactions between population synthesis and transportation systems (Horni et al., 2009). EMME and TransCAD are widely used transportation planning software tools that integrate population synthesis with transportation demand modeling, allowing for effective evaluation of transport policies and infrastructure investments. These tools have significantly improved the ability to capture population dynamics in greater detail, thus providing more accurate insights into transportation needs, housing policies, and economic development strategies (Boeing and Waddell, 2017).
All of these commercial software packages and open-source tools can generate household- and individual-level attribute tables by TAZ. However, they fail to incorporate the linkage between where employed individuals live and where they work, even though job-housing origin-destination (OD) matrices have become a fundamental component of location-based service (LBS) data analysis in the era of big data. It is important to underscore that there is a causal relationship between the residences and workplaces of the working population and the origins and destinations of commuting activities. In the four-step travel demand model, trip production and attraction are calculated from population and employment (land use) data, followed by trip distribution estimation through gravity models or other methods. However, this approach to commuting trips does not constitute causal inference. In contrast, activity-based models (ABMs) introduce a workplace location choice module, which appears to offer a form of causal inference for commuting predictions. Yet, because workplace location choice is modeled using a utility maximization function, the consistency of the resulting residential–workplace relationships with real-world conditions remains to be validated. Furthermore, when forecasting from current conditions to future scenarios, the travel estimates for each target year remain independent of one another, because they do not consider the continuity of population distribution and job-housing relationships. In reality, cities evolve gradually, and behaviors such as moving house or changing jobs bridge the gap between present and future. They provide a critical variable that links existing conditions to forecasts and endows commuting activity predictions with causal explanatory power.
This paper presents a novel model that enhances traditional population synthesis by integrating job-housing data and a household travel survey. The Iterative Proportional Updating algorithm is employed to create comprehensive databases and to anticipate shifts in work and residential locations. This approach enhances the precision of commuting demand forecasts through spatio-temporal analysis. The remainder of this paper is organized as follows: The next section provides a concise overview of the classical IPU algorithm for population synthesis. This is followed by an outline of the characteristics of the survey on individuals’ willingness to relocate and change employment, providing crucial parameters for conducting stock and flow-based population projection studies. The following section presents an extended spatio-temporal evolution population model and its principal functions and algorithms. The next section describes the characteristics of the new model and the improvement effect on the traditional model in conjunction with an empirical application. Finally, we summarize the features and advantages of the new model and outline its shortcomings and potential avenues for improvement.
Conventional approach
Iterative Proportional Updating (IPU) is the classical method for population synthesis, and it is an extension of the Iterative Proportional Fitting (IPF) algorithm, designed to synthesize populations when there are multiple levels of constraints that need to be met. While IPF is well suited for adjusting weights based on a single set of marginal distributions, IPU extends this capability to situations where there are multiple sets of hierarchical constraints, such as household-level and individual-level attributes, which need to be fitted simultaneously.
IPU is widely used in synthetic population generation, particularly in contexts such as travel demand modeling and microsimulation, where it is essential to balance both household-level and person-level attributes to accurately represent a target population.
Characteristics of the change from IPF to IPU
Iterative Proportional Fitting (IPF) is generally used to fit multidimensional tables to match marginal totals by adjusting cell values iteratively. However, it becomes limited when the problem requires balancing different hierarchical constraints. For instance, fitting household characteristics and individual characteristics simultaneously necessitates an additional level of complexity beyond what IPF can handle.
Iterative Proportional Updating (IPU) addresses this limitation by considering household-level and person-level constraints in an integrated framework. It effectively balances households (e.g., income, size) and individuals (e.g., age, employment status) such that all specified control totals match their respective targets.
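To make the starting point concrete, the following is a minimal sketch of two-dimensional IPF, assuming a strictly positive seed table and row and column targets that share the same grand total. The function name, the seed values, and the targets are illustrative assumptions and do not come from the study.

```python
import numpy as np

def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Fit a 2-D seed table to row and column marginal totals (sketch)."""
    table = np.asarray(seed, dtype=float).copy()
    row_targets = np.asarray(row_targets, dtype=float)
    col_targets = np.asarray(col_targets, dtype=float)
    for _ in range(max_iter):
        # Scale every row so that the row sums match the row targets.
        table *= (row_targets / table.sum(axis=1))[:, None]
        # Scale every column so that the column sums match the column targets.
        table *= col_targets / table.sum(axis=0)
        # Columns are exact after the column step, so only rows are checked.
        if np.abs(table.sum(axis=1) - row_targets).max() < tol:
            break
    return table

# Hypothetical seed (e.g., household size by income band) and marginals.
seed = [[10, 20], [30, 40]]
fitted = ipf(seed, row_targets=[40, 60], col_targets=[35, 65])
```

Each pass rescales one dimension at a time, which is why IPF alone cannot control household-level and person-level totals together: both sets of totals would have to live on the same table.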
IPU algorithm implementation process
Below is an overview of the IPU algorithm and its steps:
1. Initialization:
Assign initial weights to the households in the seed sample.
Define control totals for the household-level and person-level attributes.
2. Iterative updates:
IPU works by updating the weights of households iteratively to meet both household-level and person-level constraints. Each iteration consists of two adjustments.
Adjust household-level weights: The household-level weights are adjusted to match the target household-level marginal distributions.
Adjust person-level weights: After adjusting for household-level targets, the next step is to adjust the household weights to ensure the individual-level marginal distributions are also satisfied.
3. Convergence criteria:
The iterative process continues until convergence is achieved. Convergence is typically defined as the total difference between the target control totals and the estimated totals falling below a specified tolerance.
IPU is highly suitable for synthetic population generation when there are hierarchical relationships between units (e.g., households and individuals). This allows for a more precise matching of real-world population structures in various applications, such as urban planning, travel demand forecasting, and other social science research contexts.
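The procedure can be illustrated with a minimal sketch of the commonly described IPU update. It assumes an incidence matrix with one row per seed household and one column per control, where household-type columns hold 0/1 and person-type columns hold the number of such persons in the household. The notation, the fit measure (mean absolute relative deviation), and the toy data are assumptions for illustration and are not the formulas or data of this study.

```python
import numpy as np

def ipu(incidence, targets, tol=1e-6, max_iter=2000):
    """Iterative Proportional Updating on household weights (sketch).

    incidence: households x controls matrix (assumed layout, see lead-in).
    targets:   control total for each column.
    Returns the household weights and the fit value after each iteration.
    """
    A = np.asarray(incidence, dtype=float)
    targets = np.asarray(targets, dtype=float)
    weights = np.ones(A.shape[0])  # initial weights, assumed to be 1
    fit_history = []
    for _ in range(max_iter):
        # Household-level columns come first, then person-level columns.
        for j in range(A.shape[1]):
            estimated = weights @ A[:, j]
            if estimated > 0:
                # Rescale only the households that contribute to control j.
                weights[A[:, j] > 0] *= targets[j] / estimated
        # Mean absolute relative deviation over all controls.
        fit = float(np.mean(np.abs(weights @ A - targets) / targets))
        fit_history.append(fit)
        if fit < tol:
            break
    return weights, fit_history

# Toy seed: columns are [household type 1, household type 2,
#                        persons of type A, persons of type B].
incidence = np.array([
    [1, 0, 1, 1],
    [1, 0, 2, 0],
    [0, 1, 1, 2],
    [0, 1, 0, 1],
])
targets = np.array([30, 70, 80, 110])  # hypothetical control totals
weights, fit_history = ipu(incidence, targets)
```

Because a person-level adjustment disturbs the household-level totals fitted just before it, the loop has to be repeated, and the fit value is tracked per iteration to decide when to stop.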
Willingness survey to change jobs and housing
The intention of the willingness survey was to ascertain the characteristics of citizens’ change of residence and change of job. In particular, the survey sought to identify the current housing characteristics of the urban population, including whether the accommodation was owned, rented, provided by an employer, or constituted some other form of tenure. It also aimed to determine the willingness to move and the preferences for the type of accommodation, as well as the willingness to change jobs.
Housing type of base year
In the 2017 Household Travel Survey conducted in Guangzhou (Chen et al., 2021), housing type data were collected from a total of 82,000 households. There are four main types of housing: owned, rented, employer-provided, and other. Owned housing accounted for 82.7% of households and rented housing for 15.9%. Based on the age of the householder, households were categorized into four generations: Baby Boomers+, Generation X, Millennials, and Generation Z (Table 1).
Table with age ranges.
The composition of housing types for each generation is shown in Figure 1. Overall, as the age of the householder increases, the share of owner-occupied housing increases significantly, while the share of renter-occupied housing decreases. Among Baby Boomers+, 92.7% own their homes, which indicates a stable place of residence. In contrast, renters account for 47.5% of Generation Z and 23.6% of Millennials, which suggests that these two generations are more likely to move.

Share of current housing type mix by generation.
Willingness to change jobs
In the 2019 supplementary survey of the Guangzhou Household Travel Survey (Jin et al., 2020), 10,000 household heads were asked about their willingness to change jobs, a household sampling rate of about 0.2%. The primary findings are as follows. Willingness to change jobs (Figure 2) is strongest among those who have worked in the same position for 1–3 years, with about a quarter expressing this intention, and it decreases as continuous tenure in the same position increases. Among people who have worked for 10–20 years, 5.6% intend to change jobs, while among those who have worked for 20 years or more the share is 1.2%, which indicates a quite stable group. Two reasons may explain the peak at 1–3 years: first, these workers have accumulated experience over a certain period and want to advance by changing platforms; second, that period has allowed them to understand whether the job is suitable for them.

Share of willingness to change job by length of tenure.
Housing was quantified using five indicators: facility convenience, level of the associated schools, whether the building is new, metro accessibility, and housing price. A housing choice modeling study was then conducted in conjunction with the results of a population preference survey. The survey sampled 264 single people, 103 newly married couples, 147 new parents, and 239 full families (two adult generations or three-generation families). Each indicator was rated as low, medium, or high and assigned 1, 3, or 5 points, respectively, according to the following criteria (Chen et al., 2024; Yang, 2020).
1) The facility convenience criterion is evaluated based on the occupancy rate. A rate of less than 40% is considered low, a rate of 40–80% is considered medium, and a rate of more than 80% is considered high. Municipal schools and provincial schools were classified as low, medium, and high, respectively.
2) Building age was categorized as follows: 20+ years, 5–20 years, and 1–5 years. These correspond to low, medium, and high, respectively.
3) Metro accessibility was classified by distance from the metro station as follows: more than 2 km, 800 m–2 km, and 800 m or less. These correspond to low, medium, and high, respectively.
4) House price was classified as low, medium, or high according to the unit price. The low category includes units priced below 50,000 yuan/square meter, the medium category includes units priced between 50,000 and 100,000 yuan/square meter, and the high category includes units priced above 100,000 yuan/square meter.
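The 1/3/5 scoring rules above can be sketched as simple threshold functions. The handling of exact boundary values (for example, a building exactly 5 years old), the function names, and the preference coefficients in the example are assumptions for illustration. The actual coefficients are those reported in Table 2 and are not reproduced here.

```python
LOW, MEDIUM, HIGH = 1, 3, 5  # points for low, medium, and high ratings

def score_facility(occupancy_rate):
    """Facility convenience from the occupancy rate (0.0 to 1.0)."""
    if occupancy_rate > 0.8:
        return HIGH
    if occupancy_rate >= 0.4:
        return MEDIUM
    return LOW

def score_building_age(age_years):
    """Newer buildings score higher; boundary handling is an assumption."""
    if age_years <= 5:
        return HIGH
    if age_years <= 20:
        return MEDIUM
    return LOW

def score_metro(distance_m):
    """Metro accessibility from the distance to the station in meters."""
    if distance_m <= 800:
        return HIGH
    if distance_m <= 2000:
        return MEDIUM
    return LOW

def preference_weighted_score(scores, coefficients):
    """Weighted average of indicator scores for one population group."""
    total = sum(coefficients[name] for name in scores)
    return sum(coefficients[name] * scores[name] for name in scores) / total

# Hypothetical dwelling and hypothetical coefficients for one group.
scores = {
    "facility": score_facility(0.85),
    "building_age": score_building_age(12),
    "metro": score_metro(600),
}
coefficients = {"facility": 1.0, "building_age": 1.0, "metro": 4.0}
group_score = preference_weighted_score(scores, coefficients)
```

Keeping the indicator scores separate from the group coefficients lets the same dwelling be scored once and then evaluated for each population group.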
Finally, the preference matrix of different types of populations was obtained, as shown in Table 2. With a score of 4, the single population presents a preference for metro and low housing prices; newly married couples present a preference for community maturity, highly qualified school district, and metro accessibility; new parents present a preference for educational resources only; and full families have a high propensity for community maturity. The composition of preferences for each demographic category is shown in Figure 3.
Preference coefficients for choosing where to live for different groups.
Note: Figures in italic represent high values, and those in bold represent low values.

Choice preference of key elements among different groups: (a) single people; (b) new couples; (c) new parents; (d) full family.
Proposed algorithm and implementation considerations
To fully account for the influence of the city’s current population on the future population forecast, this paper presents a spatio-temporal evolution population forecasting model framework comprising three parts: the extended population synthesis model of the base year, the population evolution model (life cycle, marriage, and fertility), and the job-housing transition model. The extended population synthesis model is employed to construct a comprehensive population database for the base year, encompassing data on the workplaces of the employed and the educational institutions attended by primary and secondary school students. The population evolution model is used to forecast natural growth, marriage, and reproduction. The job-housing transition model is employed to describe population change in terms of stock and flow. The final results yield a full-sample population characteristics database for each year, which can be traced back in both time and space. The complete process is illustrated in Figure 4.

The spatio-temporal evolution population forecasting model framework.
Extended population synthesis model of base year
The extended population synthesis model incorporates the workplace of employed individuals and the school location of students, extending the classical population synthesis model. The total population is obtained from the census data, the total employed population from the economic census data, and the total number of households and individuals from the household travel survey data. Furthermore, the place of employment of workers is considered, and a probabilistic sampling model is designed to assign a workplace to each worker, with the job-housing OD matrix serving as a constraint. The job-housing OD matrix is derived from long-term, large-scale location-based service (LBS) data. The school information and school district delineation used to assign a school to each student come from the education authority.
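One way to read the OD-constrained assignment is sketched below, assuming that each cell of the job-housing OD matrix is an integer count of workers who live in one TAZ and work in another. For each home zone, the row is expanded into a pool of workplace slots and workers draw from that pool without replacement, so the assigned pattern cannot exceed any OD cell. The function name and the toy data are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def assign_workplaces(worker_home, od_counts, seed=0):
    """Assign a workplace TAZ to each worker under an OD constraint (sketch).

    worker_home: home TAZ index of every synthesized worker.
    od_counts:   integer matrix, od_counts[o, d] = workers living in o
                 and working in d (assumed layout).
    """
    rng = np.random.default_rng(seed)
    worker_home = np.asarray(worker_home)
    od_counts = np.asarray(od_counts, dtype=int)
    workplace = np.full(worker_home.shape, -1, dtype=int)
    destinations = np.arange(od_counts.shape[1])
    for home in range(od_counts.shape[0]):
        workers = np.flatnonzero(worker_home == home)
        if len(workers) == 0:
            continue
        # One slot per observed commuter from this home zone.
        slots = np.repeat(destinations, od_counts[home])
        if len(workers) > len(slots):
            raise ValueError(f"zone {home}: more workers than OD slots")
        # Sampling without replacement keeps every OD cell within its count.
        workplace[workers] = rng.choice(slots, size=len(workers), replace=False)
    return workplace

# Toy example with two zones and seven workers.
od = np.array([[2, 1], [1, 3]])
homes = np.array([0, 0, 0, 1, 1, 1, 1])
workplaces = assign_workplaces(homes, od)
```

When the number of workers in a zone equals the row total, as in this toy case, the assigned pattern reproduces the OD row exactly. When there are fewer workers than slots, the draw follows the OD proportions only approximately.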
Population evolution model
The model of population evolution encompasses the entire process of human reproduction, including birth, death, marriage, and fertility. Furthermore, the model delineates the familial relationships of each individual within the urban population.
Job-housing transition model based on willingness survey
The job-housing transition model introduces the concepts of stock and flow, which are fundamental to understanding the dynamics of labor markets. The stock represents the population that has neither relocated nor changed jobs. Those who change residence, those who change jobs, those who both move and change jobs, and those who move into or out of the study area are classified as flows. All change characteristics are obtained from the willingness survey. The model can track the changes experienced by each individual and, more importantly, it aligns more closely with the actual status quo of the city, so that the predicted population attributes are inherited and carried forward in both time and space.
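The stock and flow split can be illustrated with a minimal sketch in which every person carries an annual probability of moving and of changing jobs. Drawing the two events independently, the label names, and the probabilities in the example are assumptions made for illustration. In the study these propensities come from the willingness survey, and migration into and out of the study area is not covered by this sketch.

```python
import numpy as np

def classify_transitions(p_move, p_job, seed=0):
    """Label each person as stock or as one of three flow types (sketch)."""
    rng = np.random.default_rng(seed)
    p_move = np.asarray(p_move, dtype=float)
    p_job = np.asarray(p_job, dtype=float)
    # Independent Bernoulli draws per person (an assumption of this sketch).
    moves = rng.random(p_move.shape) < p_move
    changes_job = rng.random(p_job.shape) < p_job
    return np.where(
        moves & changes_job, "move_and_job",
        np.where(moves, "move", np.where(changes_job, "job", "stock")),
    )

# Deterministic corner cases: probabilities of 0 or 1 fix the outcome.
labels = classify_transitions([0.0, 1.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0])

# Hypothetical propensities for a small population of renters.
sample = classify_transitions(np.full(1000, 0.06), np.full(1000, 0.10), seed=42)
stock_share = float(np.mean(sample == "stock"))
```

People labeled as stock keep their home and workplace from the previous year, which is what gives the forecast its memory. Only the flow groups are passed on to the residence and workplace choice steps.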
Compute household choice probability in TAZ
The probability of a person or family choosing to settle in a particular traffic analysis zone (TAZ) is based on their housing choice preferences and on the availability of housing in each TAZ.
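As an illustration only, one simple functional form consistent with this description makes the choice probability proportional to the product of the group's preference score for a TAZ and the number of available dwellings there. This proportional form, the function name, and the numbers below are assumptions and should not be read as the formula used in the study.

```python
import numpy as np

def taz_choice_probabilities(preference_scores, available_units):
    """Probability of choosing each TAZ (assumed proportional form)."""
    scores = np.asarray(preference_scores, dtype=float)
    units = np.asarray(available_units, dtype=float)
    attractiveness = scores * units  # no housing available gives zero weight
    total = attractiveness.sum()
    if total <= 0:
        raise ValueError("no TAZ has both a positive score and housing")
    return attractiveness / total

# Three hypothetical TAZs: scores for one group and vacant dwellings.
probabilities = taz_choice_probabilities([4.0, 2.0, 3.0], [100, 300, 0])

# Draw a destination TAZ for one relocating household.
rng = np.random.default_rng(0)
chosen_taz = int(rng.choice(len(probabilities), p=probabilities))
```

Under this form a zone without available housing is never chosen, regardless of how well it matches the group's preferences.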
Application and result
The empirical evidence presented here is based on data from Huangpu District in Guangzhou, with 2020 as the base year and projections for 2021–2025. The first step is to synthesize the 2020 population in order to obtain the requisite household and individual attribute data at the TAZ level. Based on the results of the population synthesis, the life cycle and migration patterns of the population are then simulated. Ultimately, the population, employment, and job-housing OD matrix for Huangpu District are generated for each year. Given the actual population and its composition, negative population growth is predicted. The simulation assumes 20,000 external migrants per year, and the new population and new jobs are assigned within the study area.
Key input data
In 2020, there were approximately 460,000 households in Huangpu District, Guangzhou City, with a total population of 1,329,000. Of these, 681,000 were male and 648,000 were female. The age structure of this population is illustrated in Table 3.
Age structure of the population of Huangpu District in 2020.
In order to record comprehensive demographic data, a personal portrait profile table (Table 4) comprising 24 fields of information pertaining to family, children, and work/education is constructed for each individual.
Full person’s attributes list.
Model validation
All findings demonstrated a high level of consistency with the parent sample. For instance, in modeling household numbers, the total number of households and the numbers of single-person, 2-person, 3-person, 4-person, and 5-or-more-person households all achieved an R² of 1 (Figure 5).

Fitting of family-attribute boundary constraints in population simulation. (a) Total number of households. (b) Number of single person households. (c) Number of 2-person households. (d) Number of 3-person households. (e) Number of 4-person households. (f) Number of 5+person households.
Furthermore, we validated the simulation results of the job-housing OD matrix. The correlation coefficient (R) between the simulated values and the observed values at the TAZ level was 0.9989 (Figure 6), indicating that the simulated matrix closely reproduces the observed job-housing pattern.

Fitting of job-housing OD Matrix in TAZ.
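The validation statistics reported above can be computed as in the following sketch, which flattens the observed and simulated OD matrices and returns the Pearson correlation coefficient R together with an R² measured around the 1:1 line. Whether the study computed R² this way or as the square of R is not stated, so that choice is an assumption, and the matrices below are made-up values.

```python
import numpy as np

def fit_statistics(observed, simulated):
    """Pearson R and 1:1-line R² between two OD matrices (sketch)."""
    observed = np.asarray(observed, dtype=float).ravel()
    simulated = np.asarray(simulated, dtype=float).ravel()
    r = float(np.corrcoef(observed, simulated)[0, 1])
    # Coefficient of determination against the line simulated == observed.
    ss_res = float(np.sum((observed - simulated) ** 2))
    ss_tot = float(np.sum((observed - observed.mean()) ** 2))
    r_squared = 1.0 - ss_res / ss_tot
    return r, r_squared

# Hypothetical 2 x 2 OD matrices at the TAZ level.
observed_od = [[120, 30], [45, 200]]
simulated_od = [[118, 33], [44, 201]]
r, r_squared = fit_statistics(observed_od, simulated_od)
```

The 1:1-line version penalizes systematic over- or under-estimation, which a high correlation alone would not reveal.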
Population synthesis forecast with memory
Upon completion of the entire program, a comprehensive report will be generated, providing detailed information for each individual (Table 5) with memory of the population’s place of residence and place of work.
Full person attributes list of the result example.
It is assumed that the fertility rate is 1.5 in 2021 and grows to 1.8 in 2025. The results show that from 2020 to 2025 there were 13,857 births and 18,533 deaths in the study area, giving a total population of approximately 1.324 million. This represents a decrease of approximately 5,000 people compared with the population recorded in 2020.
The simulation of the population forecast model indicates that over the five-year period, the existing population in the study area will experience a net movement of 79,796 individuals, representing a net change of approximately 5.9%. A total of 82,643 individuals, representing approximately 10.2% of the working population, changed jobs. Of these, 32,549, or approximately 2.5% of the existing population, relocated and changed jobs.
The results of the year-to-year changes demonstrate that a small number of people will move and that some will choose to change jobs. The proportion of the population changing jobs will be higher than the proportion of the population moving, which verifies the results reflected in the surveys on the willingness to move and to change jobs mentioned earlier. In conclusion, the urban system is not a fully variable dynamic system, but rather an inert system that changes slowly. The enhanced spatio-temporal evolution population forecasting model is grounded in the real job-housing OD matrix. This synthesizes personal information with workplace and housing attributes, combined with the willingness to change jobs and residences, to forecast the future. The model has the capability to replicate the current situation of the city, as well as to inherit and deduce future predictions of the current situation.
Summary and conclusions
The extended spatio-temporal evolution population model represents an extension of the classical population synthesis model. The population synthesis model simulates the current household and population attributes of each TAZ, using aggregate data such as statistical yearbooks as constraints and the household and population seeds obtained from the household travel survey. A probabilistic sampling method is employed to establish a link between home and workplace, based on the job-housing OD matrix obtained from LBS data. School zoning data are used to establish a correspondence between students and schools. Ultimately, a database comprising the base year family population, with spatial links and family member relationships, is established. A matrix of preferences for different populations is used to score candidate residence types and to create a selection set for the population that has not yet been assigned a residence. The probability sampling method is then employed to select the place of residence for the migrating population, which is subsequently assigned to the TAZ of that residence. A comparable approach is employed to select workplaces for the population changing jobs. Because the model is built on individuals and households, it can respond in a more nuanced way to the impact of China’s evolving family planning policies.
The findings of the empirical study indicate that the novel model demonstrates superior capacity to accommodate the existing urban context, facilitating the extrapolation of spatial and temporal dynamics from the present to the future. This is achieved through an integrated approach that considers the established job-housing relationship, the evolving residential site selection patterns, and the evolving job-changing model. The research does not focus on the capacity of urban planning as the primary objective; rather, it considers the constraints imposed by this capacity. This approach allows for a reflection on the capacity of urban planning itself, as well as the rationality of real-world developments. In this way, the research addresses the dual constraints of ideal and rationality, thereby establishing a foundation for a more truthful approach to the development of urban planning. The results of the study also serve to reinforce the conclusion that the city is a relatively stable system, encompassing the stability of the place of residence, the stability of work, and the stability of the relationship between work and residence. These stability attributes constitute the foundation of urban activities.
The extended spatio-temporal evolution population model addresses the limitation of the classical population synthesis model, which lacks workplace attributes for workers. In addition, the model incorporates the inheritance of the urban status quo through stock and flow, ensuring that the population prediction results are not independent of the urban status quo and can achieve continuity in time and space. This is more aligned with the transition from incremental planning to stock planning. Individual-based spatio-temporal attributes also furnish a more dependable data foundation for conducting activity-based model research.
The limitations are mainly in the following areas. First, the job-housing OD matrices could not take into account occupational categorization. Second, the simulation method of probabilistic sampling is a commonly used technique in the modeling process of this paper. While this method effectively reduces the amount of model calculation, it also brings the difficulty of non-reproducible calculation results. Third, the characteristics of the willingness to change jobs and residences come from the status quo survey, which needs to be kept up to date in order to adapt to the ever-changing needs of the city. Finally, based on the model framework and data conditions, the model provides better feedback for short-term forecasts, but the reliability of the model for medium- and long-term forecasts needs to be further verified. Overall, population forecasting remains a challenging endeavor, especially with regard to the complexity of immigration and emigration. Further research is needed to gain a deeper understanding of these phenomena.
In conclusion, the proposed spatio-temporal evolution population forecasting model addresses significant shortcomings of traditional population synthesis methodologies by integrating job-housing relationships, spatial and temporal dynamics, and population transition mechanisms. Empirical findings demonstrate the model’s capacity to reflect urban stability and provide nuanced insights for urban planning and transportation analysis. However, the model’s dependency on up-to-date willingness surveys, its limitations in occupational categorization, and challenges in medium- to long-term reliability highlight areas for further improvement. Future research should focus on refining probabilistic sampling methods, enhancing dynamic job-housing matrices, and exploring mechanisms to better capture migration complexities. These developments will be pivotal in advancing the robustness of urban population forecasting models. Furthermore, the extended spatio-temporal evolutionary demographic model not only provides an accurate methodology for predicting population and urban dynamics (capturing work-housing relationships, stock and flow inheritance, and activity-based considerations), but also proposes a framework for predicting population development with memory and retrospective capabilities. It focuses on reflecting real-world constraints and stability while adapting to future shifts, fully utilizing big data conditions to develop population projections with place-of-work and place-of-residence attributes based on inheritance of the current state of the city, and also providing elements of causal explanatory power for commuter traffic analysis.
Footnotes
Acknowledgements
The authors would like to express their gratitude to Guangzhou Transport Planning Research Institute Co., Ltd for their invaluable support and contributions to this research. Special thanks to Prof. Chen Xiaohong for her guidance, and to Dr. Caixia Li, Xiaosheng Lin, and Wentao Shen for their assistance with data collection and program coding.
Author contribution statements
The authors confirm contribution to the paper as follows: conceptualization: Xianlong Chen, Guosheng Jing, Yulong Luo; methodology: Xianlong Chen, Guosheng Jing; modeling: Xianlong Chen; writing – original draft and review: Xianlong Chen; writing – review and editing: Yulong Luo; writing – review: Yulong Luo; supervision: Guosheng Jing. All authors reviewed the results and approved the final version of the manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Supported by Guangzhou Transport Planning Research Institute Co., Ltd (KYHT-2023-01).
Ethical considerations
This study did not require ethical approval.
Consent to participate
Informed consent was obtained from all participants verbally, and they were informed of their right to withdraw at any time.
Consent for publication
Verbal consent for publication was obtained from all participants.
Data availability statement
The data that support the findings of this study are owned by the Guangzhou Transportation Planning Research Institute Co., Ltd. Due to proprietary restrictions, the data are not publicly available. However, access to the data may be granted upon reasonable request and with permission from the Guangzhou Transportation Planning Research Institute Co., Ltd. For further inquiries, please contact the corresponding author.
