AITopics | population synthesis

population synthesis

Prior-Fitted Functional Flow: In-Context Generative Models for Pharmacokinetics

Ojeda, César, Hartung, Niklas, Huisinga, Wilhelm, Jahn, Tim, Kavwele, Purity Kamene, Klose, Marian, Kumar, Piyush, Sánchez, Ramsés J., Faroughy, Darius A.

arXiv.org Machine Learning, Apr-21-2026

We introduce Prior-Fitted Functional Flows, a generative foundation model for pharmacokinetics that enables zero-shot population synthesis and individual forecasting without manual parameter tuning. We learn functional vector fields, explicitly conditioned on the sparse, irregular data of an entire study population. This enables the generation of coherent virtual cohorts as well as forecasting of partially observed patient trajectories with calibrated uncertainty. We construct a new open-access literature corpus to inform our priors, and demonstrate state-of-the-art predictive accuracy on extensive real-world datasets.

large language model, machine learning, trajectory, (20 more...)

2604.1767

Country:

North America > United States (0.14)
Europe > Austria > Vienna (0.14)
Europe > Germany (0.05)

Genre: Research Report > Experimental Study (0.66)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Population synthesis with geographic coordinates

Lenti, Jacopo, Costantini, Lorenzo, Fosch, Ariadna, Monticelli, Anna, Scala, David, Pangallo, Marco

arXiv.org Machine Learning, Oct-14-2025

It is increasingly important to generate synthetic populations with explicit coordinates rather than coarse geographic areas, yet no established methods exist to achieve this. One reason is that latitude and longitude differ from other continuous variables, exhibiting large empty spaces and highly uneven densities. To address this, we propose a population synthesis algorithm that first maps spatial coordinates into a more regular latent space using Normalizing Flows (NF), and then combines them with other features in a Variational Autoencoder (VAE) to generate synthetic populations. This approach also learns the joint distribution between spatial and non-spatial features, exploiting spatial autocorrelations. We demonstrate the method by generating synthetic homes with the same statistical properties as real homes in 121 datasets, corresponding to diverse geographies. We further propose an evaluation framework that measures both spatial accuracy and practical utility, while ensuring privacy preservation. Our results show that the NF+VAE architecture outperforms popular benchmarks, including copula-based methods and uniform allocation within geographic areas. The ability to generate geolocated synthetic populations at fine spatial resolution opens the door to applications requiring detailed geography, from household responses to floods, to epidemic spread, evacuation planning, and transport modeling.

artificial intelligence, data mining, machine learning, (18 more...)

2510.09669

Country:

Europe > Italy (0.30)
North America > United States (0.28)

Genre: Research Report > New Finding (0.86)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
Health & Medicine (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Target Population Synthesis using CT-GAN

Rastogi, Tanay, Jonsson, Daniel

arXiv.org Artificial Intelligence, Oct-2-2025

Agent-based models used in scenario planning for transportation and urban planning usually require detailed population information from the base as well as target scenarios. These populations are usually provided by synthesizing fake agents through deterministic population synthesis methods. However, these deterministic population synthesis methods face several challenges, such as handling high-dimensional data, scalability, and zero-cell issues, particularly when generating populations for target scenarios. This research looks into how a deep generative model called Conditional Tabular Generative Adversarial Network (CT-GAN) can be used to create target populations either directly from a collection of marginal constraints or through a hybrid method that combines CT-GAN with Fitness-based Synthesis Combinatorial Optimization (FBS-CO). The research evaluates the proposed population synthesis models against travel survey and zonal-level aggregated population data. Results indicate that the stand-alone CT-GAN model performs the best when compared with FBS-CO and the hybrid model. CT-GAN by itself can create realistic-looking groups that match single-variable distributions, but it struggles to maintain relationships between multiple variables. However, the hybrid model demonstrates improved performance compared to FBS-CO by leveraging CT-GAN's ability to generate a descriptive base population, which is then refined using FBS-CO to align with target-year marginals. This study demonstrates that CT-GAN represents an effective methodology for target populations and highlights how deep generative models can be successfully integrated with conventional synthesis techniques to enhance their performance.

artificial intelligence, machine learning, target population, (19 more...)

2510.00871

Country: Europe > Sweden (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)

Population Synthesis using Incomplete Information

Rastogi, Tanay, Jonsson, Daniel, Karlström, Anders

arXiv.org Artificial Intelligence, Oct-2-2025

This paper presents a population synthesis model that utilizes the Wasserstein Generative-Adversarial Network (WGAN) for training on incomplete microsamples. By using a mask matrix to represent missing values, the study proposes a WGAN training algorithm that lets the model learn from a training dataset that has some missing information. The proposed method aims to address the challenge of missing information in microsamples on one or more attributes due to privacy concerns or data collection constraints. The paper contrasts WGAN models trained on incomplete microsamples with those trained on complete microsamples, creating a synthetic population. We conducted a series of evaluations of the proposed method using a Swedish national travel survey. We validate the efficacy of the proposed method by generating synthetic populations from all the models and comparing them to the actual population dataset. The results from the experiments showed that the proposed methodology successfully generates synthetic data that closely resembles a model trained with complete data as well as the actual population. The paper contributes to the field by providing a robust solution for population synthesis with incomplete data, opening avenues for future research, and highlighting the potential of deep generative models in advancing population synthesis capabilities.
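
The mask-matrix idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the mask enters the WGAN training loop, so the usage shown here (blanking the same cells in real and generated records before they are compared) is an assumption, and `build_mask` and `apply_mask` are hypothetical helper names.

```python
def build_mask(records):
    """Return a 0/1 matrix: 1 where a value was observed, 0 where it is missing (None)."""
    return [[0 if value is None else 1 for value in row] for row in records]


def apply_mask(rows, mask, fill=0.0):
    """Replace every masked-out cell with `fill` so it carries no information.

    Assumption: applying the same mask to a real record and to a generated
    record lets a critic compare them only on the observed attributes.
    """
    return [
        [value if keep else fill for value, keep in zip(row, mask_row)]
        for row, mask_row in zip(rows, mask)
    ]


# Hypothetical microsample with two attributes and one missing value per record.
real = [[34.0, None], [None, 2.0]]
mask = build_mask(real)

# A generated batch always has every attribute; the mask hides the cells
# that are unobserved in the matching real records.
generated = [[30.5, 1.2], [41.0, 2.4]]
masked_real = apply_mask(real, mask)
masked_generated = apply_mask(generated, mask)
```
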

artificial intelligence, machine learning, wgan model, (16 more...)

doi: 10.1016/j.trpro.2025.04.011

2510.00859

Country: Europe > Sweden (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.46)
Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Next-Generation Travel Demand Modeling with a Generative Framework for Household Activity Coordination

Liao, Xishun, Ma, Haoxuan, Liu, Yifan, Wei, Yuxiang, He, Brian Yueshuai, Stanford, Chris, Ma, Jiaqi

arXiv.org Artificial Intelligence, Jul-15-2025

Travel demand models are critical tools for planning, policy, and mobility system design. Traditional activity-based models (ABMs), although grounded in behavioral theories, often rely on simplified rules and assumptions, and are costly to develop and difficult to adapt across different regions. This paper presents a learning-based travel demand modeling framework that synthesizes household-coordinated daily activity patterns based on a household's socio-demographic profiles. The whole framework integrates population synthesis, coordinated activity generation, location assignment, and large-scale microscopic traffic simulation into a unified system. It is fully generative, data-driven, scalable, and transferable to other regions. A full-pipeline implementation is conducted in Los Angeles with a population of 10 million. Comprehensive validation shows that the model closely replicates real-world mobility patterns and matches the performance of legacy ABMs with significantly reduced modeling cost and greater scalability. With respect to the SCAG ABM benchmark, the origin-destination matrix achieves a cosine similarity of 0.97, and the daily vehicle miles traveled (VMT) in the network yields a Jensen-Shannon Divergence (JSD) of 0.006 and a mean absolute percentage error (MAPE) of 9.8%.
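
The three validation metrics named in the abstract above have standard definitions, sketched below in plain Python. The abstract does not state the logarithm base used for the JSD or how the origin-destination matrix is flattened, so base 2 and flat lists are assumptions here, and the function names are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (e.g. flattened OD matrices)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def _kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD between two histograms; inputs are normalised to sum to 1 first.

    With base-2 logarithms (an assumption) the result lies in [0, 1].
    """
    total_p, total_q = sum(p), sum(q)
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * _kl_divergence(p, m) + 0.5 * _kl_divergence(q, m)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; `actual` must contain no zeros."""
    errors = [abs((a - f) / a) for a, f in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)
```

A cosine similarity near 1 and a JSD near 0 both indicate close agreement, which is the direction of the 0.97 and 0.006 values reported in the abstract.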

artificial intelligence, household member, machine learning, (18 more...)

2507.08871

Country: North America > United States > California > Los Angeles County > Los Angeles (0.35)

Genre:

Research Report (0.50)
Overview (0.40)

Industry:

Government > Regional Government (0.68)
Transportation > Infrastructure & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)
Information Technology > Modeling & Simulation (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

A Large Language Model for Feasible and Diverse Population Synthesis

Lim, Sung Yoo, Yun, Hyunsoo, Bansal, Prateek, Kim, Dong-Kyu, Kim, Eui-Jin

arXiv.org Artificial Intelligence, May-8-2025

Generating a synthetic population that is both feasible and diverse is crucial for ensuring the validity of downstream activity schedule simulation in activity-based models (ABMs). While deep generative models (DGMs), such as variational autoencoders and generative adversarial networks, have been applied to this task, they often struggle to balance the inclusion of rare but plausible combinations (i.e., sampling zeros) with the exclusion of implausible ones (i.e., structural zeros). To improve feasibility while maintaining diversity, we propose a fine-tuning method for large language models (LLMs) that explicitly controls the autoregressive generation process through topological orderings derived from a Bayesian Network (BN). Experimental results show that our hybrid LLM-BN approach outperforms both traditional DGMs and proprietary LLMs (e.g., ChatGPT-4o) with few-shot learning. Specifically, our approach achieves approximately 95% feasibility, significantly higher than the ~80% observed in DGMs, while maintaining comparable diversity, making it well-suited for practical applications. Importantly, the method is based on a lightweight open-source LLM, enabling fine-tuning and inference on standard personal computing environments. This makes the approach cost-effective and scalable for large-scale applications, such as synthesizing populations in megacities, without relying on expensive infrastructure. By initiating the ABM pipeline with high-quality synthetic populations, our method improves overall simulation reliability and reduces downstream error propagation. The source code for these methods is available for research and practical application.
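
The idea of ordering attributes by a Bayesian network before autoregressive generation can be sketched as follows. This is not the authors' code: the attribute names, the parent sets, and the text template are all hypothetical, and only the standard-library `graphlib.TopologicalSorter` (Python 3.9+) is used.

```python
from graphlib import TopologicalSorter


def attribute_order(parents):
    """Return attributes so that every attribute comes after all of its BN parents.

    `parents` maps each attribute to the list of attributes it depends on.
    """
    return list(TopologicalSorter(parents).static_order())


def serialize(record, order):
    """Write one record as text with attributes in the given order.

    Assumption: an autoregressive model trained on such strings generates each
    attribute only after the attributes it depends on are already in context.
    """
    return ", ".join(f"{name} is {record[name]}" for name in order)


# Hypothetical network: age -> income, age -> license, {license, income} -> car_ownership.
bn_parents = {
    "age": [],
    "income": ["age"],
    "license": ["age"],
    "car_ownership": ["license", "income"],
}
order = attribute_order(bn_parents)
person = {"age": 34, "income": "medium", "license": "yes", "car_ownership": "one car"}
training_text = serialize(person, order)
```

Any valid topological order satisfies the parent-before-child constraint; which of the valid orders works best for fine-tuning is not something this sketch addresses.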

large language model, machine learning, natural language, (18 more...)

2505.04196

Country: Asia > South Korea (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Transportation (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Value-Enriched Population Synthesis: Integrating a Motivational Layer

Aguilera, Alba, Albertí, Miquel, Osman, Nardine, Curto, Georgina

arXiv.org Artificial Intelligence, Aug-18-2024

In recent years, computational improvements have allowed for more nuanced, data-driven and geographically explicit agent-based simulations. So far, simulations have struggled to adequately represent the attributes that motivate the actions of the agents. In fact, existing population synthesis frameworks generate agent profiles limited to socio-demographic attributes. In this paper, we introduce a novel value-enriched population synthesis framework that integrates a motivational layer with the traditional individual and household socio-demographic layers. Our research highlights the significance of extending the profile of agents in synthetic populations by incorporating data on values, ideologies, opinions and vital priorities, which motivate the agents' behaviour. This motivational layer can help us develop a more nuanced decision-making mechanism for the agents in social simulation settings. Our methodology integrates microdata and macrodata within different Bayesian network structures. This contribution makes it possible to generate synthetic populations with integrated value systems that preserve the inherent socio-demographic distributions of the real population in any specific region.

data source, motivational, synthetic population, (17 more...)

2408.09407

Country:

Oceania > Australia (0.14)
Europe > Spain > Catalonia (0.05)
North America > Canada (0.04)
(9 more...)

Genre: Research Report (0.40)

Industry:

Government (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

A multi-objective combinatorial optimisation framework for large scale hierarchical population synthesis

Mahmood, Imran, Bishop, Nicholas, Calinescu, Anisoara, Wooldridge, Michael, Zachos, Ioannis

arXiv.org Artificial Intelligence, Jul-3-2024

In agent-based simulations, synthetic populations of agents are commonly used to represent the structure, behaviour, and interactions of individuals. However, generating a synthetic population that accurately reflects real population statistics is a challenging task, particularly when performed at scale. In this paper, we propose a multi-objective combinatorial optimisation technique for large-scale population synthesis. We demonstrate the effectiveness of our approach by generating a synthetic population for selected regions and validating it on contingency tables from real population data. Our approach supports complex hierarchical structures between individuals and households, is scalable to large populations, and achieves minimal contingency table reconstruction error. Hence, it provides a useful tool for policymakers and researchers for simulating the dynamics of complex populations.

algorithm, objective, synthetic population, (16 more...)

2407.0318

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine (0.69)
Government (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.90)

A Deep Generative Framework for Joint Households and Individuals Population Synthesis

Qian, Xiao, Gangwal, Utkarsh, Dong, Shangjia, Davidson, Rachel

arXiv.org Artificial Intelligence, Jun-30-2024

Household and individual-level sociodemographic data are essential for understanding human-infrastructure interaction and policymaking. However, the Public Use Microdata Sample (PUMS) offers only a sample at the state level, while census tract data only provides the marginal distributions of variables without correlations. Therefore, we need an accurate synthetic population dataset that maintains consistent variable correlations observed in microdata, preserves household-individual and individual-individual relationships, adheres to state-level statistics, and accurately represents the geographic distribution of the population. We propose a deep generative framework leveraging the variational autoencoder (VAE) to generate a synthetic population with the aforementioned features. The methodological contributions include (1) a new data structure for capturing household-individual and individual-individual relationships, (2) a transfer learning process with pre-training and fine-tuning steps to generate households and individuals whose aggregated distributions align with the census tract marginal distribution, and (3) a decoupled binary cross-entropy (D-BCE) loss function enabling distribution shift and out-of-sample record generation. Model results for an application in Delaware, USA demonstrate the ability to ensure the realism of generated household-individual records and accurately describe population statistics at the census tract level compared to existing methods. Furthermore, testing in North Carolina, USA yielded promising results, supporting the transferability of our method.

household, marginal distribution, microdata, (11 more...)

2407.01643

Country:

North America > United States > North Carolina (0.25)
North America > United States > Delaware > New Castle County > Newark (0.14)
Europe > United Kingdom (0.14)
(6 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

Isolated pulsar population synthesis with simulation-based inference

Graber, Vanessa, Ronchi, Michele, Pardo-Araujo, Celsa, Rea, Nanda

arXiv.org Machine Learning, Dec-22-2023

We combine pulsar population synthesis with simulation-based inference to constrain the magneto-rotational properties of isolated Galactic radio pulsars. We first develop a flexible framework to model neutron-star birth properties and evolution, focusing on their dynamical, rotational and magnetic characteristics. In particular, we sample initial magnetic-field strengths, $B$, and spin periods, $P$, from log-normal distributions and capture the late-time magnetic-field decay with a power law. Each log-normal is described by a mean, $\mu_{\log B}, \mu_{\log P}$, and standard deviation, $\sigma_{\log B}, \sigma_{\log P}$, while the power law is characterized by the index, $a_{\rm late}$, resulting in five free parameters. We subsequently model the stars' radio emission and observational biases to mimic detections with three radio surveys, and produce a large database of synthetic $P$-$\dot{P}$ diagrams by varying our input parameters. We then follow a simulation-based inference approach that focuses on neural posterior estimation and employ this database to train deep neural networks to directly infer the posterior distributions of the five model parameters. After successfully validating these individual neural density estimators on simulated data, we use an ensemble of networks to infer the posterior distributions for the observed pulsar population. We obtain $\mu_{\log B} = 13.10^{+0.08}_{-0.10}$, $\sigma_{\log B} = 0.45^{+0.05}_{-0.05}$ and $\mu_{\log P} = -1.00^{+0.26}_{-0.21}$, $\sigma_{\log P} = 0.38^{+0.33}_{-0.18}$ for the log-normal distributions, and $a_{\rm late} = -1.80^{+0.65}_{-0.61}$ for the power law at the $95\%$ credible level. Our approach represents a crucial step towards robust statistical inference for complex population-synthesis frameworks and forms the basis for future multi-wavelength analyses of Galactic pulsars.
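
The birth-property sampling step described in the abstract above can be sketched in a few lines. The defaults are the central posterior values quoted in the abstract; treating the logarithms as base 10, with $B$ in gauss and $P$ in seconds, is an assumption, as are the function names. The power-law field decay is not included because the abstract does not give its functional form.

```python
import math
import random


def sample_birth_properties(n, mu_log_b=13.10, sigma_log_b=0.45,
                            mu_log_p=-1.00, sigma_log_p=0.38, seed=0):
    """Draw n (B, P) pairs whose base-10 logarithms are normally distributed.

    Assumption: log means log10, so mu_log_b = 13.10 corresponds to a
    typical field strength of about 10**13.1.
    """
    rng = random.Random(seed)
    pulsars = []
    for _ in range(n):
        b_field = 10 ** rng.gauss(mu_log_b, sigma_log_b)
        period = 10 ** rng.gauss(mu_log_p, sigma_log_p)
        pulsars.append((b_field, period))
    return pulsars


def mean_log10(values):
    """Average of log10 over a sequence of positive numbers."""
    return sum(math.log10(v) for v in values) / len(values)


population = sample_birth_properties(20000, seed=1)
fields = [b for b, _ in population]
periods = [p for _, p in population]
```

With a large sample, `mean_log10(fields)` and `mean_log10(periods)` should land close to the chosen means, which is a quick sanity check on the sampler.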

artificial intelligence, machine learning, posterior, (18 more...)

2312.14848

Country:

North America > United States (0.27)
Europe > Spain (0.14)
Europe > Italy (0.14)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry: Energy > Oil & Gas > Upstream (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)