 travel behavior


Generating Individual Travel Diaries Using Large Language Models Informed by Census and Land-Use Data

Amin, Sepehr Golrokh, Rhoads, Devin, Fakhrmoosavi, Fatemeh, Lownes, Nicholas E., Ivan, John N.

arXiv.org Artificial Intelligence

This study introduces a Large Language Model (LLM) scheme for generating individual travel diaries in agent-based transportation models. While traditional approaches rely on large quantities of proprietary household travel surveys, the method presented in this study generates personas stochastically from open-source American Community Survey (ACS) and Smart Location Database (SLD) data, then synthesizes diaries through direct prompting. This study features a novel one-to-cohort realism score: a composite of four metrics (Trip Count Score, Interval Score, Purpose Score, and Mode Score) validated against the Connecticut Statewide Transportation Study (CSTS) diaries, matched across demographic variables. The validation utilizes Jensen-Shannon Divergence to measure distributional similarities between generated and real diaries. When compared to diaries generated with classical methods (Negative Binomial for trip generation; Multinomial Logit for mode/purpose) calibrated on the validation set, LLM-generated diaries achieve comparable overall realism (LLM mean: 0.485 vs. 0.455). The LLM excels in determining trip purpose and demonstrates greater consistency (narrower realism score distribution), while classical models lead in numerical estimates of trip count and activity duration. Aggregate validation confirms the LLM's statistical representativeness (LLM mean: 0.612 vs. 0.435), demonstrating LLM's zero-shot viability and establishing a quantifiable metric of diary realism for future synthetic diary evaluation systems.
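
The Jensen-Shannon Divergence mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' realism score: the mode categories, the toy diaries, and the `1 - JSD` similarity convention are assumptions made for illustration only.

```python
import math
from collections import Counter


def distribution(labels, categories):
    """Turn a list of categorical labels into a probability vector."""
    counts = Counter(labels)
    total = sum(counts[c] for c in categories)
    return [counts[c] / total for c in categories]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric divergence bounded in [0, 1] when using base-2 logarithms."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical trip modes from a generated diary set and a survey cohort.
modes = ["car", "transit", "walk", "bike"]
generated = ["car", "car", "walk", "transit", "car", "bike"]
observed = ["car", "car", "car", "transit", "walk", "walk"]

p = distribution(generated, modes)
q = distribution(observed, modes)
jsd = jensen_shannon_divergence(p, q)
similarity = 1.0 - jsd  # assumed convention: higher means more similar
print(f"JSD = {jsd:.3f}, similarity = {similarity:.3f}")
```

Because the divergence is bounded and symmetric, it can be compared across metrics such as trip purpose and mode without rescaling.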


MICROTRIPS: MICRO-geography TRavel Intelligence and Pattern Synthesis

Wang, Yangyang, Fabusuyi, Tayo

arXiv.org Artificial Intelligence

This study presents a novel small-area estimation framework to enhance urban transportation planning through detailed characterization of travel behavior. Our approach improves on the four-step travel model by employing publicly available microdata files and machine learning methods to predict travel behavior for a representative, synthetic population at small geographic areas. This approach enables high-resolution estimation of trip generation, trip distribution, mode choice, and route assignment. Validation using ACS/PUMS work-commute datasets demonstrates that our framework achieves higher accuracy compared to conventional approaches. The resulting granular insights enable the tailoring of interventions to address localized situations and support a range of policy applications and targeted interventions, including the optimal placement of micro-fulfillment centers, effective curb-space management, and the design of more inclusive transportation solutions particularly for vulnerable communities.


Recovering Individual-Level Activity Sequences from Location-Based Service Data Using a Novel Transformer-Based Model

Luo, Weiyu, Xiong, Chenfeng

arXiv.org Artificial Intelligence

Submitted 08/01/2025. Location-Based Service (LBS) data provides critical insights into human mobility, yet its sparsity often yields incomplete trip and activity sequences, making accurate inferences about trips and activities difficult. We pose a research question: can activity sequences derived from high-quality LBS data be used to recover incomplete activity sequences at the individual level? This study proposes a new solution, the Variable Selection Network-fused Insertion Transformer (VSNIT), which integrates the Insertion Transformer's flexible sequence construction with the Variable Selection Network's dynamic covariate handling capability to recover missing segments in incomplete activity sequences while preserving existing data. The findings show that VSNIT inserts more diverse, realistic activity patterns that more closely match real-world variability, and restores disrupted activity transitions more effectively, aligning with the target. It also performs significantly better than the baseline model across all metrics. These results highlight VSNIT's superior accuracy and diversity in activity sequence recovery tasks, demonstrating its potential to enhance LBS data utility for mobility analysis. This approach offers a promising framework for future location-based research and applications.

Keywords: Sequence-to-Sequence Modeling, Location-Based Service Data, Data Sparsity, Insertion Transformer, Activity-Based Modeling, Human Mobility

Introduction (Activity-based model): Activity-based modeling (ABM) emerged in response to the limitations of traditional trip-based models, providing a more behaviorally appropriate framework for understanding travel demand (1-3).


Next-Generation Travel Demand Modeling with a Generative Framework for Household Activity Coordination

Liao, Xishun, Ma, Haoxuan, Liu, Yifan, Wei, Yuxiang, He, Brian Yueshuai, Stanford, Chris, Ma, Jiaqi

arXiv.org Artificial Intelligence

Travel demand models are critical tools for planning, policy, and mobility system design. Traditional activity-based models (ABMs), although grounded in behavioral theories, often rely on simplified rules and assumptions, and are costly to develop and difficult to adapt across different regions. This paper presents a learning-based travel demand modeling framework that synthesizes household-coordinated daily activity patterns based on a household's socio-demographic profiles. The whole framework integrates population synthesis, coordinated activity generation, location assignment, and large-scale microscopic traffic simulation into a unified system. It is fully generative, data-driven, scalable, and transferable to other regions. A full-pipeline implementation is conducted in Los Angeles with a 10 million population. Comprehensive validation shows that the model closely replicates real-world mobility patterns and matches the performance of legacy ABMs with significantly reduced modeling cost and greater scalability. With respect to the SCAG ABM benchmark, the origin-destination matrix achieves a cosine similarity of 0.97, and the daily vehicle miles traveled (VMT) in the network yields a 0.006 Jensen-Shannon Divergence (JSD) and a 9.8% mean absolute percentage error (MAPE).
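
Two of the validation metrics named in the abstract above, cosine similarity between origin-destination matrices and MAPE, can be sketched as follows. The matrices and VMT values are made-up toy numbers, and the function names are hypothetical; this is not the paper's evaluation code.

```python
import math


def cosine_similarity(matrix_a, matrix_b):
    """Cosine similarity between two equally shaped matrices, flattened to vectors."""
    flat_a = [v for row in matrix_a for v in row]
    flat_b = [v for row in matrix_b for v in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(y * y for y in flat_b))
    return dot / (norm_a * norm_b)


def mape(predicted, reference):
    """Mean absolute percentage error, in percent; reference values must be nonzero."""
    errors = [abs(p - r) / abs(r) for p, r in zip(predicted, reference)]
    return 100.0 * sum(errors) / len(errors)


# Toy 3-zone origin-destination trip counts (rows: origins, columns: destinations).
od_model = [[120, 30, 10], [25, 200, 40], [15, 35, 90]]
od_benchmark = [[110, 35, 12], [30, 190, 45], [10, 40, 100]]

# Toy daily VMT per network link group, model versus benchmark.
vmt_model = [1050.0, 480.0, 310.0]
vmt_benchmark = [1000.0, 500.0, 300.0]

print(f"OD cosine similarity: {cosine_similarity(od_model, od_benchmark):.3f}")
print(f"VMT MAPE: {mape(vmt_model, vmt_benchmark):.1f}%")
```

Cosine similarity ignores overall scale, so it captures whether trips are distributed across zone pairs in the same proportions, while MAPE penalizes magnitude errors directly.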


Individual Bus Trip Chain Prediction and Pattern Identification Considering Similarities

Huang, Xiannan, Chen, Yixin, Yuan, Quan, Yang, Chao

arXiv.org Artificial Intelligence

Predicting future bus trip chains for an existing user is of great significance for operators of public transit systems. Existing methods always treat this task as a time-series prediction problem, but the one-dimensional time-series structure cannot express the complex relationships between trips. To better capture the inherent patterns in bus travel behavior, this paper proposes a novel approach that synthesizes future bus trip chains based on those from similar days. Key similarity patterns are defined and tested using real-world data, and a similarity function is then developed to capture these patterns. Afterwards, a graph is constructed in which each day is represented as a node and each edge weight reflects the similarity between days. In addition, the trips on a given day can be regarded as the label of the corresponding node, transforming the bus trip chain prediction problem into a semi-supervised classification problem on a graph. To address this, we propose several methods and validate them on a real-world dataset of 10,000 bus users, achieving state-of-the-art prediction results. Analyzing the parameters of the similarity function reveals some interesting bus usage patterns, allowing us to cluster bus users into three types: repeat-dominated, evolve-dominated, and repeat-evolve balanced. In summary, our work demonstrates the effectiveness of similarity-based prediction for bus trip chains and provides a new perspective for analyzing individual bus travel patterns. The code for our prediction model is publicly available.
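
The idea of treating days as graph nodes with similarity-weighted edges can be illustrated with a deliberately simplified sketch. The similarity function below (a same-weekday bonus plus a recency decay) and the single-step weighted vote are assumptions chosen for illustration; the paper defines its own similarity patterns and prediction methods, which are not reproduced here.

```python
import math
from collections import defaultdict


def day_similarity(day_a, day_b, decay=0.1):
    """Hypothetical edge weight: bonus for the same weekday plus a recency term."""
    same_weekday = 1.0 if day_a % 7 == day_b % 7 else 0.0
    recency = math.exp(-decay * abs(day_a - day_b))
    return same_weekday + recency


def predict_trip_chain(target_day, labeled_days):
    """Pick the trip chain with the largest similarity-weighted vote.

    labeled_days maps a day index to the observed trip chain (a tuple of
    route identifiers); the target day is the unlabeled node.
    """
    votes = defaultdict(float)
    for day, chain in labeled_days.items():
        votes[chain] += day_similarity(target_day, day)
    return max(votes, key=votes.get)


# Toy history: day 0 is a Monday; weekdays show a commute, weekends no trips.
history = {
    0: ("R1", "R1"),
    1: ("R1", "R1"),
    5: (),
    6: (),
    7: ("R1", "R1"),
}

print(predict_trip_chain(8, history))   # a Tuesday
print(predict_trip_chain(12, history))  # a Saturday
```

A full semi-supervised method would propagate labels over the whole graph rather than vote once from direct neighbors; the sketch only shows how edge weights turn similar days into evidence.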


Toward LLM-Agent-Based Modeling of Transportation Systems: A Conceptual Framework

Liu, Tianming, Yang, Jirong, Yin, Yafeng

arXiv.org Artificial Intelligence

In transportation system demand modeling and simulation, agent-based models and microsimulations are current state-of-the-art approaches. However, existing agent-based models still have some limitations on behavioral realism and resource demand that limit their applicability. In this study, leveraging the emerging technology of large language models (LLMs) and LLM-based agents, we propose a general LLM-agent-based modeling framework for transportation systems. We argue that LLM agents not only possess the essential capabilities to function as agents but also offer promising solutions to overcome some limitations of existing agent-based models. Our conceptual framework design closely replicates the decision-making and interaction processes and traits of human travelers within transportation networks, and we demonstrate that the proposed systems can meet critical behavioral criteria for decision-making and learning behaviors using related studies and a demonstrative example of LLM agents' learning and adjustment in the bottleneck setting. Although further refinement of the LLM-agent-based modeling framework is necessary, we believe that this approach has the potential to improve transportation system modeling and simulation.


The built environment and induced transport CO2 emissions: A double machine learning approach to account for residential self-selection

Nachtigall, Florian, Wagner, Felix, Berrill, Peter, Creutzig, Felix

arXiv.org Artificial Intelligence

Understanding why travel behavior differs between residents of urban centers and suburbs is key to sustainable urban planning. Especially in light of rapid urban growth, identifying housing locations that minimize travel demand and induced CO2 emissions is crucial to mitigate climate change. While the built environment plays an important role, the precise impact on travel behavior is obfuscated by residential self-selection. To address this issue, we propose a double machine learning approach to obtain unbiased, spatially-explicit estimates of the effect of the built environment on travel-related CO2 emissions for each neighborhood by controlling for residential self-selection. We examine how socio-demographics and travel-related attitudes moderate the effect and how it decomposes across the 5Ds of the built environment. Based on a case study for Berlin and the travel diaries of 32,000 residents, we find that the built environment causes household travel-related CO2 emissions to differ by a factor of almost two between central and suburban neighborhoods in Berlin. To highlight the practical importance for urban climate mitigation, we evaluate current plans for 64,000 new residential units in terms of total induced transport CO2 emissions. Our findings underscore the significance of spatially differentiated compact development to decarbonize the transport sector.
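
The partialling-out logic behind double machine learning can be sketched on simulated data. Everything here is an assumption for illustration: the data are synthetic with a known effect of 2.0, a single confounder stands in for residential self-selection, and simple linear regressions replace the machine learning nuisance models that an actual study would use.

```python
import random


def fit_line(x, y):
    """Ordinary least squares with intercept for one regressor: (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_fitted_residuals(x, target, folds=2):
    """Residuals of target given x, each predicted by a model fit on the other fold(s)."""
    n = len(x)
    residuals = [0.0] * n
    for k in range(folds):
        train = [i for i in range(n) if i % folds != k]
        test = [i for i in range(n) if i % folds == k]
        a, b = fit_line([x[i] for i in train], [target[i] for i in train])
        for i in test:
            residuals[i] = target[i] - (a + b * x[i])
    return residuals


def dml_effect(x, d, y):
    """Regress outcome residuals on treatment residuals (partialling out x)."""
    res_d = cross_fitted_residuals(x, d)
    res_y = cross_fitted_residuals(x, y)
    return sum(a * b for a, b in zip(res_d, res_y)) / sum(a * a for a in res_d)


rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]                 # confounder (self-selection proxy)
d = [0.8 * xi + rng.gauss(0, 1) for xi in x]            # treatment (built-environment index)
y = [2.0 * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]  # outcome

theta = dml_effect(x, d, y)
naive = fit_line(d, y)[1]  # ignores the confounder
print(f"naive slope: {naive:.2f}, partialled-out estimate: {theta:.2f}")
```

The naive slope absorbs the confounder's influence, while the residual-on-residual estimate should land close to the simulated effect; cross-fitting keeps each residual from being predicted by a model trained on the same observation.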


Optimizing Bus Travel: A Novel Approach to Feature Mining with P-KMEANS and P-LDA Algorithms

Liu, Hongjie, Shi, Haotian, Fu, Sicheng, Yuan, Tengfei, Zhang, Xinhuan, Xu, Hongzhe, Ran, Bin

arXiv.org Artificial Intelligence

Customizing services for bus travel can bolster its attractiveness, optimize usage, alleviate traffic congestion, and diminish carbon emissions. This potential is realized by harnessing recent advancements in positioning communication facilities, the Internet of Things, and artificial intelligence for feature mining in public transportation. However, the inherent complexities of disorganized and unstructured public transportation data introduce substantial challenges to travel feature extraction. This study presents a bus travel feature extraction method rooted in Point of Interest (POI) data, employing enhanced P-KMEANS and P-LDA algorithms to overcome these limitations. While the KMEANS algorithm adeptly segments passenger travel paths into distinct clusters, its outcomes can be influenced by the initial K value. On the other hand, Latent Dirichlet Allocation (LDA) excels at feature identification and probabilistic interpretations yet encounters difficulties with feature intermingling and nuanced sub-feature interactions. Incorporating the POI dimension enhances our understanding of travel behavior, aligning it more closely with passenger attributes and facilitating easier data analysis. By incorporating POI data, our refined P-KMEANS and P-LDA algorithms grant a holistic insight into travel behaviors and attributes, effectively mitigating the limitations above. Consequently, this POI-centric algorithm effectively amalgamates diverse POI attributes, delineates varied travel contexts, and imparts probabilistic metrics to feature properties. Our method successfully mines the diverse aspects of bus travel, such as age, occupation, gender, sports, cost, safety, and personality traits. It effectively calculates relationships between individual travel behaviors and assigns explanatory and evaluative probabilities to POI labels, thereby enhancing bus travel optimization.


Exploring Deep Learning Approaches to Predict Person and Vehicle Trips: An Analysis of NHTS Data

Adu-Gyamfi, Kojo, Sharma, Anuj

arXiv.org Artificial Intelligence

Modern transportation planning relies heavily on accurate predictions of person and vehicle trips. However, traditional planning models often fail to account for the intricacies and dynamics of travel behavior, leading to less-than-optimal accuracy in these predictions. This study explores the potential of deep learning techniques to transform the way we approach trip predictions, and ultimately, transportation planning. Utilizing a comprehensive dataset from the National Household Travel Survey (NHTS), we developed and trained a deep learning model for predicting person and vehicle trips. The proposed model leverages the vast amount of information in the NHTS data, capturing complex, non-linear relationships that were previously overlooked by traditional models. As a result, our deep learning model achieved an impressive accuracy of 98% for person trip prediction and 96% for vehicle trip estimation. This represents a significant improvement over the performances of traditional transportation planning models, thereby demonstrating the power of deep learning in this domain. The implications of this study extend beyond just more accurate predictions. By enhancing the accuracy and reliability of trip prediction models, planners can formulate more effective, data-driven transportation policies, infrastructure, and services. As such, our research underscores the need for the transportation planning field to embrace advanced techniques like deep learning. The detailed methodology, along with a thorough discussion of the results and their implications, are presented in the subsequent sections of this paper.


Human Mobility Prediction with Causal and Spatial-constrained Multi-task Network

Huang, Zongyuan, Xu, Shengyuan, Wang, Menghan, Wu, Hansi, Xu, Yanyan, Jin, Yaohui

arXiv.org Artificial Intelligence

Modeling human mobility helps to understand how people access resources and come into physical contact with each other in cities, and thus contributes to various applications such as urban planning, epidemic control, and location-based advertisement. Next location prediction is one decisive task in individual human mobility modeling and is usually viewed as sequence modeling, solved with Markov or RNN-based methods. However, existing models pay little attention to the logic of individual travel decisions and the reproducibility of the collective behavior of the population. To this end, we propose a Causal and Spatial-constrained Long and Short-term Learner (CSLSL) for next location prediction. CSLSL utilizes a causal structure based on multi-task learning to explicitly model the "when → what → where", a.k.a. "time → activity → location", decision logic. We next propose a spatial-constrained loss function as an auxiliary task, to ensure consistency between the predicted and actual spatial distributions of travelers' destinations. Moreover, CSLSL adopts modules named Long and Short-term Capturer (LSC) to learn the transition regularities across different time spans. Extensive experiments on three real-world datasets show promising performance improvements of CSLSL over baselines and confirm the effectiveness of introducing the causality and consistency constraints. The implementation is available at https://github.com/urbanmobility/CSLSL.