AITopics

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.89)

Neural Information Processing SystemsMar-21-2026, 13:49:43 GMT

Language-Driven Interactive Traffic Trajectory Generation

Realistic trajectory generation with natural language control is pivotal for advancing autonomous vehicle technology. However, previous methods focus on individual traffic participant trajectory generation, thus failing to account for the complexity of interactive traffic dynamics. In this work, we propose InteractTraj, the first language-driven traffic trajectory generator that can generate interactive traffic trajectories. InteractTraj interprets abstract trajectory descriptions into concrete formatted interaction-aware numerical codes and learns a mapping between these formatted codes and the final interactive trajectories. To interpret language descriptions, we propose a language-to-code encoder with a novel interaction-aware encoding strategy. To produce interactive traffic trajectories, we propose a code-to-trajectory decoder with interaction-aware feature aggregation that synergizes vehicle interactions with the environmental map and the vehicle moves. Extensive experiments show our method demonstrates superior performance over previous SoTA methods, offering a more realistic generation of interactive traffic trajectories with high controllability via diverse natural language commands.

artificial intelligence, interactive traffic trajectory, proceedings, (3 more...)

Technology: Information Technology > Artificial Intelligence (1.00)

Neural Information Processing SystemsFeb-17-2026, 04:23:01 GMT

cd9b4a28fb9eebe0430c3312a4898a41-Paper-Conference.pdf

data mining, machine learning, trajectory, (19 more...)

Country:

Asia > China > Sichuan Province > Chengdu (0.05)
Asia > China > Shaanxi Province > Xi'an (0.05)
Asia > China > Hong Kong (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Transportation (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Neural Information Processing SystemsFeb-16-2026, 13:51:46 GMT

Language-Driven Interactive Traffic Trajectory Generation

In this work, we propose InteractTraj, the first language-driven traffic trajectory generator that can generate interactive traffic trajectories.

large language model, machine learning, trajectory, (20 more...)

Country:

North America > United States (0.14)
Asia > China > Shanghai > Shanghai (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Information Technology (0.93)
Transportation > Ground > Road (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

arXiv.org Artificial IntelligenceDec-10-2025

DIVER: Reinforced Diffusion Breaks Imitation Bottlenecks in End-to-End Autonomous Driving

Song, Ziying, Liu, Lin, Pan, Hongyu, Liao, Bencheng, Guo, Mingzhe, Yang, Lei, Zhang, Yongchang, Xu, Shaoqing, Jia, Caiyan, Luo, Yadan

Most end-to-end autonomous driving methods rely on imitation learning from single expert demonstrations, often leading to conservative and homogeneous behaviors that limit generalization in complex real-world scenarios. In this work, we propose DIVER, an end-to-end driving framework that integrates reinforcement learning with diffusion-based generation to produce diverse and feasible trajectories. At the core of DIVER lies a reinforced diffusion-based generation mechanism. First, the model conditions on map elements and surrounding agents to generate multiple reference trajectories from a single ground-truth trajectory, alleviating the limitations of imitation learning that arise from relying solely on single expert demonstrations. Second, reinforcement learning is employed to guide the diffusion process, where reward-based supervision enforces safety and diversity constraints on the generated trajectories, thereby enhancing their practicality and generalization capability. Furthermore, to address the limitations of L2-based open-loop metrics in capturing trajectory diversity, we propose a novel Diversity metric to evaluate the diversity of multi-mode predictions.Extensive experiments on the closed-loop NAVSIM and Bench2Drive benchmarks, as well as the open-loop nuScenes dataset, demonstrate that DIVER significantly improves trajectory diversity, effectively addressing the mode collapse problem inherent in imitation learning.

machine learning, reinforcement learning, trajectory, (15 more...)

2507.04049

Country: Asia > China (0.69)

Genre: Research Report > New Finding (0.46)

Industry:

Education > Educational Setting (0.93)
Transportation > Ground > Road (0.87)
Information Technology > Robotics & Automation (0.63)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceDec-9-2025

M-STAR: Multi-Scale Spatiotemporal Autoregression for Human Mobility Modeling

Luo, Yuxiao, Zhang, Songming, Ruan, Sijie, Chen, Siran, Liu, Kang, Xu, Yang, Zheng, Yu, Yin, Ling

Modeling human mobility is vital for extensive applications such as transportation planning and epidemic modeling. With the rise of the Artificial Intelligence Generated Content (AIGC) paradigm, recent works explore synthetic trajectory generation using autoregressive and diffusion models. While these methods show promise for generating single-day trajectories, they remain limited by inefficiencies in long-term generation (e.g., weekly trajectories) and a lack of explicit spatiotemporal multi-scale modeling. This study proposes Multi-Scale Spatio-Temporal AutoRegression (M-STAR), a new framework that generates long-term trajectories through a coarse-to-fine spatiotemporal prediction process. M-STAR combines a Multi-scale Spatiotemporal Tokenizer that encodes hierarchical mobility patterns with a Transformer-based decoder for next-scale autoregressive prediction. Experiments on two real-world datasets show that M-STAR outperforms existing methods in fidelity and significantly improves generation speed. The data and codes are available at https://github.com/YuxiaoLuo0013/M-STAR.

data mining, machine learning, trajectory, (19 more...)

2512.07314

Country:

Asia > China (1.00)
North America > United States > Tennessee (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Zhura, Iana, Karaf, Sausar, Batool, Faryal, Mudalige, Nipun Dhananjaya Weerakkodi, Serpiva, Valerii, Abdulkarim, Ali Alridha, Fedoseev, Aleksey, Seyidov, Didar, Amjad, Hajira, Tsetserukou, Dzmitry

SwarmDiffusion: End-To-End Traversability-Guided Diffusion for Embodiment-Agnostic Navigation of Heterogeneous Robots

arXiv.org Artificial IntelligenceDec-9-2025

Abstract--Visual traversability estimation is critical for autonomous navigation, but existing VLM-based methods rely on hand-crafted prompts, generalize poorly across embodiments, and output only traversability maps, leaving trajectory generation to slow external planners. We propose SwarmDiffusion, a lightweight end-to-end diffusion model that jointly predicts traversability and generates a feasible trajectory from a single RGB image. T o remove the need for annotated or planner-produced paths, we introduce a planner-free trajectory construction pipeline based on randomized way-point sampling, B ezier smoothing, and regularization enforcing connectivity, safety, directionality, and path thinness. This enables learning stable motion priors without demonstrations. SwarmDiffusion leverages VLM-derived supervision without prompt engineering and conditions the diffusion process on a compact embodiment state, producing physically consistent, traversable paths that transfer across different robot platforms. Across indoor environments and two embodiments (quadruped and aerial), the method achieves 80-100% navigation success and 0.09s inference, and adapts to a new robot using only 500 additional visual samples. ELIABLE indoor navigation is fundamental to a wide range of robotic applications, including warehouse automation [1], industrial inspection [2], search and rescue, and autonomous logistics. In these settings, robots must continuously reason about where they can safely move and how to plan a feasible trajectory through cluttered, unstructured, and dynamic spaces.

machine learning, reinforcement learning, trajectory, (19 more...)

2512.02851

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.46)

Larsen, Olav Finne Praesteng, Ruocco, Massimiliano, Spitieris, Michail, Murad, Abdulmajid, Ragosta, Martina

Learning to Land Anywhere: Transferable Generative Models for Aircraft Trajectories

arXiv.org Artificial IntelligenceNov-7-2025

Access to trajectory data is a key requirement for developing and validating Air Traffic Management (ATM) solutions, yet many secondary and regional airports face severe data scarcity. This limits the applicability of machine learning methods and the ability to perform large-scale simulations or "what-if" analyses. In this paper, we investigate whether generative models trained on data-rich airports can be efficiently adapted to data-scarce airports using transfer learning. We adapt state-of-the-art diffusion- and flow-matching-based architectures to the aviation domain and evaluate their transferability between Zurich (source) and Dublin (target) landing trajectory datasets. Models are pretrained on Zurich and fine-tuned on Dublin with varying amounts of local data, ranging from 0% to 100%. Results show that diffusion-based models achieve competitive performance with as little as 5% of the Dublin data and reach baseline-level performance around 20%, consistently outperforming models trained from scratch across metrics and visual inspections. Latent flow matching and latent diffusion models also benefit from pretraining, though with more variable gains, while flow matching models show weaker generalization. Despite challenges in capturing rare trajectory patterns, these findings demonstrate the potential of transfer learning to substantially reduce data requirements for trajectory generation in ATM, enabling realistic synthetic data generation even in environments with limited historical records.

machine learning, natural language, trajectory, (14 more...)

2511.04155

Country:

Europe > Switzerland > Zürich > Zürich (0.44)
Europe > Norway (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.63)

arXiv.org Artificial IntelligenceNov-5-2025

A Unified Model for Human Mobility Generation in Natural Disasters

Long, Qingyue, Wang, Huandong, Wang, Qi Ryan, Li, Yong

Human mobility generation in disaster scenarios plays a vital role in resource allocation, emergency response, and rescue coordination. During disasters such as wildfires and hurricanes, human mobility patterns often deviate from their normal states, which makes the task more challenging. However, existing works usually rely on limited data from a single city or specific disaster, significantly restricting the model's generalization capability in new scenarios. In fact, disasters are highly sudden and unpredictable, and any city may encounter new types of disasters without prior experience. Therefore, we aim to develop a one-for-all model for mobility generation that can generalize to new disaster scenarios. However, building a universal framework faces two key challenges: 1) the diversity of disaster types and 2) the heterogeneity among different cities. In this work, we propose a unified model for human mobility generation in natural disasters (named UniDisMob). To enable cross-disaster generalization, we design physics-informed prompt and physics-guided alignment that leverage the underlying common patterns in mobility changes after different disasters to guide the generation process. To achieve cross-city generalization, we introduce a meta-learning framework that extracts universal patterns across multiple cities through shared parameters and captures city-specific features via private parameters. Extensive experiments across multiple cities and disaster scenarios demonstrate that our method significantly outperforms state-of-the-art baselines, achieving an average performance improvement exceeding 13%.

artificial intelligence, machine learning, natural language, (19 more...)

2511.01928

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceOct-22-2025

HouseTour: A Virtual Real Estate A(I)gent

Çelen, Ata, Pollefeys, Marc, Barath, Daniel, Armeni, Iro

W e introduce HouseT our, a method for spatially-aware 3D camera trajectory and natural language summary generation from a collection of images depicting an existing 3D space. Unlike existing vision-language models (VLMs), which struggle with geometric reasoning, our approach generates smooth video trajectories via a diffusion process constrained by known camera poses and integrates this information into the VLM for 3D-grounded descriptions. W e synthesize the final video using 3D Gaussian splatting to render novel views along the trajectory. T o support this task, we present the HouseT our dataset, which includes over 1,200 house-tour videos with camera poses, 3D reconstructions, and real estate descriptions. Experiments demonstrate that incorporating 3D camera trajectories into the text generation process improves performance over methods handling each task independently. W e evaluate both individual and end-to-end performance, introducing a new joint metric. Our work enables automated, professional-quality video creation for real estate and touristic applications without requiring specialized expertise or equipment.

large language model, machine learning, trajectory, (16 more...)

2510.18054

Genre: Research Report > New Finding (0.46)

Industry: Banking & Finance > Real Estate (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)