AITopics

Konstantinidis, Fabian, Sackmann, Moritz, Hofmann, Ulrich, Stiller, Christoph

Toward Efficient and Robust Behavior Models for Multi-Agent Driving Simulation

arXiv.org Artificial Intelligence, Dec-11-2025

Scalable multi-agent driving simulation requires behavior models that are both realistic and computationally efficient. We address this by optimizing the behavior model that controls individual traffic participants. To improve efficiency, we adopt an instance-centric scene representation, where each traffic participant and map element is modeled in its own local coordinate frame. This design enables efficient, viewpoint-invariant scene encoding and allows static map tokens to be reused across simulation steps. To model interactions, we employ a query-centric symmetric context encoder with relative positional encodings between local frames. We use Adversarial Inverse Reinforcement Learning to learn the behavior model and propose an adaptive reward transformation that automatically balances robustness and realism during training. Experiments demonstrate that our approach scales efficiently with the number of tokens, significantly reducing training and inference times, while outperforming several agent-centric baselines in terms of positional accuracy and robustness.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2512.05812

Country:

Europe > Germany (0.28)
Asia (0.28)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Heuillet, Maxime, Cui, Yufei, Chen, Boxing, Durand, Audrey, Parthasarathi, Prasanna

Nested-ReFT: Efficient Reinforcement Learning for Large Language Model Fine-Tuning via Off-Policy Rollouts

arXiv.org Artificial Intelligence, Nov-25-2025

Advanced reasoning in LLMs on challenging domains like mathematical reasoning can be tackled using reinforced fine-tuning (ReFT) based on verifiable rewards. In standard ReFT frameworks, a behavior model generates multiple completions with answers per problem, and each answer is then scored by a reward function. While such RL post-training methods demonstrate significant performance improvements across challenging reasoning domains, the computational cost of generating completions during training with multiple inference steps makes the training cost non-trivial. To address this, we draw inspiration from off-policy RL and speculative decoding to introduce a novel ReFT framework, dubbed Nested-ReFT, where a subset of layers of the target model acts as the behavior model to generate off-policy completions during training. The behavior model, configured with dynamic layer skipping per batch during training, decreases the inference cost compared to standard ReFT frameworks. Our theoretical analysis shows that Nested-ReFT yields unbiased gradient estimates with controlled variance. Our empirical analysis demonstrates improved computational efficiency, measured as tokens/sec, across multiple math reasoning benchmarks and model sizes. Additionally, we explore three variants of bias mitigation to minimize the off-policyness in the gradient updates, which allows performance to match the baseline ReFT performance.

large language model, machine learning, reinforcement learning, (17 more...)

2508.10123

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.65)

Neural Information Processing Systems, Nov-20-2025, 05:03:16 GMT

Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control

Notably, EDA maintains about 95% of performance and still outperforms several baselines given only 1% of Q-labelled data during fine-tuning.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

Country: Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Robots (0.67)

Neural Information Processing Systems, Oct-10-2025, 18:25:02 GMT

Aligning Diffusion Behaviors with Q-functions for Efficient Continuous Control

Notably, EDA maintains about 95% of performance and still outperforms several baselines given only 1% of Q-labelled data during fine-tuning.

arxiv preprint arxiv, diffusion policy, international conference, (12 more...)

Country: Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Robots (0.67)

Neural Information Processing Systems, Oct-9-2025, 18:00:25 GMT

08bd07d567d77d6dd8d82a4474706a5e-Paper-Conference.pdf

ai explanation, ai model, explanation, (16 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
(4 more...)

arXiv.org Artificial Intelligence, Jul-4-2025

DigiT4TAF -- Bridging Physical and Digital Worlds for Future Transportation Systems

Zipfl, Maximilian, Zwick, Pascal, Schulz, Patrick, Zofka, Marc Rene, Schotschneider, Albert, Gremmelmaier, Helen, Polley, Nikolai, Mütsch, Ferdinand, Simon, Kevin, Gottselig, Fabian, Frey, Michael, Marschall, Sergio, Stark, Akim, Müller, Maximilian, Wehmer, Marek, Kocsis, Mihai, Waldenmayer, Dominic, Schnepf, Florian, Heinrich, Erik, Pletz, Sabrina, Kölle, Matthias, Langbein-Euchner, Karin, Viehl, Alexander, Zöllner, Raoul, Zöllner, J. Marius

In the future, mobility will be strongly shaped by the increasing use of digitalization. Not only will individual road users be highly interconnected, but also the road and associated infrastructure. At that point, a Digital Twin becomes particularly appealing because, unlike a basic simulation, it offers a continuous, bilateral connection linking the real and virtual environments. This paper describes the digital reconstruction used to develop the Digital Twin of the Test Area Autonomous Driving-Baden-Württemberg (TAF-BW), Germany. The TAF-BW offers a variety of different road sections, from high-traffic urban intersections and tunnels to multilane motorways. The test area is equipped with a comprehensive Vehicle-to-Everything (V2X) communication infrastructure and multiple intelligent intersections equipped with camera sensors to facilitate real-time traffic flow monitoring. The generation of authentic data as input for the Digital Twin was achieved by extracting object lists at the intersections. This process was facilitated by the combined utilization of camera images from the intelligent infrastructure and LiDAR sensors mounted on a test vehicle. Using a unified interface, recordings from real-world detections of traffic participants can be resimulated. Additionally, the simulation framework's design and the reconstruction process are discussed. The resulting framework is made publicly available for download and utilization at: https://digit4taf-bw.fzi.de. The demonstration uses two case studies to illustrate the application of the digital twin and its interfaces: the analysis of traffic signal systems to optimize traffic flow and the simulation of security-related scenarios in the communications sector.

artificial intelligence, digital twin, traffic participant, (16 more...)

2507.024

Country: Europe > Germany > Baden-Württemberg (0.49)

Genre: Research Report (0.50)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.49)

Takahashi, Tatsuki, Maru, Chihiro, Shoji, Hiroko

Off-Policy Evaluation of Ranking Policies via Embedding-Space User Behavior Modeling

arXiv.org Machine Learning, Jun-3-2025

Off-policy evaluation (OPE) in ranking settings with large ranking action spaces, which stems from an increase in both the number of unique actions and length of the ranking, is essential for assessing new recommender policies using only logged bandit data from previous versions. To address the high variance issues associated with existing estimators, we introduce two new assumptions: no direct effect on rankings and user behavior model on ranking embedding spaces. We then propose the generalized marginalized inverse propensity score (GMIPS) estimator with statistically desirable properties compared to existing ones. Finally, we demonstrate that the GMIPS achieves the lowest MSE. Notably, among GMIPS variants, the marginalized reward interaction IPS (MRIPS) incorporates a doubly marginalized importance weight based on a cascade behavior assumption on ranking embeddings. MRIPS effectively balances the trade-off between bias and variance, even as the ranking action spaces increase and the above assumptions may not hold, as evidenced by our experiments.
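For orientation, the plain inverse propensity score (IPS) estimator that marginalized variants build on can be sketched as below. This is the textbook single-action form, not the GMIPS or MRIPS estimator from the abstract; the function name `ips_estimate` and the toy numbers are assumptions made for illustration only.

```python
def ips_estimate(rewards, logging_probs, target_probs):
    """Textbook IPS estimate of a target policy's value from logged data.

    rewards[i]       : observed reward for the i-th logged action
    logging_probs[i] : probability the logging policy gave that action
    target_probs[i]  : probability the target policy gives that action
    (All names are illustrative, not taken from the paper.)
    """
    n = len(rewards)
    if n == 0 or not (n == len(logging_probs) == len(target_probs)):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for r, p0, pe in zip(rewards, logging_probs, target_probs):
        if p0 <= 0.0:
            raise ValueError("logging probabilities must be positive")
        # Importance weight re-weights each logged reward toward the target policy.
        total += (pe / p0) * r
    return total / n


# Toy logged data (made up for this sketch).
rewards = [1.0, 0.0, 1.0, 0.0]
logging_probs = [0.5, 0.5, 0.25, 0.25]
target_probs = [0.25, 0.5, 0.5, 0.1]
print(ips_estimate(rewards, logging_probs, target_probs))  # 0.625
```

The importance weights `pe / p0` grow large when the logging policy rarely chose an action the target policy favors, which is the source of the high variance the abstract refers to for large ranking action spaces.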

artificial intelligence, estimator, machine learning, (15 more...)

2506.00446

Country: Asia > Japan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

arXiv.org Artificial Intelligence, May-2-2025

AI2-Active Safety: AI-enabled Interaction-aware Active Safety Analysis with Vehicle Dynamics

Wu, Keshu, Li, Zihao, Li, Sixu, Ye, Xinyue, Lord, Dominique, Zhou, Yang

This paper introduces an AI-enabled, interaction-aware active safety analysis framework that accounts for groupwise vehicle interactions. Specifically, the framework employs a bicycle model, augmented with road gradient considerations, to accurately capture vehicle dynamics. In parallel, a hypergraph-based AI model is developed to predict probabilistic trajectories of ambient traffic. By integrating these two components, the framework derives vehicle intra-spacing over a 3D road surface as the solution of a stochastic ordinary differential equation, yielding high-fidelity surrogate safety measures such as time-to-collision (TTC). To demonstrate its effectiveness, the framework is analyzed using stochastic numerical methods comprising 4th-order Runge-Kutta integration and AI inference, generating probability-weighted high-fidelity TTC (HF-TTC) distributions that reflect complex multi-agent maneuvers and behavioral uncertainties. Evaluated with HF-TTC against traditional constant-velocity TTC and non-interaction-aware approaches on highway datasets, the proposed framework offers a systematic methodology for active safety analysis with enhanced potential for improving safety perception in complex traffic environments.
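The traditional constant-velocity TTC that the abstract uses as a baseline can be sketched in a few lines. This is the common car-following form (gap divided by closing speed), not the paper's HF-TTC; the function name and the example values are assumptions for illustration.

```python
import math


def constant_velocity_ttc(gap_m, follower_speed_mps, leader_speed_mps):
    """Time-to-collision assuming both vehicles keep their current speed.

    gap_m is the distance between follower and leader in metres.
    Returns math.inf when the follower is not closing the gap.
    (Illustrative helper, not taken from the paper.)
    """
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0.0:
        return math.inf
    return gap_m / closing_speed


# Follower at 25 m/s, leader at 20 m/s, 30 m apart.
print(constant_velocity_ttc(30.0, 25.0, 20.0))  # 6.0
```

Because this form ignores acceleration, road gradient, and the reactions of surrounding vehicles, it yields a single deterministic number, whereas the abstract describes HF-TTC as a probability-weighted distribution.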

artificial intelligence, machine learning, vehicle, (15 more...)

2505.00322

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Artificial Intelligence, Apr-29-2025

CHARMS: A Cognitive Hierarchical Agent for Reasoning and Motion Stylization in Autonomous Driving

Wang, Jingyi, Chu, Duanfeng, Deng, Zejian, Lu, Liping, Wang, Jinxiang, Sun, Chen

To address the limitations of these approaches, we propose CHARMS, a decision-making model based on Level-k game theory [20]. The distinction between our approach and the existing methods is illustrated in Figure 1. CHARMS incorporates cognitive hierarchy theory to model diverse reasoning depths among agents, coupled with Social Value Orientation (SVO) to capture individual preferences in driving behavior. We employ a two-stage training process consisting of reinforcement learning pretraining and supervised fine-tuning (SFT) to generate decision-making models that exhibit a wide range of human-like driving styles. Additionally, we integrate Poisson cognitive hierarchy (PCH) theory to enable CHARMS to generate more complex simulation scenarios with diverse vehicle styles. The main contributions of this paper can be summarized as follows. A behavior model integrating Level-k reasoning and SVO is proposed to simulate cognitively diverse driving styles. A two-stage training scheme (DRL + SFT) ensures both style distinctiveness and behavioral realism. A scenario generation method based on PCH theory is used to control driving style distributions, with the aim of creating more realistic and behaviorally diverse simulation scenarios.
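As a rough illustration of the Poisson cognitive hierarchy idea, reasoning levels can be drawn from a Poisson distribution truncated at a maximum level. The excerpt does not say how CHARMS applies PCH in detail, so the truncation, the parameter `tau`, and both function names below are assumptions for this sketch.

```python
import math
import random


def poisson_level_probs(tau, max_level):
    """Poisson(tau) frequencies over levels 0..max_level, renormalized to sum to 1."""
    raw = [math.exp(-tau) * tau**k / math.factorial(k) for k in range(max_level + 1)]
    total = sum(raw)
    return [p / total for p in raw]


def sample_levels(n_vehicles, tau, max_level, seed=0):
    """Assign a reasoning level to each simulated vehicle (hypothetical helper)."""
    rng = random.Random(seed)
    probs = poisson_level_probs(tau, max_level)
    return rng.choices(range(max_level + 1), weights=probs, k=n_vehicles)


probs = poisson_level_probs(tau=1.5, max_level=3)
levels = sample_levels(n_vehicles=10, tau=1.5, max_level=3)
print(probs)
print(levels)
```

Raising `tau` shifts probability mass toward deeper reasoning levels, which is one simple way a scenario generator could control the mix of driving styles.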

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2504.0245

Country: Asia > China > Hubei Province (0.15)

Genre: Research Report (0.64)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.91)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.67)