AITopics

In the field of railway automation, one of the key challenges has been the development of effective computer vision systems due to the limited availability of high-quality, sequential data. Traditional datasets are restricted in scope, lacking the spatio temporal context necessary for real-time decision-making, while alternative solutions introduce issues related to realism and applicability. By focusing on route-specific, contextually relevant cues, we can generate rich, sequential datasets that align more closely with real-world operational logic. The concept of milestone determination allows for the development of targeted, rule-based models that simplify the learning process by eliminating the need for generalized recognition of dynamic components, focusing instead on the critical decision points along a route. We argue that this approach provides a practical framework for training vision agents in controlled, predictable environments, facilitating safer and more efficient machine learning systems for railway automation.

artificial intelligence, information, machine learning, (14 more...)

2510.06229

Country:

North America > United States (0.46)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Transportation > Ground > Rail (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

The Alignment Auditor: A Bayesian Framework for Verifying and Refining LLM Objectives

Bou, Matthieu, Patel, Nyal, Jagota, Arjun, Krishna, Satyapriya, Parbhoo, Sonali

The objectives that Large Language Models (LLMs) implicitly optimize remain dangerously opaque, making trustworthy alignment and auditing a grand challenge. While Inverse Reinforcement Learning (IRL) can infer reward functions from behaviour, existing approaches either produce a single, overconfident reward estimate or fail to address the fundamental ambiguity of the task (non-identifiability). This paper introduces a principled auditing framework that re-frames reward inference from a simple estimation task to a comprehensive process for verification. Our framework leverages Bayesian IRL to not only recover a distribution over objectives but to enable three critical audit capabilities: (i) Quantifying and systematically reducing non-identifiability by demonstrating posterior contraction over sequential rounds of evidence; (ii) Providing actionable, uncertainty-aware diagnostics that expose spurious shortcuts and identify out-of-distribution prompts where the inferred objective cannot be trusted; and (iii) Validating policy-level utility by showing that the refined, low-uncertainty reward can be used directly in RLHF to achieve training dynamics and toxicity reductions comparable to the ground-truth alignment process. Empirically, our framework successfully audits a detoxified LLM, yielding a well-calibrated and interpretable objective that strengthens alignment guarantees. Overall, this work provides a practical toolkit for auditors, safety teams, and regulators to verify what LLMs are truly trying to achieve, moving us toward more trustworthy and accountable AI.

large language model, machine learning, wonderful social innovation, (15 more...)

2510.06096

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Bayesian Distributional Models of Executive Functioning

Kasumba, Robert, Lu, Zeyu, Marticorena, Dom CP, Zhong, Mingyang, Beggs, Paul, Pahor, Anja, Ramani, Geetha, Goffney, Imani, Jaeggi, Susanne M, Seitz, Aaron R, Gardner, Jacob R, Barbour, Dennis L

This study uses controlled simulations with known ground-truth parameters to evaluate how Distributional Latent Variable Models (DLVM) and Bayesian Distributional Active LEarning (DALE) perform in comparison to conventional Independent Maximum Likelihood Estimation (IMLE). DLVM integrates observations across multiple executive function tasks and individuals, allowing parameter estimation even under sparse or incomplete data conditions. DLVM consistently outperformed IMLE, especially under with smaller amounts of data, and converges faster to highly accurate estimates of the true distributions. In a second set of analyses, DALE adaptively guided sampling to maximize information gain, outperforming random sampling and fixed test batteries, particularly within the first 80 trials. These findings establish the advantages of combining DLVM's cross-task inference with DALE's optimal adaptive sampling, providing a principled basis for more efficient cognitive assessments.

artificial intelligence, dlvm, machine learning, (17 more...)

2510.00387

Country:

North America > United States > Maryland (0.28)
North America > United States > Pennsylvania (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Sotopia-RL: Reward Design for Social Intelligence

Yu, Haofei, Qi, Zhengyang, Zhao, Yining, Nottingham, Kolby, Xuan, Keyang, Majumder, Bodhisattwa Prasad, Zhu, Hao, Liang, Paul Pu, You, Jiaxuan

Social intelligence has become a critical capability for large language models (LLMs), enabling them to engage effectively in real-world social tasks such as collaboration and negotiation. Reinforcement learning (RL) is a natural fit for training socially intelligent agents because it allows models to learn sophisticated strategies directly through social interactions without requiring human annotations. However, there are two unique parts about social intelligence tasks: (1) the quality of individual utterances in social interactions is not strictly related to final success; (2) social interactions require multi-dimensional rubrics for success. Therefore, we argue that it is necessary to design rewards for building utterance-level multi-dimensional reward models to facilitate RL training for social intelligence tasks. To address these challenges, we propose Sotopia-RL, a novel framework that refines coarse episode-level feedback into utterance-level, multi-dimensional rewards. Utterance-level credit assignment attributes outcomes to individual utterances, while multi-dimensional rewards capture the full richness of social interactions and reduce reward hacking. Experiments in Sotopia, an open-ended social learning environment, demonstrate that Sotopia-RL achieves state-of-the-art social goal completion scores (7.17 on Sotopia-hard and 8.31 on Sotopia-full), significantly outperforming existing approaches. Ablation studies confirm the necessity of both utterance-level credit assignment and multi-dimensional reward design for RL training.

large language model, machine learning, utterance, (20 more...)

2508.03905

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.67)

Industry: Education > Curriculum (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(4 more...)

Zhou, Fangting, Lischka, Attila, Kulcsar, Balazs, Wu, Jiaming, Chehreghani, Morteza Haghir, Laporte, Gilbert

Learning for routing: A guided review of recent developments and future directions

This paper reviews the current progress in applying machine learning (ML) tools to solve NP-hard combinatorial optimization problems, with a focus on routing problems such as the traveling salesman problem (TSP) and the vehicle routing problem (VRP). Due to the inherent complexity of these problems, exact algorithms often require excessive computational time to find optimal solutions, while heuristics can only provide approximate solutions without guaranteeing optimality. With the recent success of machine learning models, there is a growing trend in proposing and implementing diverse ML techniques to enhance the resolution of these challenging routing problems. We propose a taxonomy categorizing ML-based routing methods into construction-based and improvement-based approaches, highlighting their applicability to various problem characteristics. This review aims to integrate traditional OR methods with state-of-the-art ML techniques, providing a structured framework to guide future research and address emerging VRP variants.

evolutionary algorithm, large language model, machine learning, (19 more...)

doi: 10.1016/j.tre.2025.104278

2507.00218

Country:

Europe (1.00)
North America > United States (0.92)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(7 more...)

Neural Information Processing SystemsOct-8-2025, 23:43:13 GMT

Structure Learning with Adaptive Random Neighborhood Informed MCMC

In this paper, we introduce a novel MCMC sampler, P ARNI-DAG, for a fully-Bayesian approach to the problem of structure learning under observational data.

artificial intelligence, machine learning, proposal, (17 more...)

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Neural Information Processing SystemsOct-8-2025, 23:26:41 GMT

7da558c6bd476ba77f5ba712626bba1a-Paper-Conference.pdf

artificial intelligence, dataset, machine learning, (14 more...)

Country:

Asia > China > Hong Kong (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Overview (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(3 more...)

Neural Information Processing SystemsOct-8-2025, 23:25:13 GMT

A Rigorous Link between Deep Ensembles and (Variational) Bayesian Methods Veit D. Wild

We establish the first mathematically rigorous link between Bayesian, variational Bayesian, and ensemble methods. A key step towards this is to reformulate the non-convex optimisation problem typically encountered in deep learning as a convex optimisation in the space of probability measures.

artificial intelligence, bayesian inference, machine learning, (15 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Neural Information Processing SystemsOct-8-2025, 23:24:54 GMT

7cefded8659ccc899196860af674b596-Paper-Conference.pdf

artificial intelligence, bayesian inference, machine learning, (20 more...)

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Neural Information Processing SystemsOct-8-2025, 23:13:32 GMT

Provably Efficient Offline Reinforcement Learning in Regular Decision Processes

Most reinforcement learning (RL) algorithms hinge on the Markovian assumption, i.e. that the underlying system transitions and rewards are Markovian in some natural notion of (observable)

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)