AITopics

2509.19512

Genre: Research Report (1.00)

Industry: Transportation (0.96)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Neural Information Processing SystemsSep-24-2025, 22:02:45 GMT

3e98410c45ea98addec555019bbae8eb-Supplemental.pdf

algorithm, gosprl, sample complexity, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
North America > United States > California (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Machine LearningSep-24-2025

Fast Linear Solvers via AI-Tuned Markov Chain Monte Carlo-based Matrix Inversion

Lebedev, Anton, Lee, Won Kyung, Ghosh, Soumyadip, Yaman, Olha I., Kalantzis, Vassilis, Lu, Yingdong, Nowicki, Tomasz, Ubaru, Shashanka, Horesh, Lior, Alexandrov, Vassil

Large, sparse linear systems are pervasive in modern science and engineering, and Krylov subspace solvers are an established means of solving them. Yet convergence can be slow for ill-conditioned matrices, so practical deployments usually require preconditioners. Markov chain Monte Carlo (MCMC)-based matrix inversion can generate such preconditioners and accelerate Krylov iterations, but its effectiveness depends on parameters whose optima vary across matrices; manual or grid search is costly. We present an AI-driven framework recommending MCMC parameters for a given linear system. A graph neural surrogate predicts preconditioning speed from $A$ and MCMC parameters. A Bayesian acquisition function then chooses the parameter sets most likely to minimise iterations. On a previously unseen ill-conditioned system, the framework achieves better preconditioning with 50\% of the search budget of conventional methods, yielding about a 10\% reduction in iterations to convergence. These results suggest a route for incorporating MCMC-based preconditioners into large-scale systems.

matrix, preconditioner, surrogate model, (9 more...)

arXiv.org Machine Learning

doi: 10.1145/3731599.3767543

2509.18452

Country:

North America > United States > New York (0.05)
Europe > United Kingdom > England > Cheshire > Warrington (0.04)
North America > United States > Missouri > St. Louis County > St. Louis (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.62)

Dube, Rohit, Gautam, Natarajan, Banerjee, Amarnath, Nagarajan, Harsha

Hierarchical Semi-Markov Models with Duration-Aware Dynamics for Activity Sequences

arXiv.org Machine LearningSep-24-2025

Residential electricity demand at granular scales is driven by what people do and for how long. Accurately forecasting this demand for applications like microgrid management and demand response therefore requires generative models that can produce realistic daily activity sequences, capturing both the timing and duration of human behavior. This paper develops a generative model of human activity sequences using nationally representative time-use diaries at a 10-minute resolution. We use this model to quantify which demographic factors are most critical for improving predictive performance. We propose a hierarchical semi-Markov framework that addresses two key modeling challenges. First, a time-inhomogeneous Markov \emph{router} learns the patterns of ``which activity comes next." Second, a semi-Markov \emph{hazard} component explicitly models activity durations, capturing ``how long" activities realistically last. To ensure statistical stability when data are sparse, the model pools information across related demographic groups and time blocks. The entire framework is trained and evaluated using survey design weights to ensure our findings are representative of the U.S. population. On a held-out test set, we demonstrate that explicitly modeling durations with the hazard component provides a substantial and statistically significant improvement over purely Markovian models. Furthermore, our analysis reveals a clear hierarchy of demographic factors: Sex, Day-Type, and Household Size provide the largest predictive gains, while Region and Season, though important for energy calculations, contribute little to predicting the activity sequence itself. The result is an interpretable and robust generator of synthetic activity traces, providing a high-fidelity foundation for downstream energy systems modeling.

activity sequence, probability, sequence, (15 more...)

arXiv.org Machine Learning

2509.18414

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Power Industry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

PIGDreamer: Privileged Information Guided World Models for Safe Partially Observable Reinforcement Learning

Huang, Dongchi, Wang, Jiaqi, Li, Yang, Xia, Chunhe, Zhang, Tianle, Zhang, Kaige

Partial observability presents a significant challenge for Safe Reinforcement Learning (Safe RL), as it impedes the identification of potential risks and rewards. Leveraging specific types of privileged information during training to mitigate the effects of partial observability has yielded notable empirical successes. In this paper, we propose Asymmetric Constrained Partially Observable Markov Decision Processes (ACPOMDPs) to theoretically examine the advantages of incorporating privileged information in Safe RL. Building upon ACPOMDPs, we propose the Privileged Information Guided Dreamer (PIGDreamer), a model-based RL approach that leverages privileged information to enhance the agent's safety and performance through privileged representation alignment and an asymmetric actor-critic structure. Our empirical results demonstrate that PIGDreamer significantly outperforms existing Safe RL methods. Furthermore, compared to alternative privileged RL methods, our approach exhibits enhanced performance, robustness, and efficiency. Codes are available at: https://github.com/hggforget/PIGDreamer.

information, machine learning, reinforcement learning, (15 more...)

2508.02159

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Code Driven Planning with Domain-Adaptive Critic

Tian, Zikang, Peng, Shaohui, Huang, Du, Guo, Jiaming, Chen, Ruizhi, Zhang, Rui, Zhang, Xishan, Guo, Yuxuan, Du, Zidong, Guo, Qi, Li, Ling, Pu, Yewen, Hu, Xing, Chen, Yunji

Large Language Models (LLMs) have been widely adopted as task planners for AI agents in sequential decision-making problems, leveraging their extensive world knowledge. However, the gap between their general knowledge and environment-specific requirements often leads to inaccurate plans. To address this, existing approaches rely on frequent LLM queries to iteratively refine plans based on immediate environmental feedback, which incurs substantial query costs. However, this refinement is typically guided by short-term environmental feedback, limiting LLMs from developing plans aligned with long-term rewards. We propose Code Driven Planning with Domain-Adaptive Critic (CoPiC). Instead of relying on frequent queries, CoPiC employs LLMs to generate a diverse set of high-level planning programs, which iteratively produce and refine candidate plans. A trained domain-adaptive critic then evaluates these candidates and selects the one most aligned with long-term rewards for execution. Using high-level planning programs as planner and domain-adaptive critic as estimator, CoPiC improves planning while significantly reducing query costs. Results in ALFWorld, NetHack, and StarCraft II Unit Building show that CoPiC outperforms advanced LLM-based baselines, AdaPlanner and Reflexion, achieving an average (1) 23.33% improvement in success rate and (2) 91.27% reduction in query costs.

large language model, machine learning, natural language, (18 more...)

2509.19077

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

VGGT-DP: Generalizable Robot Control via Vision Foundation Models

Ge, Shijia, Zhang, Yinxin, Xie, Shuzhao, Zhang, Weixiang, Zhou, Mingcai, Wang, Zhi

Visual imitation learning frameworks allow robots to learn manipulation skills from expert demonstrations. While existing approaches mainly focus on policy design, they often neglect the structure and capacity of visual encoders--limiting spatial understanding and generalization. Inspired by biological vision systems, which rely on both visual and proprioceptive cues for robust control, we propose VGGT-DP, a visuomotor policy framework that integrates geometric priors from a pretrained 3D perception model with proprioceptive feedback. We adopt the Visual Geometry Grounded Transformer (VGGT) as the visual encoder and introduce a proprioception-guided visual learning strategy to align perception with internal robot states, improving spatial grounding and closed-loop control. To reduce inference latency, we design a frame-wise token reuse mechanism that compacts multi-view tokens into an efficient spatial representation. We further apply random token pruning to enhance policy robustness and reduce overfitting. Experiments on challenging MetaWorld tasks show that VGGT -DP significantly outperforms strong baselines such as DP and DP3, particularly in precision-critical and long-horizon scenarios.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

2509.18778

Country: Asia > China (0.47)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.35)

LLM-Enhanced Self-Evolving Reinforcement Learning for Multi-Step E-Commerce Payment Fraud Risk Detection

Qu, Bo, Wang, Zhurong, Yagi, Daisuke, Xu, Zhen, Zhao, Yang, Shan, Yinan, Zahradnik, Frank

This paper presents a novel approach to e-commerce payment fraud detection by integrating reinforcement learning (RL) with Large Language Models (LLMs). By framing transaction risk as a multi-step Markov Decision Process (MDP), RL optimizes risk detection across multiple payment stages. Crafting effective reward functions, essential for RL model success, typically requires significant human expertise due to the complexity and variability in design. LLMs, with their advanced reasoning and coding capabilities, are well-suited to refine these functions, offering improvements over traditional methods. Our approach leverages LLMs to iteratively enhance reward functions, achieving better fraud detection accuracy and demonstrating zero-shot capability. Experiments with real-world data confirm the effectiveness, robustness, and resilience of our LLM-enhanced RL framework through long-term evaluations, underscoring the potential of LLMs in advancing industrial RL applications.

large language model, machine learning, natural language, (20 more...)

doi: 10.18653/v1/2025.acl-industry.9

2509.18719

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.48)

Industry:

Information Technology > Services > e-Commerce Services (0.62)
Law Enforcement & Public Safety > Fraud (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Tiwari, Navya, Vazhaeparampil, Joseph, Preston, Victoria

Assistive Decision-Making for Right of Way Navigation at Uncontrolled Intersections

Uncontrolled intersections account for a significant fraction of roadway crashes due to ambiguous right-of-way rules, occlusions, and unpredictable driver behavior. While autonomous vehicle research has explored uncertainty-aware decision making, few systems exist to retrofit human-operated vehicles with assistive navigation support. We present a driver-assist framework for right-of-way reasoning at uncontrolled intersections, formulated as a Partially Observable Markov Decision Process (POMDP). Using a custom simulation testbed with stochastic traffic agents, pedestrians, occlusions, and adversarial scenarios, we evaluate four decision-making approaches: a deterministic finite state machine (FSM), and three probabilistic planners: QMDP, POMCP, and DESPOT. Results show that probabilistic planners outperform the rule-based baseline, achieving up to 97.5 percent collision-free navigation under partial observability, with POMCP prioritizing safety and DESPOT balancing efficiency and runtime feasibility. Our findings highlight the importance of uncertainty-aware planning for driver assistance and motivate future integration of sensor fusion and environment perception modules for real-time deployment in realistic traffic environments.

artificial intelligence, machine learning, vehicle, (17 more...)

2509.18407

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.68)

Industry:

Automobiles & Trucks (0.87)
Transportation > Ground > Road (0.69)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.90)

Mbuya, Jonathan Kabala, Pfoser, Dieter, Anastasopoulos, Antonios

Graph Enhanced Trajectory Anomaly Detection

Trajectory anomaly detection is essential for identifying unusual and unexpected movement patterns in applications ranging from intelligent transportation systems to urban safety and fraud prevention. Existing methods only consider limited aspects of the trajectory nature and its movement space by treating trajectories as sequences of sampled locations, with sampling determined by positioning technology, e.g., GPS, or by high-level abstractions such as staypoints. Trajectories are analyzed in Euclidean space, neglecting the constraints and connectivity information of the underlying movement network, e.g., road or transit networks. The proposed Graph Enhanced Trajectory Anomaly Detection (GETAD) framework tightly integrates road network topology, segment semantics, and historical travel patterns to model trajectory data. GETAD uses a Graph Attention Network to learn road-aware embeddings that capture both physical attributes and transition behavior, and augments these with graph-based positional encodings that reflect the spatial layout of the road network. A Transformer-based decoder models sequential movement, while a multiobjective loss function combining autoregressive prediction and supervised link prediction ensures realistic and structurally coherent representations. To improve the robustness of anomaly detection, we introduce Confidence Weighted Negative Log Likelihood (CW NLL), an anomaly scoring function that emphasizes high-confidence deviations. Experiments on real-world and synthetic datasets demonstrate that GETAD achieves consistent improvements over existing methods, particularly in detecting subtle anomalies in road-constrained environments. These results highlight the benefits of incorporating graph structure and contextual semantics into trajectory modeling, enabling more precise and context-aware anomaly detection.

data mining, machine learning, trajectory, (19 more...)

2509.18386

Country:

Asia (0.67)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.16)
North America > United States > Virginia (0.14)

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Infrastructure & Services (0.96)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)