AITopics

2509.13735

Country: Asia > China (0.28)

Genre:

Research Report > Promising Solution (0.54)
Research Report > New Finding (0.34)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

arXiv.org Artificial IntelligenceSep-18-2025

Cooperative Target Detection with AUVs: A Dual-Timescale Hierarchical MARDL Approach

Xueyao, Zhang, Bo, Yang, Zhiwen, Yu, Xuelin, Cao, Alexandropoulos, George C., Debbah, Merouane, Yuen, Chau

Autonomous Underwater Vehicles (AUVs) have shown great potential for cooperative detection and reconnaissance. However, collaborative AUV communications introduce risks of exposure. In adversarial environments, achieving efficient collaboration while ensuring covert operations becomes a key challenge for underwater cooperative missions. In this paper, we propose a novel dual time-scale Hierarchical Multi-Agent Proximal Policy Optimization (H-MAPPO) framework. The high-level component determines the individuals participating in the task based on a central AUV, while the low-level component reduces exposure probabilities through power and trajectory control by the participating AUVs. Simulation results show that the proposed framework achieves rapid convergence, outperforms benchmark algorithms in terms of performance, and maximizes long-term cooperative efficiency while ensuring covert operations.

evolutionary algorithm, machine learning, reinforcement learning, (18 more...)

2509.13381

Country:

Asia > China (0.47)
Asia > Middle East > UAE (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
(2 more...)

Luong, Phung Duc, Bao, Le Tran Gia, Tam, Nguyen Vu Khai, Khoa, Dong Huu Nguyen, Quyen, Nguyen Huu, Pham, Van-Hau, Duy, Phan The

xOffense: An AI-driven autonomous penetration testing framework with offensive knowledge-enhanced LLMs and multi agent systems

This work introduces xOffense, an AI-driven, multi-agent penetration testing framework that shifts the process from labor-intensive, expert-driven manual efforts to fully automated, machine-executable workflows capable of scaling seamlessly with computational infrastructure. At its core, xOffense leverages a fine-tuned, mid-scale open-source LLM (Qwen3-32B) to drive reasoning and decision-making in penetration testing. The framework assigns specialized agents to reconnaissance, vulnerability scanning, and exploitation, with an orchestration layer ensuring seamless coordination across phases. Fine-tuning on Chain-of-Thought penetration testing data further enables the model to generate precise tool commands and perform consistent multi-step reasoning. We evaluate xOffense on two rigorous benchmarks: AutoPenBench and AI-Pentest-Benchmark. The results demonstrate that xOffense consistently outperforms contemporary methods, achieving a sub-task completion rate of 79.17%, decisively surpassing leading systems such as VulnBot and PentestGPT. These findings highlight the potential of domain-adapted mid-scale LLMs, when embedded within structured multi-agent orchestration, to deliver superior, cost-efficient, and reproducible solutions for autonomous penetration testing.

large language model, machine learning, natural language, (20 more...)

2509.13021

Country: Asia > Vietnam (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.68)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

The LLM Already Knows: Estimating LLM-Perceived Question Difficulty via Hidden Representations

Zhu, Yubo, Liu, Dongrui, Lin, Zecheng, Tong, Wei, Zhong, Sheng, Shao, Jing

Estimating the difficulty of input questions as perceived by large language models (LLMs) is essential for accurate performance evaluation and adaptive inference. Existing methods typically rely on repeated response sampling, auxiliary models, or fine-tuning the target model itself, which may incur substantial computational costs or compromise generality. In this paper, we propose a novel approach for difficulty estimation that leverages only the hidden representations produced by the target LLM. We model the token-level generation process as a Markov chain and define a value function to estimate the expected output quality given any hidden state. This allows for efficient and accurate difficulty estimation based solely on the initial hidden state, without generating any output tokens. Extensive experiments across both textual and multimodal tasks demonstrate that our method consistently outperforms existing baselines in difficulty estimation. Moreover, we apply our difficulty estimates to guide adaptive reasoning strategies, including Self-Consistency, Best-of-N, and Self-Refine, achieving higher inference efficiency with fewer generated tokens.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

2509.12886

Country: Asia > China (0.28)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Contrastive Representation Learning for Robust Sim-to-Real Transfer of Adaptive Humanoid Locomotion

Lu, Yidan, Yang, Rurui, Kou, Qiran, Chen, Mengting, Fan, Tao, Cui, Peter, Dong, Yinzhao, Lu, Peng

Abstract-- Reinforcement learning has produced remarkable advances in humanoid locomotion, yet a fundamental dilemma persists for real-world deployment: policies must choose between the robustness of reactive proprioceptive control or the proactivity of complex, fragile perception-driven systems. Our core contribution is a contrastive learning framework that compels the actor's latent state to encode privileged environmental information from simulation. Crucially, this "distilled awareness" empowers an adaptive gait clock, allowing the policy to proactively adjust its rhythm based on an inferred understanding of the terrain. This synergy resolves the classic trade-off between rigid, clocked gaits and unstable clock-free policies. I. INTRODUCTION Achieving stable and adaptive locomotion in unstructured environments is a grand challenge for humanoid robotics. While Deep Reinforcement Learning (DRL) has become a cornerstone for synthesizing such behaviors, a fundamental information gap complicates real-world deployment.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

2509.12858

Country: Asia > China (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

H$^2$R: Hierarchical Hindsight Reflection for Multi-Task LLM Agents

Ye, Shicheng, Yu, Chao, Ke, Kaiqiang, Xu, Chengdong, Wei, Yinqi

Large language model (LLM)-based agents have shown strong potential in multi-task scenarios, owing to their ability to transfer knowledge across diverse tasks. However, existing approaches often treat prior experiences and knowledge as monolithic units, leading to inefficient and coarse-grained knowledge transfer. In this work, we propose a novel hierarchical memory architecture that enables fine-grained knowledge transfer by decoupling high-level planning memory from low-level execution memory. To construct and refine these hierarchical memories, we introduce Hierarchical Hindsight Reflection (H$^2$R), a mechanism that distills reusable and hierarchical knowledge from past agent-environment interactions. At test time, H$^2$R performs retrievals of high-level and low-level memories separately, allowing LLM-based agents to efficiently access and utilize task-relevant knowledge for new tasks.Experimental results across two benchmarks demonstrate that H$^2$R can improve generalization and decision-making performance, outperforming prior baselines such as Expel.

large language model, machine learning, natural language, (15 more...)

2509.1281

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Hao, Alexis Yihong, Wang, Yufei, Ravie, Navin Sriram, Hegde, Bharath, Held, David, Erickson, Zackory

Force-Modulated Visual Policy for Robot-Assisted Dressing with Arm Motions

Robot-assisted dressing has the potential to significantly improve the lives of individuals with mobility impairments. To ensure an effective and comfortable dressing experience, the robot must be able to handle challenging deformable garments, apply appropriate forces, and adapt to limb movements throughout the dressing process. Prior work often makes simplifying assumptions -- such as static human limbs during dressing -- which limits real-world applicability. In this work, we develop a robot-assisted dressing system capable of handling partial observations with visual occlusions, as well as robustly adapting to arm motions during the dressing process. Given a policy trained in simulation with partial observations, we propose a method to fine-tune it in the real world using a small amount of data and multi-modal feedback from vision and force sensing, to further improve the policy's adaptability to arm motions and enhance safety. We evaluate our method in simulation with simplified articulated human meshes and in a real world human study with 12 participants across 264 dressing trials. Our policy successfully dresses two long-sleeve everyday garments onto the participants while being adaptive to various kinds of arm motions, and greatly outperforms prior baselines in terms of task completion and user feedback. Video are available at https://dressing-motion.github.io/.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2509.12741

Country: North America > United States (0.93)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Jones, Scott, Zhou, Liyou, Pattinson, Sebastian W.

Pre-trained Visual Representations Generalize Where it Matters in Model-Based Reinforcement Learning

In visuomotor policy learning, the control policy for the robotic agent is derived directly from visual inputs. The typical approach, where a policy and vision encoder are trained jointly from scratch, generalizes poorly to novel visual scene changes. Using pre-trained vision models (PVMs) to inform a policy network improves robustness in model-free reinforcement learning (MFRL). Recent developments in Model-based reinforcement learning (MBRL) suggest that MBRL is more sample-efficient than MFRL. However, counterintuitively, existing work has found PVMs to be ineffective in MBRL. Here, we investigate PVM's effectiveness in MBRL, specifically on generalization under visual domain shifts. We show that, in scenarios with severe shifts, PVMs perform much better than a baseline model trained from scratch. We further investigate the effects of varying levels of fine-tuning of PVMs. Our results show that partial fine-tuning can maintain the highest average task performance under the most extreme distribution shifts. Our results demonstrate that PVMs are highly successful in promoting robustness in visual policy learning, providing compelling evidence for their wider adoption in model-based robotic learning applications.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2509.12531

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

arXiv.org Artificial IntelligenceSep-16-2025

Approaches to Analysis and Design of AI-Based Autonomous Vehicles

Yan, Tao, Zhang, Zheyu, Jiang, Jingjing, Chen, Wen-Hua

Artificial intelligence (AI) models are becoming key components in an autonomous vehicle (AV), especially in handling complicated perception tasks. However, closing the loop through AI-based feedback may pose significant risks on reliability of autonomous driving due to very limited understanding about the mechanism of AI-driven perception processes. To overcome it, this paper aims to develop tools for modeling, analysis, and synthesis for a class of AI-based AV; in particular, their closed-loop properties, e.g., stability, robustness, and performance, are rigorously studied in the statistical sense. First, we provide a novel modeling means for the AI-driven perception processes by looking at their error characteristics. Specifically, three fundamental AI-induced perception uncertainties are recognized and modeled by Markov chains, Gaussian processes, and bounded disturbances, respectively. By means of that, the closed-loop stochastic stability (SS) is established in the sense of mean square, and then, an SS control synthesis method is presented within the framework of linear matrix inequalities (LMIs). Besides the SS properties, the robustness and performance of AI-based AVs are discussed in terms of a stochastic guaranteed cost, and criteria are given to test the robustness level of an AV when in the presence of AI-induced uncertainties. Furthermore, the stochastic optimal guaranteed cost control is investigated, and an efficient design procedure is developed innovatively based on LMI techniques and convex optimization. Finally, to illustrate the effectiveness, the developed results are applied to an example of car following control, along with extensive simulation.

artificial intelligence, machine learning, vehicle, (15 more...)

2509.12169

Genre: Research Report (0.64)

Industry:

Automobiles & Trucks (0.68)
Energy > Renewable > Geothermal (0.56)
Transportation > Ground > Road (0.36)
Information Technology > Robotics & Automation (0.36)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Liang, Youzhi, Noronha, Eyan

MEMBOT: Memory-Based Robot in Intermittent POMDP

arXiv.org Artificial IntelligenceSep-16-2025

Robotic systems deployed in real-world environments often operate under conditions of partial and often intermittent observability, where sensor inputs may be noisy, occluded, or entirely unavailable due to failures or environmental constraints. Traditional reinforcement learning (RL) approaches that assume full state observability are ill-equipped for such challenges. In this work, we introduce MEMBOT, a modular memory-based architecture designed to address intermittent partial observability in robotic control tasks. MEMBOT decouples belief inference from policy learning through a two-phase training process: an offline multi-task learning pretraining stage that learns a robust task-agnostic latent belief encoder using a reconstruction losses, followed by fine-tuning of task-specific policies using behavior cloning. The belief encoder, implemented as a state-space model (SSM) and a LSTM, integrates temporal sequences of observations and actions to infer latent state representations that persist even when observations are dropped. We train and evaluate MEMBOT on 10 robotic manipulation benchmark tasks from MetaWorld and Robomimic under varying rates of observation dropout. Results show that MEMBOT consistently outperforms both memoryless and naively recurrent baselines, maintaining up to 80% of peak performance under 50% observation availability. These findings highlight the effectiveness of explicit belief modeling in achieving robust, transferable, and data-efficient policies for real-world partially observable robotic systems.

artificial intelligence, machine learning, membot, (17 more...)

2509.11225

Genre: Research Report > New Finding (0.48)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)