AITopics | adaptive policy

Collaborating Authors

adaptive policy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Global Rewards in Restless Multi-Armed Bandits

Neural Information Processing SystemsDec-24-2025, 16:00:41 GMT

Restless multi-armed bandits (RMAB) extend multi-armed bandits so arm pulls impact future arm states. Despite the success of RMABs, a key limiting assumption is the separability of rewards into a sum across arms. We address this deficiency by proposing restless-multi-armed bandit with global rewards (RMAB-G), a generalization of RMABs to global non-separable rewards. To solve RMAB-G, we develop the Linear-Whittle and Shapley-Whittle indices, which extend Whittle indices from RMABs to RMAB-Gs. We prove approximation bounds which demonstrate how Linear and Shapley-Whittle indices fail for non-linear rewards. To overcome this limitation, we propose two sets of adaptive policies: the first computes indices iteratively and the second combines indices with Monte-Carlo Tree Search (MCTS). Empirically, we demonstrate that adaptive policies outperform both pre-computed index policies and baselines in synthetic and real-world food rescue datasets.

artificial intelligence, name change, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.61)

Add feedback

e75a1c8af8d9438df1057fdaa42913eb-Paper-Conference.pdf

Neural Information Processing SystemsNov-17-2025, 03:41:01 GMT

adaptive policy, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Fast Adaptive Non-Monotone Submodular Maximization Subject to a Knapsack Constraint

Neural Information Processing SystemsNov-15-2025, 07:01:57 GMT

The massive instances occurring in modern-day applications can render existing algorithms prohibitively slow.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
(19 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Bridging Theory and Practice: A Stochastic Learning-Optimization Model for Resilient Automotive Supply Chains

Shahnawaz, Muhammad, Safder, Adeel

arXiv.org Machine LearningNov-11-2025

Supply chain disruptions and volatile demand pose significant challenges to the UK automotive industry, which relies heavily on Just-In-Time (JIT) manufacturing. While qualitative studies highlight the potential of integrating Artificial Intelligence (AI) with traditional optimization, a formal, quantitative demonstration of this synergy is lacking. This paper introduces a novel stochastic learning-optimization framework that integrates Bayesian inference with inventory optimization for supply chain management (SCM). We model a two-echelon inventory system subject to stochastic demand and supply disruptions, comparing a traditional static optimization policy against an adaptive policy where Bayesian learning continuously updates parameter estimates to inform stochastic optimization. Our simulations over 365 periods across three operational scenarios demonstrate that the integrated approach achieves 7.4\% cost reduction in stable environments and 5.7\% improvement during supply disruptions, while revealing important limitations during sudden demand shocks due to the inherent conservatism of Bayesian updating. This work provides mathematical validation for practitioner observations and establishes a formal framework for understanding AI-driven supply chain resilience, while identifying critical boundary conditions for successful implementation.

artificial intelligence, machine learning, optimization, (17 more...)

arXiv.org Machine Learning

2511.06479

Country:

Europe > United Kingdom > Scotland > City of Glasgow > Glasgow (0.04)
Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.70)

Industry: Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.71)

Add feedback

Overfitting in Adaptive Robust Optimization

Zhu, Karl, Bertsimas, Dimitris

arXiv.org Machine LearningSep-23-2025

Adaptive robust optimization (ARO) extends static robust optimization by allowing decisions to depend on the realized uncertainty - weakly dominating static solutions within the modeled uncertainty set. However, ARO makes previous constraints that were independent of uncertainty now dependent, making it vulnerable to additional infeasibilities when realizations fall outside the uncertainty set. This phenomenon of adaptive policies being brittle is analogous to overfitting in machine learning. To mitigate against this, we propose assigning constraint-specific uncertainty set sizes, with harder constraints given stronger probabilistic guarantees. Interpreted through the overfitting lens, this acts as regularization: tighter guarantees shrink adaptive coefficients to ensure stability, while looser ones preserve useful flexibility. This view motivates a principled approach to designing uncertainty sets that balances robustness and adaptivity.

artificial intelligence, constraint, machine learning, (15 more...)

arXiv.org Machine Learning

2509.16451

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

RAPID Quantum Detection and Demodulation of Covert Communications: Breaking the Noise Limit with Solid-State Spin Sensors

Taherpour, Amirhossein, Taherpour, Abbas, Khattab, Tamer

arXiv.org Artificial IntelligenceSep-11-2025

We introduce a comprehensive framework for the detection and demodulation of covert electromagnetic signals using solid-state spin sensors. Our approach, named RAPID, is a two-stage hybrid strategy that leverages nitrogen-vacancy (NV) centers to operate below the classical noise floor employing a robust adaptive policy via imitation and distillation. We first formulate the joint detection and estimation task as a unified stochastic optimal control problem, optimizing a composite Bayesian risk objective under realistic physical constraints. The RAPID algorithm solves this by first computing a robust, non-adaptive baseline protocol grounded in the quantum Fisher information matrix (QFIM), and then using this baseline to warm-start an online, adaptive policy learned via deep reinforcement learning (Soft Actor-Critic). This method dynamically optimizes control pulses, interrogation times, and measurement bases to maximize information gain while actively suppressing non-Markovian noise and decoherence. Numerical simulations demonstrate that the protocol achieves a significant sensitivity gain over static methods, maintains high estimation precision in correlated noise environments, and, when applied to sensor arrays, enables coherent quantum beamforming that achieves Heisenberg-like scaling in precision. This work establishes a theoretically rigorous and practically viable pathway for deploying quantum sensors in security-critical applications such as electronic warfare and covert surveillance.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2509.08171

Country:

North America > United States (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
Asia > Middle East > Iran (0.04)

Genre: Research Report (0.64)

Industry: Government > Military (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

e75a1c8af8d9438df1057fdaa42913eb-Paper-Conference.pdf

Neural Information Processing SystemsAug-19-2025, 15:11:01 GMT

adaptive policy, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning

Wang, Yuanfei, Zhang, Xiaojie, Wu, Ruihai, Li, Yu, Shen, Yan, Wu, Mingdong, He, Zhaofeng, Wang, Yizhou, Dong, Hao

arXiv.org Artificial IntelligenceFeb-16-2025

Articulated object manipulation is a critical capability for robots to perform various tasks in real-world scenarios. Composed of multiple parts connected by joints, articulated objects are endowed with diverse functional mechanisms through complex relative motions. For example, a safe consists of a door, a handle, and a lock, where the door can only be opened when the latch is unlocked. The internal structure, such as the state of a lock or joint angle constraints, cannot be directly observed from visual observation. Consequently, successful manipulation of these objects requires adaptive adjustment based on trial and error rather than a one-time visual inference. However, previous datasets and simulation environments for articulated objects have primarily focused on simple manipulation mechanisms where the complete manipulation process can be inferred from the object's appearance. To enhance the diversity and complexity of adaptive manipulation mechanisms, we build a novel articulated object manipulation environment and equip it with 9 categories of objects. Based on the environment and objects, we further propose an adaptive demonstration collection and 3D visual diffusion-based imitation learning pipeline that learns the adaptive manipulation policy. The effectiveness of our designs and proposed method is validated through both simulation and real-world experiments. Our project page is available at: https://adamanip.github.io

artificial intelligence, machine learning, mechanism, (16 more...)

arXiv.org Artificial Intelligence

2502.11124

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Learning Multi-Agent Loco-Manipulation for Long-Horizon Quadrupedal Pushing

Feng, Yuming, Hong, Chuye, Niu, Yaru, Liu, Shiqi, Yang, Yuxiang, Yu, Wenhao, Zhang, Tingnan, Tan, Jie, Zhao, Ding

arXiv.org Artificial IntelligenceNov-14-2024

Recently, quadrupedal locomotion has achieved significant success, but their manipulation capabilities, particularly in handling large objects, remain limited, restricting their usefulness in demanding real-world applications such as search and rescue, construction, industrial automation, and room organization. This paper tackles the task of obstacle-aware, long-horizon pushing by multiple quadrupedal robots. We propose a hierarchical multi-agent reinforcement learning framework with three levels of control. The high-level controller integrates an RRT planner and a centralized adaptive policy to generate subgoals, while the mid-level controller uses a decentralized goal-conditioned policy to guide the robots toward these sub-goals. A pre-trained low-level locomotion policy executes the movement commands. We evaluate our method against several baselines in simulation, demonstrating significant improvements over baseline approaches, with 36.0% higher success rates and 24.5% reduction in completion time than the best baseline. Our framework successfully enables long-horizon, obstacle-aware manipulation tasks like Push-Cuboid and Push-T on Go1 robots in the real world.

artificial intelligence, machine learning, robot, (16 more...)

arXiv.org Artificial Intelligence

2411.07104

Country: Europe > Netherlands > South Holland > Delft (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback