AITopics

2408.12822

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

Banerjee, Ayan, Maity, Aranyak, Lamrani, Imane, Gupta, Sandeep K. S.

Operational Safety in Human-in-the-loop Human-in-the-plant Autonomous Systems

arXiv.org Artificial Intelligence, Aug-22-2024

Control-affine assumptions in certified safe controller synthesis approaches (namely, that human inputs are external disturbances) are frequently violated in operational deployment under causal human actions. This paper takes a human-in-the-loop human-in-the-plant (HIL-HIP) approach towards ensuring operational safety of safety-critical autonomous systems: the human and the real-world controller (RWC) are modeled as a unified system. A three-way interaction is considered: a) through personalized inputs and biological feedback processes between HIP and HIL, b) through sensors and actuators between RWC and HIP, and c) through personalized configuration changes and data feedback between HIL and RWC. We extend control Lyapunov theory by generating a barrier function (CLBF) under human action plans, model the HIL as a combination of a Markov chain for spontaneous events and a fuzzy inference system for event responses, model the RWC as a black box, and integrate the HIL-HIP model with neural architectures that can learn CLBF certificates. We show that the synthesized HIL-HIP controller for automated insulin delivery in Type 1 Diabetes is the only controller to meet safety requirements for human action inputs.

architecture, controller, probability, (15 more...)

2409.0378

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

arXiv.org Artificial Intelligence, Aug-21-2024

Deep Reinforcement Learning for Decentralized Multi-Robot Control: A DQN Approach to Robustness and Information Integration

Wu, Bin, Suh, C Steve

The superiority of Multi-Robot Systems (MRS) in various complex environments is unquestionable. However, in complex situations such as search and rescue, environmental monitoring, and automated production, robots are often required to work collaboratively without a central control unit. This necessitates an efficient and robust decentralized control mechanism to process local information and guide the robots' behavior. In this work, we propose a new decentralized controller design method that utilizes the Deep Q-Network (DQN) algorithm from deep reinforcement learning, aimed at improving the integration of local information and robustness of multi-robot systems. The designed controller allows each robot to make decisions independently based on its local observations while enhancing the overall system's collaborative efficiency and adaptability to dynamic environments through a shared learning mechanism. Through testing in simulated environments, we have demonstrated the effectiveness of this controller in improving task execution efficiency, strengthening system fault tolerance, and enhancing adaptability to the environment. Furthermore, we explored the impact of DQN parameter tuning on system performance, providing insights for further optimization of the controller design. Our research not only showcases the potential application of the DQN algorithm in the decentralized control of multi-robot systems but also offers a new perspective on how to enhance the overall performance and robustness of the system through the integration of local information.

information, multi-robot system, robot, (16 more...)

2408.11339

Country:

North America > United States > Texas > Brazos County > College Station (0.14)
North America > United States > Oregon > Multnomah County > Portland (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

arXiv.org Artificial Intelligence, Aug-21-2024

Bayesian Optimization Framework for Efficient Fleet Design in Autonomous Multi-Robot Exploration

Concha, David Molina, Li, Jiping, Yin, Haoran, Park, Kyeonghyeon, Lee, Hyun-Rok, Lee, Taesik, Sirohi, Dhruv, Lee, Chi-Guhn

This study addresses the challenge of fleet design optimization in the context of heterogeneous multi-robot fleets, aiming to obtain feasible designs that balance performance and costs. In the domain of autonomous multi-robot exploration, reinforcement learning agents play a central role, offering adaptability to complex terrains and facilitating collaboration among robots. However, modifying the fleet composition results in changes in the learned behavior, and training multi-robot systems using multi-agent reinforcement learning is expensive. Therefore, an exhaustive evaluation of each potential fleet design is infeasible. To tackle these hurdles, we introduce Bayesian Optimization for Fleet Design (BOFD), a framework leveraging multi-objective Bayesian Optimization to explore fleets on the Pareto front of performance and cost while accounting for uncertainty in the design space. Moreover, we establish a sub-linear bound for cumulative regret, supporting BOFD's robustness and efficacy. Extensive benchmark experiments in synthetic and simulated environments demonstrate the superiority of our framework over state-of-the-art methods, achieving efficient fleet designs with minimal fleet evaluations.
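The notion of a Pareto front over performance and cost mentioned above can be illustrated with a minimal sketch. This is not the BOFD algorithm itself, only a brute-force non-dominated filter; the function name `pareto_front`, the fleet labels, and all numeric values are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# (name, performance, cost); higher performance and lower cost are better.
Design = Tuple[str, float, float]


def pareto_front(designs: List[Design]) -> List[Design]:
    """Return the designs that no other design dominates.

    A design is dominated if another design is at least as good on both
    objectives and strictly better on at least one of them.
    """
    front = []
    for name, perf, cost in designs:
        dominated = any(
            (p >= perf and c <= cost) and (p > perf or c < cost)
            for _, p, c in designs
        )
        if not dominated:
            front.append((name, perf, cost))
    return front


# Hypothetical fleet designs (illustrative values, not from the paper).
fleets = [
    ("A", 0.90, 10.0),
    ("B", 0.80, 6.0),
    ("C", 0.70, 8.0),  # worse than B on both objectives
    ("D", 0.95, 15.0),
    ("E", 0.60, 3.0),
]

front = pareto_front(fleets)
print([name for name, _, _ in front])
```

An exhaustive filter like this requires every design to be evaluated first, which is exactly the expense the abstract says is infeasible when each evaluation involves multi-agent reinforcement learning training.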

exploration, fleet design, robot, (15 more...)

2408.11751

Country:

North America > Canada > Ontario > Toronto (0.14)
South America > Chile (0.04)
Oceania > Australia > South Australia > Adelaide (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

Mondal, Washim Uddin, Aggarwal, Vaneet

Last-Iterate Convergence of General Parameterized Policies in Constrained MDPs

arXiv.org Artificial Intelligence, Aug-21-2024

The Constrained Markov Decision Process (CMDP) is a classical framework where an agent repeatedly interacts with an unknown environment to maximize the cumulative discounted rewards while simultaneously ensuring that the cumulative observed costs are within a pre-defined boundary. It finds application in a multitude of practical scenarios. For example, consider an autonomous vehicle that attempts to reach its destination via the shortest-time route without violating traffic rules, or a corporate leader who aims to maximize revenue without exceeding a monetary budget. In these cases, any departure from the boundary set by the predefined rules can be signaled by a cost, while the progress towards the desired objective can be indicated by a reward. Finding an optimal policy to navigate an unknown CMDP is a difficult task. Nevertheless, several recent articles have proposed algorithms to solve this challenging problem with optimality guarantees.
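The constrained objective described above is commonly written as follows. This is a standard textbook form and not necessarily the paper's own notation: the symbols \pi (policy), \gamma (discount factor), r (reward), c (cost), and b (cost budget) are assumptions introduced here, as is the choice to discount the costs in the same way as the rewards.

```latex
\max_{\pi} \; J_r(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{subject to} \quad
J_c(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\right] \le b
```

In the autonomous vehicle example, r would reward progress towards the destination and c would register traffic-rule violations, with b bounding how much violation is tolerated.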

constraint violation, inequality, sample complexity, (10 more...)

2408.11513

Country:

North America > United States (0.04)
Europe > Montenegro (0.04)
Asia > Middle East > Jordan (0.04)
Asia > India > Uttar Pradesh > Kanpur (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks

Qu, Yun, Wang, Boyuan, Shao, Jianzhun, Jiang, Yuhang, Chen, Chen, Ye, Zhenbin, Liu, Lin, Yang, Junfeng, Lai, Lin, Qin, Hongyang, Deng, Minwen, Zhuo, Juchao, Ye, Deheng, Fu, Qiang, Yang, Wei, Yang, Guang, Huang, Lanxiao, Ji, Xiangyang

The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications. However, existing datasets often fall short because of their simplicity and lack of realism. To address this gap, we propose Hokoff, a comprehensive set of pre-collected datasets that covers both offline RL and offline MARL, accompanied by a robust framework, to facilitate further research. This data is derived from Honor of Kings, a recognized Multiplayer Online Battle Arena (MOBA) game known for its intricate nature, closely resembling real-life situations. Utilizing this framework, we benchmark a variety of offline RL and offline MARL algorithms. We also introduce a novel baseline algorithm tailored for the inherent hierarchical action space of the game. We reveal the inability of current offline RL approaches to handle task complexity, generalization and multi-task learning.

dataset, hok3v3, reinforcement learning, (10 more...)

2408.10556

Genre: Research Report (1.00)

Industry:

Law (0.67)
Leisure & Entertainment > Games > Computer Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Collis, Poppy, Singh, Ryan, Kinghorn, Paul F, Buckley, Christopher L

Hybrid Recurrent Models Support Emergent Descriptions for Hierarchical Planning and Control

An open problem in artificial intelligence is how systems can flexibly learn discrete abstractions that are useful for solving inherently continuous problems. Previous work has demonstrated that a class of hybrid state-space models known as recurrent switching linear dynamical systems (rSLDS) discovers meaningful behavioural units via the piecewise linear decomposition of complex continuous dynamics (Linderman et al., 2016). Furthermore, they model how the underlying continuous states drive these discrete mode switches. We propose that the rich representations formed by an rSLDS can provide useful abstractions for planning and control. We present a novel hierarchical model-based algorithm inspired by Active Inference in which a discrete MDP sits above a low-level linear-quadratic controller. The recurrent transition dynamics learned by the rSLDS allow us to (1) specify temporally-abstracted sub-goals in a method reminiscent of the options framework, (2) lift the exploration into discrete space, allowing us to exploit information-theoretic exploration bonuses, and (3) 'cache' the approximate solutions to low-level problems in the discrete planner. We successfully apply our model to the sparse Continuous Mountain Car task, demonstrating fast system identification via enhanced exploration and non-trivial planning through the delineation of abstract sub-goals.

controller, hierarchical planning, recurrent model support emergent description, (11 more...)

2408.1097

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > East Sussex > Brighton (0.04)

Genre: Research Report (0.50)

Industry: Energy (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Pippas, Nikolaos, Turkay, Cagatay, Ludvig, Elliot A.

The Evolution of Reinforcement Learning in Quantitative Finance

Reinforcement Learning (RL) has experienced significant advancement over the past decade, prompting a growing interest in applications within finance. This survey critically evaluates 167 publications, exploring diverse RL applications and frameworks in finance. Financial markets, marked by their complexity, multi-agent nature, information asymmetry, and inherent randomness, serve as an intriguing test-bed for RL. Traditional finance offers certain solutions, and RL advances these with a more dynamic approach, incorporating machine learning methods, including transfer learning, meta-learning, and multi-agent solutions. This survey dissects key RL components through the lens of Quantitative Finance. We uncover emerging themes, propose areas for future research, and critique the strengths and weaknesses of existing methods.

agent, application, reinforcement, (12 more...)

2408.10932

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.27)
Europe > United Kingdom > England > West Midlands > Coventry (0.04)
North America > United States > New York (0.04)
(11 more...)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material (0.92)

Industry:

Information Technology (1.00)
Banking & Finance > Trading (1.00)
Banking & Finance > Economy (1.00)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Klusch, Matthias, Lässig, Jörg, Müssig, Daniel, Macaluso, Antonio, Wilhelm, Frank K.

Quantum Artificial Intelligence: A Brief Survey

Quantum Artificial Intelligence (QAI) is the intersection of quantum computing and AI, a technological synergy with expected significant benefits for both. In this paper, we provide a brief overview of what has been achieved in QAI so far and point to some open questions for future research. In particular, we summarize key findings on the feasibility and potential of using quantum computing to solve computationally hard problems in various subfields of AI and, conversely, of leveraging AI methods for building and operating quantum computing devices.

algorithm, quantum, quantum computing, (11 more...)

2408.10726

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Saarland > Saarbrücken (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)
Transportation (0.93)
Energy > Power Industry (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(10 more...)

Tod, Georges, Bruggeman, Jean, Bevernage, Evert, Moelans, Pieter, Eeckhout, Walter, Glineur, Jean-Luc

Augmenting train maintenance technicians with automated incident diagnostic suggestions

arXiv.org Machine Learning, Aug-19-2024

Train operational incidents have so far been diagnosed individually and manually by train maintenance technicians. To assist maintenance crews with responsiveness and task prioritization, a learning machine is developed and deployed in production to suggest diagnostics to train technicians on their phones, tablets or laptops as soon as a train incident is declared. A feedback loop takes into account the actual diagnosis made by designated train maintenance experts to refine the learning machine. By formulating the problem as a discrete set classification task, feature engineering methods are proposed to extract physically plausible sets of events from traces generated on board railway vehicles. These sets feed an original ensemble classifier that classifies incidents by their potential technical cause. Finally, the resulting model is trained and validated using real operational data and deployed on a cloud platform. Future work will explore how the extracted sets of events can be used to avoid incidents by assisting human experts in the creation of predictive maintenance alerts.

classifier, incident, train maintenance technician, (14 more...)

2408.10288

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Spain > Galicia > A Coruña Province > Santiago de Compostela (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Rail (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)