AITopics

2209.09624

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

arXiv.org Artificial Intelligence, Sep-17-2022

Sub-optimal Policy Aided Multi-Agent Reinforcement Learning for Flocking Control

Qiu, Yunbo, Jin, Yue, Wang, Jian, Zhang, Xudong

Flocking control is a challenging problem, where multiple agents, such as drones or vehicles, need to reach a target position while maintaining the flock and avoiding collisions with obstacles and collisions among agents in the environment. Multi-agent reinforcement learning has achieved promising performance in flocking control. However, methods based on traditional reinforcement learning require a considerable number of interactions between agents and the environment. This paper proposes a sub-optimal policy aided multi-agent reinforcement learning algorithm (SPA-MARL) to boost sample efficiency. SPA-MARL directly leverages a prior policy that can be manually designed or solved with a non-learning method to aid agents in learning, where the performance of the policy can be sub-optimal. SPA-MARL recognizes the difference in performance between the sub-optimal policy and itself, and then imitates the sub-optimal policy if the sub-optimal policy is better. We leverage SPA-MARL to solve the flocking control problem. A traditional control method based on artificial potential fields is used to generate a sub-optimal policy. Experiments demonstrate that SPA-MARL can speed up the training process and outperform both the MARL baseline and the used sub-optimal policy.
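For readers unfamiliar with artificial potential fields, the following is a minimal sketch of the classic attractive/repulsive formulation in 2D. It is a generic textbook version, not the controller used in the paper; the function name `apf_force` and the parameters `k_att`, `k_rep`, and `d0` (the repulsion cutoff distance) are assumptions made for illustration.

```python
import math


def apf_force(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Return the 2D force on an agent at `pos` under a potential field.

    Hypothetical helper: attraction grows linearly with distance to `goal`;
    each obstacle closer than `d0` adds a repulsive term pointing away from it.
    """
    # Attractive term: pulls the agent straight toward the goal.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        # Repulsion only acts inside the cutoff radius d0.
        if 0.0 < d < d0:
            magnitude = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += magnitude * dx / d
            fy += magnitude * dy / d
    return fx, fy


if __name__ == "__main__":
    # One obstacle sits close to the direct path to the goal.
    print(apf_force((0.0, 0.0), (3.0, 4.0), [(1.0, 0.0)]))
```

A controller of this kind can get stuck in local minima, which is one reason such a policy is typically only sub-optimal.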

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2209.08347

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, Sep-17-2022

Sample-Efficient Multi-Agent Reinforcement Learning with Demonstrations for Flocking Control

Qiu, Yunbo, Zhan, Yuzhu, Jin, Yue, Wang, Jian, Zhang, Xudong

Flocking control is a significant problem in multi-agent systems such as multi-agent unmanned aerial vehicles and multi-agent autonomous underwater vehicles, which enhances the cooperativity and safety of agents. In contrast to traditional methods, multi-agent reinforcement learning (MARL) solves the problem of flocking control more flexibly. However, methods based on MARL suffer from sample inefficiency, since they require a huge number of experiences to be collected from interactions between agents and the environment. We propose a novel method, Pretraining with Demonstrations for MARL (PwD-MARL), which can utilize non-expert demonstrations collected in advance with traditional methods to pretrain agents. During pretraining, agents learn policies from demonstrations by MARL and behavior cloning simultaneously, and are prevented from overfitting to the demonstrations. By pretraining with non-expert demonstrations, PwD-MARL improves sample efficiency in the process of online MARL with a warm start. Experiments show that PwD-MARL improves sample efficiency and policy performance in the problem of flocking control, even with bad or few demonstrations.

demonstration, machine learning, reinforcement learning, (15 more...)

2209.08351

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (1.00)

Industry: Aerospace & Defense > Aircraft (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.34)

#artificialintelligence, Sep-16-2022, 14:17:18 GMT

GitHub - xavierpuigf/virtualhome: API to run VirtualHome, a Multi-Agent Household Simulator

VirtualHome is an interactive platform to simulate complex household activities via programs. A key aspect of VirtualHome is that it allows complex interactions with the environment, such as picking up objects, switching appliances on and off, opening appliances, etc. The simulator can easily be called through a Python API: write the activity as a simple sequence of instructions, which then gets rendered in VirtualHome. You can choose between different agents and environments, as well as modify environments on the fly. The platform supports simulating multi-agent activities and can serve as an environment to train agents for embodied AI tasks.

graph, simulator, virtualhome, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Navsalkar, Atharva, Hota, Ashish R.

Data-Driven Risk-sensitive Model Predictive Control for Safe Navigation in Multi-Robot Systems

Safe navigation is a fundamental challenge in multi-robot systems due to the uncertainty surrounding the future trajectory of the robots that act as obstacles for each other. In this work, we propose a principled data-driven approach where each robot repeatedly solves a finite horizon optimization problem subject to collision avoidance constraints, with the latter being formulated as distributionally robust conditional value-at-risk (CVaR) of the distance between the agent and a polyhedral obstacle geometry. Specifically, the CVaR constraints are required to hold for all distributions that are close to the empirical distribution constructed from observed samples of prediction error collected during execution. The generality of the approach allows us to robustify against prediction errors that arise under commonly imposed assumptions in both distributed and decentralized settings. We derive tractable finite-dimensional approximations of this class of constraints by leveraging convex and minmax duality results for Wasserstein distributionally robust optimization problems. The effectiveness of the proposed approach is illustrated in a multi-drone navigation setting implemented in the Gazebo platform.
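As background on the risk measure named above, the following is a minimal sketch of plain empirical CVaR computed from samples with the Rockafellar-Uryasev formula, CVaR_alpha(L) = min_t { t + E[(L - t)^+] / (1 - alpha) }. It covers only the sample-based quantity and not the Wasserstein distributionally robust constraint the paper derives; the function name `empirical_cvar` and the example losses are illustrative assumptions.

```python
def empirical_cvar(losses, alpha):
    """Empirical CVaR of `losses` at confidence level `alpha` in [0, 1).

    Hypothetical helper. The Rockafellar-Uryasev objective is convex and
    piecewise linear in t with kinks at the samples, so its minimum is
    attained at one of the sample values.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if not losses:
        raise ValueError("losses must be non-empty")
    n = len(losses)

    def objective(t):
        excess = sum(max(x - t, 0.0) for x in losses)
        return t + excess / ((1.0 - alpha) * n)

    return min(objective(t) for t in losses)


if __name__ == "__main__":
    # Illustrative losses, e.g. a safety margin minus an observed distance.
    samples = [1.0, 2.0, 3.0, 4.0]
    print(empirical_cvar(samples, 0.5))  # mean of the worst half of the samples
```

With `alpha = 0` the value reduces to the sample mean, and as `alpha` approaches 1 it approaches the worst observed loss.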

artificial intelligence, constraint, optimization problem, (13 more...)

2209.07793

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Mirzakhani, Golnaz, Ghanbari-Adivi, Elham, Fattahi, Rohollah, Ehteram, Mohammad, Mosavi, Amir, Ahmed, Ali Najah, El-Shafieg, Ahmed

Application of Group Method of Data Handling and New Optimization Algorithms for Predicting Sediment Transport Rate under Vegetation Cover

Planting vegetation is one of the practical solutions for reducing sediment transfer rates. Increasing vegetation cover decreases environmental pollution and sediment transport rate (STR). Since sediments and vegetation interact complexly, predicting sediment transport rates is challenging. This study aims to predict sediment transport rate under vegetation cover using new and optimized versions of the group method of data handling (GMDH). Additionally, this study introduces a new ensemble model for predicting sediment transport rates. Model inputs include wave height, wave velocity, density cover, wave force, D50, the height of vegetation cover, and cover stem diameter. A standalone GMDH model and optimized GMDH models, including GMDH-honey badger algorithm (GMDH-HBA), GMDH-rat swarm algorithm (GMDH-RSOA), GMDH-sine cosine algorithm (GMDH-SCA), and GMDH-particle swarm optimization (GMDH-PSO), were used to predict sediment transport rates. As the next step, the outputs of the standalone and optimized GMDH models were used to construct an ensemble model. The MAE of the ensemble model was 0.145 m3/s, while the MAEs of GMDH-HBA, GMDH-RSOA, GMDH-SCA, GMDH-PSOA, and GMDH at the testing level were 0.176 m3/s, 0.312 m3/s, 0.367 m3/s, 0.498 m3/s, and 0.612 m3/s, respectively. The Nash-Sutcliffe efficiency (NSE) values of the ensemble model, GMDH-HBA, GMDH-RSOA, GMDH-SCA, GMDH-PSOA, and GMDH were 0.95, 0.93, 0.89, 0.86, 0.82, and 0.76, respectively. Additionally, this study demonstrated that vegetation cover decreased sediment transport rate by 90 percent. The results indicated that the ensemble and GMDH-HBA models could accurately predict sediment transport rates. Based on the results of this study, sediment transport rate can be monitored using the IMM and GMDH-HBA. These results are useful for managing and planning water resources in large basins.
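The two error measures reported above have standard definitions, sketched here for reference: MAE is the mean of |observed - predicted|, and NSE is 1 minus the ratio of the squared prediction error to the squared deviation of the observations from their mean. The function names and the sample values are illustrative assumptions, not data from the study.

```python
def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean predictor."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total_variation = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / total_variation


if __name__ == "__main__":
    # Made-up values for illustration only.
    obs = [1.0, 2.0, 3.0, 4.0]
    pred = [1.5, 2.5, 3.5, 4.5]
    print(mae(obs, pred), nse(obs, pred))
```

A lower MAE and an NSE closer to 1 both indicate a better fit, which is the ordering the abstract reports across the models.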

artificial intelligence, evolutionary algorithm, machine learning, (14 more...)

2209.09623

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Yang, Li, Shami, Abdallah

IoT Data Analytics in Dynamic Environments: From An Automated Machine Learning Perspective

With the wide spread of sensors and smart devices in recent years, the data generation speed of the Internet of Things (IoT) systems has increased dramatically. In IoT systems, massive volumes of data must be processed, transformed, and analyzed on a frequent basis to enable various IoT services and functionalities. Machine Learning (ML) approaches have shown their capacity for IoT data analytics. However, applying ML models to IoT data analytics tasks still faces many difficulties and challenges, specifically, effective model selection, design/tuning, and updating, which have brought massive demand for experienced data scientists. Additionally, the dynamic nature of IoT data may introduce concept drift issues, causing model performance degradation. To reduce human efforts, Automated Machine Learning (AutoML) has become a popular field that aims to automatically select, construct, tune, and update machine learning models to achieve the best performance on specified tasks. In this paper, we conduct a review of existing methods in the model selection, tuning, and updating procedures in the area of AutoML in order to identify and summarize the optimal solutions for every step of applying ML algorithms to IoT data analytics. To justify our findings and help industrial users and researchers better implement AutoML approaches, a case study of applying AutoML to IoT anomaly detection problems is conducted in this work. Lastly, we discuss and classify the challenges and research directions for this domain.

evolutionary algorithm, machine learning, reinforcement learning, (18 more...)

doi: 10.1016/j.engappai.2022.105366

2209.08018

Country:

North America > Canada > Ontario > Middlesex County > London (0.27)
North America > United States > New York > New York County > New York City (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
(10 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.87)

Industry:

Information Technology > Smart Houses & Appliances (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(4 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(8 more...)

Evolutionary Action Selection for Gradient-based Policy Learning

Ma, Yan, Liu, Tianxing, Wei, Bingsheng, Liu, Yi, Xu, Kang, Li, Wei

Evolutionary Algorithms (EAs) and Deep Reinforcement Learning (DRL) have recently been integrated to take advantage of both methods for better exploration and exploitation. The evolutionary part in these hybrid methods maintains a population of policy networks. However, existing methods focus on optimizing the parameters of the policy network, which are usually high-dimensional and tricky for EAs. In this paper, we shift the target of evolution from the high-dimensional parameter space to the low-dimensional action space. We propose Evolutionary Action Selection-Twin Delayed Deep Deterministic Policy Gradient (EAS-TD3), a novel hybrid method of EA and DRL. In EAS, we focus on optimizing the action chosen by the policy network and attempt to obtain high-quality actions to promote policy learning through an evolutionary algorithm. We conduct several experiments on challenging continuous control tasks. The results show that EAS-TD3 achieves superior performance over other state-of-the-art methods.

evolutionary algorithm, machine learning, reinforcement learning, (13 more...)

2201.04286

Country:

Asia > China > Shanghai > Shanghai (0.05)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

ASIR: Robust Agent-based Representation Of SIR Model

Xu, Boyan

But the literature lacks discussion of how to build the quantitative relationship between them. In this paper, we propose an agent-based SIR model: ASIR. ASIR can robustly reproduce the infection curve predicted by a given SIR model (the simplest CM). Notably, one can deduce any parameter of ASIR from the parameters of SIR without manual tuning. ASIR offers epidemiologists a method to transform a calibrated SIR model into an agent-based model that inherits SIR's performance without another round of calibration. The design of ASIR is inspirational for building a general quantitative relationship between CM and AM.
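For context, the infection curve that an agent-based model would need to reproduce comes from the standard SIR equations dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I. The following is a minimal forward-Euler sketch of that compartmental model only; it is not ASIR, and the function name, step size, and parameter values are illustrative assumptions.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt=0.1, steps=1000):
    """Integrate the standard SIR model with forward Euler steps.

    Hypothetical helper. Returns a list of (S, I, R) tuples of length
    steps + 1, starting with the initial condition.
    """
    n = s0 + i0 + r0  # total population, conserved by the dynamics
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Made-up parameters: basic reproduction number beta / gamma = 3.
    curve = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0)
    peak_infected = max(i for _, i, _ in curve)
    print(round(peak_infected, 1))
```

The sequence of I values in the returned history is the infection curve; a smaller `dt` gives a closer approximation to the continuous model.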

agent, artificial intelligence, asir, (14 more...)

2209.08214

Country:

Asia > China > Shanghai > Shanghai (0.05)
Asia > India (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine > Epidemiology (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Truthful Generalized Linear Models

Qiu, Yuan, Liu, Jinyan, Wang, Di

In this paper we study estimating Generalized Linear Models (GLMs) in the case where the agents (individuals) are strategic or self-interested and are concerned about their privacy when reporting data. Compared with the classical setting, here we aim to design mechanisms that can both incentivize most agents to truthfully report their data and preserve the privacy of individuals' reports, while their outputs should also be close to the underlying parameter. In the first part of the paper, we consider the case where the covariates are sub-Gaussian and the responses are heavy-tailed, having only finite fourth moments. First, motivated by the stationary condition of the maximizer of the likelihood function, we derive a novel private and closed-form estimator. Based on the estimator, we propose a mechanism which has the following properties via some appropriate design of the computation and payment scheme for several canonical models such as linear regression, logistic regression and Poisson regression: (1) the mechanism is $o(1)$-jointly differentially private (with probability at least $1-o(1)$); (2) it is an $o(\frac{1}{n})$-approximate Bayes Nash equilibrium for a $(1-o(1))$-fraction of agents to truthfully report their data, where $n$ is the number of agents; (3) the output could achieve an error of $o(1)$ to the underlying parameter; (4) it is individually rational for a $(1-o(1))$-fraction of agents in the mechanism; (5) the payment budget required from the analyst to run the mechanism is $o(1)$. In the second part, we consider the linear regression model under a more general setting where both covariates and responses are heavy-tailed and only have finite fourth moments. By using an $\ell_4$-norm shrinkage operator, we propose a private estimator and payment scheme which have similar properties as in the sub-Gaussian case.

agent, artificial intelligence, machine learning, (18 more...)

2209.07815

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)