AITopics

2303.00141

Country:

North America > United States > Pennsylvania (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > Middle East > Jordan (0.04)
(6 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.88)
(3 more...)

arXiv.org Artificial Intelligence, Mar-23-2023

Stochastic Graph Neural Network-based Value Decomposition for MARL in Internet of Vehicles

Xiao, Baidi, Li, Rongpeng, Wang, Fei, Peng, Chenghui, Wu, Jianjun, Zhao, Zhifeng, Zhang, Honggang

Autonomous driving has witnessed incredible advances in the past several decades, while Multi-Agent Reinforcement Learning (MARL) promises to satisfy the essential need of autonomous vehicle control in wireless connected vehicle networks. In MARL, how to effectively decompose a global feedback into the relative contributions of individual agents is one of the most fundamental problems. However, the environment volatility due to vehicle movement and wireless disturbance could significantly shape time-varying topological relationships among agents, thus making the Value Decomposition (VD) challenging. Therefore, in order to cope with this volatility, it becomes imperative to design a dynamic VD framework. Hence, in this paper, we propose a novel Stochastic VMIX (SVMIX) methodology by taking account of dynamic topological features during the VD and incorporating the corresponding components into a multi-agent actor-critic architecture. In particular, a Stochastic Graph Neural Network (SGNN) is leveraged to effectively capture underlying dynamics in topological features and improve the flexibility of VD against the environment volatility. Finally, the superiority of SVMIX is verified through extensive simulations.
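
As a rough illustration of value decomposition in general (not of SVMIX or its SGNN component, whose details are not given in this abstract), the sketch below mixes per-agent values into one global value using non-negative weights, so the global value never decreases when a single agent's value increases. The function name `mix_values` and all numbers are hypothetical.

```python
def mix_values(agent_values, weights, bias=0.0):
    """Combine per-agent values into one global value.

    Taking abs() of each weight keeps the mixing monotonic: raising any
    single agent's value can never lower the global value.
    """
    if len(agent_values) != len(weights):
        raise ValueError("need exactly one weight per agent")
    return bias + sum(abs(w) * v for w, v in zip(weights, agent_values))


# Hypothetical values for three vehicles (agents).
values = [1.0, 2.0, 0.5]
weights = [0.5, -0.25, 1.0]  # the sign is discarded by abs()
v_total = mix_values(values, weights)
```

A learned mixer would compute the weights from the current state or graph rather than fixing them by hand; that part is left out here.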

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2303.13213

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York (0.04)
North America > Canada > Quebec > Montreal (0.04)
(8 more...)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Nazari, Farhad, Mohajer, Navid, Nahavandi, Darius, Khosravi, Abbas, Nahavandi, Saeid

Applied Exoskeleton Technology: A Comprehensive Review of Physical and Cognitive Human-Robot Interaction

Exoskeletons and orthoses are wearable mobile systems providing mechanical benefits to the users. Despite significant improvements in the last decades, the technology is not fully mature to be adopted for strenuous and non-programmed tasks. To accommodate this insufficiency, different aspects of this technology need to be analysed and improved. Numerous studies have tried to address some aspects of exoskeletons, e.g. mechanism design, intent prediction, and control scheme. However, most works have focused on a specific element of design or application without providing a comprehensive review framework. This study aims to analyse and survey the aspects contributing to this technology's improvement and broad adoption. To address this, after introducing assistive devices and exoskeletons, the main design criteria will be investigated from the physical Human-Robot Interaction (HRI) perspective. In order to establish an intelligent HRI strategy and enable intuitive control for users, cognitive HRI will be investigated after a brief introduction to various approaches to their control strategies. The study will be further developed by outlining several examples of known assistive devices in different categories. Finally, some guidelines for exoskeleton selection and possible mitigation of current limitations will be discussed.

artificial intelligence, machine learning, survey article, (21 more...)

doi: 10.1109/TCDS.2023.3241632

2111.1286

Country:

Asia > Middle East (0.28)
North America > United States (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.93)
Energy > Oil & Gas > Upstream (0.93)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

Enhancing Unsupervised Speech Recognition with Diffusion GANs

Wu, Xianchao

We enhance the vanilla adversarial training method for unsupervised Automatic Speech Recognition (ASR) with a diffusion-GAN. Our model (1) injects instance noises of various intensities into the generator's output and into unlabeled reference text, which is sampled from pretrained phoneme language models with a length constraint, (2) asks diffusion timestep-dependent discriminators to separate them, and (3) back-propagates the gradients to update the generator. Word/phoneme error rate comparisons with wav2vec-U on the Librispeech (3.1% for test-clean and 5.6% for test-other), TIMIT and MLS datasets show that our enhancement strategies work effectively.
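
The idea of injecting noise of timestep-dependent intensity can be sketched with the standard forward-diffusion formula, where a sample is scaled by the square root of the retained signal fraction and Gaussian noise fills the rest. This is a minimal sketch under assumptions: the abstract does not give the noise schedule, so the linear `betas` schedule, the toy phoneme distribution, and the function names are all hypothetical.

```python
import math
import random


def alpha_bar(t, betas):
    """Fraction of signal variance kept after t noising steps."""
    keep = 1.0
    for beta in betas[:t]:
        keep *= 1.0 - beta
    return keep


def add_instance_noise(x, t, betas, rng):
    """Return a noised copy of x at diffusion timestep t (t = 0 adds no noise)."""
    keep = alpha_bar(t, betas)
    return [
        math.sqrt(keep) * xi + math.sqrt(1.0 - keep) * rng.gauss(0.0, 1.0)
        for xi in x
    ]


# Hypothetical linear schedule with 10 steps and a toy phoneme distribution.
betas = [0.01 + 0.02 * i for i in range(10)]
phoneme_probs = [0.7, 0.2, 0.1]
rng = random.Random(0)
lightly_noised = add_instance_noise(phoneme_probs, 1, betas, rng)
heavily_noised = add_instance_noise(phoneme_probs, 10, betas, rng)
```

In an adversarial setup along these lines, the same function would be applied to both the generator's output and the reference text before either reaches a discriminator that is told which timestep was used.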

artificial intelligence, machine learning, natural language, (15 more...)

2303.13559

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Trimponias, George, Dietterich, Thomas G.

Reinforcement Learning with Exogenous States and Rewards

Exogenous state variables and rewards can slow reinforcement learning by injecting uncontrolled variation into the reward signal. This paper formalizes exogenous state variables and rewards and shows that if the reward function decomposes additively into endogenous and exogenous components, the MDP can be decomposed into an exogenous Markov Reward Process (based on the exogenous reward) and an endogenous Markov Decision Process (optimizing the endogenous reward). Any optimal policy for the endogenous MDP is also an optimal policy for the original MDP, but because the endogenous reward typically has reduced variance, the endogenous MDP is easier to solve. We study settings where the decomposition of the state space into exogenous and endogenous state spaces is not given but must be discovered. The paper introduces and proves correctness of algorithms for discovering the exogenous and endogenous subspaces of the state space when they are mixed through linear combination. These algorithms can be applied during reinforcement learning to discover the exogenous space, remove the exogenous reward, and focus reinforcement learning on the endogenous MDP. Experiments on a variety of challenging synthetic MDPs show that these methods, applied online, discover large exogenous state spaces and produce substantial speedups in reinforcement learning.
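
The variance argument can be illustrated with a toy one-step example. It assumes a reward that is the sum of an action-dependent endogenous term and an action-independent exogenous term; because the exogenous term shifts every action's expected reward by the same amount, the best action is unchanged while the reward signal becomes much less noisy once that term is removed. The means, noise levels, and names below are hypothetical and do not come from the paper's experiments.

```python
import random
import statistics

# Hypothetical mean endogenous reward for each of two actions.
ENDO_MEAN = {0: 1.0, 1: 1.5}


def sample_rewards(action, n, seed=0):
    """Draw n (endogenous, full) reward pairs for a fixed action."""
    rng = random.Random(seed)
    endo, full = [], []
    for _ in range(n):
        r_end = ENDO_MEAN[action] + rng.gauss(0.0, 0.1)  # controlled part
        r_exo = rng.gauss(0.0, 5.0)  # uncontrolled part, same for any action
        endo.append(r_end)
        full.append(r_end + r_exo)
    return endo, full


endo, full = sample_rewards(action=1, n=2000)
var_endo = statistics.variance(endo)
var_full = statistics.variance(full)
```

Comparing `var_endo` with `var_full` shows how much of the variation a learner would no longer have to average out.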

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2303.12957

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(6 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Telecommunications (0.45)
Government > Regional Government > North America Government > United States Government (0.45)
Leisure & Entertainment (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)

Communication Load Balancing via Efficient Inverse Reinforcement Learning

Konar, Abhisek, Wu, Di, Xu, Yi Tian, Jang, Seowoo, Liu, Steve, Dudek, Gregory

Communication load balancing aims to balance the load between different available resources, and thus improve the quality of service for network systems. After formulating the load balancing (LB) as a Markov decision process problem, reinforcement learning (RL) has recently proven effective in addressing the LB problem. To leverage the benefits of classical RL for load balancing, however, we need an explicit reward definition. Engineering this reward function is challenging, because it requires expert knowledge and there is no general consensus on the form of an optimal reward function. In this work, we tackle the communication load balancing problem from an inverse reinforcement learning (IRL) approach. To the best of our knowledge, this is the first time IRL has been successfully applied in the field of communication load balancing. Specifically, first, we infer a reward function from a set of demonstrations, and then learn a reinforcement learning load balancing policy with the inferred reward function. Compared to classical RL-based solutions, the proposed solution can be more general and more suitable for real-world scenarios. Experimental evaluations implemented on different simulated traffic scenarios have shown our method to be effective and better than other baselines by a considerable margin.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2303.16686

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.82)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

#artificialintelligence, Mar-21-2023, 10:27:12 GMT

Best 10 Machine Learning Courses Online - Big Data Analytics News

Ready to build the future with Deep Neural Networks? Stand on the shoulders of TensorFlow and Keras for Machine Learning.

learning, machine learning, machine learning course online, (10 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Sutton, Richard S., Bowling, Michael, Pilarski, Patrick M.

The Alberta Plan for AI Research

arXiv.org Artificial Intelligence, Mar-21-2023

The transition model is used to imagine possible outcomes of taking the action/option, which are then evaluated by the value functions to change the policies and the value functions themselves. This process is called planning. Planning, like everything else in the architecture, is expected to be continual and temporally uniform. On every step there will be some amount of planning, perhaps a series of small planning steps, but planning would typically not be complete in a single time step and thus would be slow compared to the speed of agent-environment interaction. Planning is an ongoing process that operates asynchronously, in the background, whenever it can be done without interfering with the first three components, all of which must operate on every time step and are said to run in the foreground.
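
The planning process described in this excerpt resembles Dyna-style background planning, which can be sketched in tabular form. This is only a minimal sketch, assuming a deterministic model stored as a dictionary and a small fixed number of planning steps per call; the names `q_update` and `planning_steps` and the two-state chain are hypothetical and are not taken from the Alberta Plan.

```python
import random


def q_update(q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """One temporal-difference update of the action-value table q."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)


def planning_steps(q, model, actions, n_steps, rng):
    """Run a few small planning steps on imagined transitions."""
    known = sorted(model)  # (state, action) pairs the model can predict
    for _ in range(n_steps):
        s, a = rng.choice(known)
        r, s_next = model[(s, a)]  # imagined outcome from the model
        q_update(q, s, a, r, s_next, actions)


# Hypothetical two-state chain: A -> B -> END, reward 1 on the last move.
model = {("A", "go"): (0.0, "B"), ("B", "go"): (1.0, "END")}
actions = ["go"]
q = {}
planning_steps(q, model, actions, n_steps=50, rng=random.Random(0))
```

In a continual agent, a call like this would be interleaved with real interaction, with `n_steps` kept small so that the foreground components are never delayed.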

artificial intelligence, machine learning, planning & scheduling, (17 more...)

2208.11173

Country:

North America > Canada > Alberta (0.53)
North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(5 more...)

Genre: Workflow (0.91)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.94)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Ronecker, Max Peter, Zhu, Yuan

Deep Q-Network Based Decision Making for Autonomous Driving

arXiv.org Artificial Intelligence, Mar-21-2023

Currently, decision making is one of the biggest challenges in autonomous driving. This paper introduces a method for safely navigating an autonomous vehicle in highway scenarios by combining deep Q-Networks and insight from control theory. A Deep Q-Network is trained in simulation to serve as a central decision-making unit by proposing targets for a trajectory planner. The generated trajectories, in combination with a controller for longitudinal movement, are used to execute lane change maneuvers. In order to prove the functionality of this approach, it is evaluated on two different highway traffic scenarios. Furthermore, the impact of different state representations on the performance and training process is analyzed. The results show that the proposed system can produce efficient and safe driving behavior.
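
Two generic building blocks of a Deep Q-Network decision unit can be sketched without the network itself: epsilon-greedy selection over a discrete set of targets, and the one-step temporal-difference target used for training. The action names, reward values, and function names are hypothetical; the abstract does not specify the paper's action set or hyperparameters.

```python
import random

# Hypothetical discrete targets handed to a trajectory planner.
ACTIONS = ["keep_lane", "change_left", "change_right"]


def select_action(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target; no bootstrapping after a terminal step."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Greedy choice among hypothetical Q-value estimates for the three targets.
q_values = [0.1, 0.7, 0.2]
chosen = ACTIONS[select_action(q_values, epsilon=0.0, rng=random.Random(0))]
```

With `epsilon=0.0` the selection is purely greedy, so `chosen` is the target with the highest estimated value; during training a larger epsilon would keep some exploration.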

artificial intelligence, deep learning, machine learning, (18 more...)

doi: 10.1109/ICRAS.2019.8808950

2303.11634

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
(7 more...)

Genre: Research Report (0.70)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Huang, Ruiquan, Yang, Jing, Liang, Yingbin

Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-free RL

arXiv.org Artificial Intelligence, Mar-21-2023

Reward-free reinforcement learning (RF-RL), a recently introduced RL paradigm, relies on random action-taking to explore the unknown environment without any reward feedback information. While the primary goal of the exploration phase in RF-RL is to reduce the uncertainty in the estimated model with a minimum number of trajectories, in practice, the agent often needs to abide by certain safety constraints at the same time. It remains unclear how such a safe exploration requirement would affect the corresponding sample complexity in order to achieve the desired optimality of the obtained policy in planning. In this work, we make a first attempt to answer this question. In particular, we consider the scenario where a safe baseline policy is known beforehand, and propose a unified Safe reWard-frEe ExploraTion (SWEET) framework. We then particularize the SWEET framework to the tabular and the low-rank MDP settings, and develop algorithms coined Tabular-SWEET and Low-rank-SWEET, respectively. Both algorithms leverage the concavity and continuity of the newly introduced truncated value functions, and are guaranteed to achieve zero constraint violation during exploration with high probability. Furthermore, both algorithms can provably find a near-optimal policy subject to any constraint in the planning phase. Remarkably, the sample complexities under both algorithms match or even outperform the state of the art in their constraint-free counterparts up to some constant factors, proving that a safety constraint hardly increases the sample complexity for RF-RL.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

2206.14057

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Ohio (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.63)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)