Liu, Changliu
Online Adaptation of Neural Network Models by Modified Extended Kalman Filter for Customizable and Transferable Driving Behavior Prediction
Wang, Letian, Hu, Yeping, Liu, Changliu
High-fidelity behavior prediction of human drivers is crucial for the efficient and safe deployment of autonomous vehicles, but it is challenging due to the stochasticity, heterogeneity, and time-varying nature of human behavior. On one hand, a trained prediction model can only capture motion patterns in an average sense, and the nuances among individuals are hardly reflected. On the other hand, a prediction model trained on one dataset may not generalize to a testing set drawn from a different scenario or data distribution, resulting in low transferability and generalizability. In this paper, we apply a $\tau$-step modified Extended Kalman Filter parameter adaptation algorithm (MEKF$_\lambda$) to the driving behavior prediction task, which has not been studied in the literature before. Using the observed trajectory as feedback, the algorithm adapts neural-network-based models online to improve prediction performance across different human subjects and scenarios. A new set of metrics is proposed for the systematic evaluation of online adaptation performance in reducing prediction error for different individuals and scenarios. Empirical studies on which layer of the model to adapt and how many steps of observations to use are also provided.
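As a rough illustration of the adaptation scheme summarized above (not the paper's exact MEKF$_\lambda$ formulation), the sketch below runs a single-step EKF update with a forgetting factor $\lambda$ on the output-layer weights of a small model, using each new observation as feedback. The feature map, dimensions, and noise levels are illustrative assumptions, and the $\tau$-step batching is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden = 4, 8
W_hidden = rng.normal(size=(n_hidden, n_in))   # frozen earlier layers

def features(x):
    # Stand-in for the trained network's earlier layers (kept fixed here).
    return np.maximum(W_hidden @ x, 0.0)

theta = 0.1 * rng.normal(size=n_hidden)        # adapted output-layer weights
P = np.eye(n_hidden)                           # parameter covariance
lam, R = 0.98, 0.1                             # forgetting factor, obs. noise var.

def ekf_step(theta, P, x, y):
    h = features(x)
    H = h                                      # Jacobian dy/dtheta of a linear layer
    S = H @ P @ H + R                          # innovation variance (scalar output)
    K = P @ H / S                              # Kalman gain
    theta = theta + K * (y - theta @ h)        # correct parameters with feedback
    P = (P - np.outer(K, H) @ P) / lam         # covariance update with forgetting
    return theta, P

theta_true = rng.normal(size=n_hidden)         # "individual" behavior to track
for t in range(500):
    x = rng.normal(size=n_in)
    y = theta_true @ features(x) + 0.1 * rng.normal()
    theta, P = ekf_step(theta, P, x, y)

print("parameter error:", np.linalg.norm(theta - theta_true))
```

The forgetting factor discounts old observations so the parameters can keep tracking time-varying behavior rather than converging once and freezing.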
Learn Zero-Constraint-Violation Policy in Model-Free Constrained Reinforcement Learning
Ma, Haitong, Liu, Changliu, Li, Shengbo Eben, Zheng, Sifa, Sun, Wenchao, Chen, Jianyu
In the trial-and-error mechanism of reinforcement learning (RL), a notorious contradiction arises when we expect to learn a safe policy: how can a safe policy be learned without enough data or a prior model of the dangerous region? Most existing methods penalize dangerous actions a posteriori, meaning the agent is not penalized until it has experienced danger. As a result, the agent cannot learn a zero-violation policy even after convergence: if it never violated the constraints, it would receive no penalty and would lose its knowledge of danger. In this paper, we propose the safe set actor-critic (SSAC) algorithm, which confines the policy update using safety-oriented energy functions, or safety indexes. The safety index is designed to increase rapidly for potentially dangerous actions, which allows us to identify the safe set in the action space, or the control safe set. We can therefore identify dangerous actions before taking them and obtain a zero-constraint-violation policy after convergence. We claim that the energy function can be learned in a model-free manner, similar to learning a value function. Using the energy function transition as the constraint objective, we formulate a constrained RL problem. We prove that our Lagrangian-based solution ensures that the learned policy converges to the constrained optimum under some assumptions. The proposed algorithm is evaluated on both complex simulation environments and a hardware-in-the-loop (HIL) experiment with a real controller from an autonomous vehicle. Experimental results suggest that the converged policy achieves zero constraint violation in all environments, with performance comparable to the model-based baseline.
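To make the energy-function idea above concrete, here is a minimal sketch, under assumed double-integrator dynamics and a hand-designed safety index $\phi$, of the energy-decrease constraint and the dual ascent on the Lagrange multiplier. The actor update and the learned energy function are omitted, so this is an illustration of the constraint mechanics, not the full SSAC algorithm.

```python
import numpy as np

def phi(s):
    # Safety index for a 1-D agent with position x and velocity v:
    # grows near the boundary x = 1.0, and faster when moving toward it.
    x, v = s
    return (x - 1.0) + 0.5 * v

def step(s, a, dt=0.1):
    x, v = s
    return np.array([x + v * dt, v + a * dt])   # double-integrator dynamics

def constraint_cost(s, a, eta=0.02):
    # Energy-decrease condition in the style described above:
    #   phi(s') <= max(phi(s) - eta, 0); positive values mean violation.
    return max(phi(step(s, a)) - max(phi(s) - eta, 0.0), 0.0)

rng = np.random.default_rng(0)
lam, lr = 0.0, 0.5                              # Lagrange multiplier, dual step size
for it in range(200):
    s = rng.uniform(low=[-1.0, -1.0], high=[1.0, 1.0])
    a = rng.uniform(-1.0, 1.0)                  # stand-in for a policy sample
    c = constraint_cost(s, a)
    # The actor would minimize  -Q(s, a) + lam * c ; that update is omitted,
    # so here the multiplier simply rises while violations keep occurring.
    lam = max(0.0, lam + lr * c)
print("final multiplier:", lam)
```

Because the constraint is evaluated on the candidate action before it is taken, violations can be penalized prior to experiencing danger, which is what permits a zero-violation policy at convergence.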
A Microscopic Pandemic Simulator for Pandemic Prediction Using Scalable Million-Agent Reinforcement Learning
Tang, Zhenggang, Yan, Kai, Sun, Liting, Zhan, Wei, Liu, Changliu
Microscopic epidemic models are powerful tools for government policymakers to predict and simulate epidemic outbreaks, as they can capture the impact of individual behaviors on the macroscopic phenomenon. However, existing models consider only simple rule-based individual behaviors, limiting their applicability. This paper proposes a deep-reinforcement-learning-powered microscopic model named the Microscopic Pandemic Simulator (MPS). By replacing rule-based agents with rational agents whose behaviors are driven to maximize rewards, the MPS provides a better approximation of real-world dynamics. To efficiently simulate the massive number of agents in the MPS, we propose Scalable Million-Agent DQN (SMADQN). The MPS allows us to efficiently evaluate the impact of different government strategies. This paper first calibrates the MPS against real-world data in Allegheny County, US, and then, as a demonstration, evaluates two government strategies: information disclosure and quarantine. The results validate the effectiveness of the proposed method. More broadly, this paper provides novel insights into the application of DRL in large-scale agent-based networks such as economic and social networks.
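One common ingredient for scaling value-based RL to very many homogeneous agents is parameter sharing: a single Q-function is shared across all agents and queried in one batched operation. The sketch below illustrates that pattern with a toy tabular Q-function standing in for a neural network; the sizes, observation encoding, and epsilon-greedy scheme are illustrative assumptions, not SMADQN's actual architecture.

```python
import numpy as np

n_agents, n_obs, n_actions = 1_000_000, 16, 4
rng = np.random.default_rng(0)

Q = np.zeros((n_obs, n_actions))        # one shared Q-function for every agent
obs = rng.integers(0, n_obs, size=n_agents)

def act(obs, eps=0.1):
    greedy = Q[obs].argmax(axis=1)      # batched greedy actions for all agents
    explore = rng.integers(0, n_actions, size=obs.shape[0])
    mask = rng.random(obs.shape[0]) < eps
    return np.where(mask, explore, greedy)   # epsilon-greedy, vectorized

actions = act(obs)
print(actions[:10])
```

Selecting actions for a million agents then costs one vectorized pass rather than a million independent policy evaluations.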
Tolerance-Guided Policy Learning for Adaptable and Transferrable Delicate Industrial Insertion
Niu, Boshen, Wang, Chenxi, Liu, Changliu
Policy learning for delicate industrial insertion tasks (e.g., PC board assembly) is challenging. This paper considers two major problems: how to learn a diversified policy (instead of a single average policy) that can efficiently handle different workpieces with a minimal amount of training data, and how to handle workpiece defects during insertion. To address these problems, we propose tolerance-guided policy learning. To encourage transferability of the learned policy to different workpieces, we add a task embedding, derived from the insertion tolerance, to the policy's input space. We then train the policy using generative adversarial imitation learning with reward shaping (RS-GAIL) on a variety of representative situations. To encourage adaptability of the learned policy in handling defects, we build a probabilistic inference model that uses the tolerance model to output the best insertion pose based on failed insertions. The best insertion pose is then used as a reference by the learned policy. The proposed method is validated on a sequence of IC socket insertion tasks in simulation. The results show that 1) RS-GAIL can efficiently learn optimal policies under sparse rewards; 2) the tolerance embedding enhances the transferability of the learned policy; and 3) the probabilistic inference makes the policy robust to defects in the workpieces.
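A minimal sketch of the tolerance-as-task-embedding idea described above: the insertion tolerance is concatenated onto the policy's observation, so the same network can act differently on tight and loose workpieces. The two-number tolerance parameterization and the linear stand-in policy are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def policy_input(obs, tolerance):
    # The tolerance (here, a hypothetical lateral / angular clearance pair)
    # is appended to the observation so a single policy can condition its
    # behavior on the workpiece at hand.
    return np.concatenate([obs, tolerance])

W = rng.normal(size=(3, 8))             # stand-in for a trained policy network
obs = rng.normal(size=6)                # e.g., pose plus force/torque reading

tight = policy_input(obs, np.array([0.05, 0.2]))
loose = policy_input(obs, np.array([0.50, 2.0]))

# Same observation, different tolerance embedding -> different action.
print(W @ tight)
print(W @ loose)
```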
Algorithms for Verifying Deep Neural Networks
Liu, Changliu, Arnon, Tomer, Lazarus, Christopher, Barrett, Clark, Kochenderfer, Mykel J.
Neural networks [15] have been widely used in many applications, such as image classification and understanding [17], language processing [24], and control of autonomous systems [26]. These networks represent functions that map inputs to outputs through a sequence of layers. At each layer, the input to that layer undergoes an affine transformation followed by a simple nonlinear transformation before being passed to the next layer. These nonlinear transformations are often called activation functions, and a common example is the rectified linear unit (ReLU), which transforms the input by setting any negative values to zero. Although the computation involved in a neural network is quite simple, these networks can represent complex nonlinear functions by appropriately choosing the matrices that define the affine transformations.
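The computation just described maps directly to a few lines of code. The following minimal sketch implements such a network with ReLU activations; the layer sizes and random weights are arbitrary choices for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)           # set negative values to zero

def network(x, weights, biases):
    # Each layer: affine transformation, then the nonlinear activation;
    # the final layer is a plain affine output.
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(W @ x + b)
    return weights[-1] @ x + biases[-1]

rng = np.random.default_rng(0)
sizes = [2, 8, 8, 1]                    # input, two hidden layers, output
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes, sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]

print(network(np.array([0.5, -1.0]), weights, biases))
```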
Simulating Emergent Properties of Human Driving Behavior Using Multi-Agent Reward Augmented Imitation Learning
Bhattacharyya, Raunak P., Phillips, Derek J., Liu, Changliu, Gupta, Jayesh K., Driggs-Campbell, Katherine, Kochenderfer, Mykel J.
Recent developments in multi-agent imitation learning have shown promising results for modeling the behavior of human drivers. However, it is challenging to capture the emergent traffic behaviors observed in real-world datasets. Such behaviors arise from the many local interactions between agents, which are not commonly accounted for in imitation learning. This paper proposes Reward Augmented Imitation Learning (RAIL), which integrates reward augmentation into the multi-agent imitation learning framework and allows the designer to specify prior knowledge in a principled fashion. We prove that the convergence guarantees of the imitation learning process are preserved under reward augmentation. The method is validated in a driving scenario in which an entire traffic scene is controlled by policies learned with our proposed algorithm. Further, we demonstrate improved performance compared with traditional imitation learning algorithms, both in terms of the local actions of a single agent and the emergent properties of driving behavior in complex, multi-agent settings.
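As a rough sketch of reward augmentation in an adversarial imitation setting: the learner's reward combines the discriminator-based surrogate with designer-specified penalties for undesirable events. The particular penalty terms (collision, off-road) and weights below are illustrative assumptions, not RAIL's exact specification.

```python
import math

def augmented_reward(d_logit, collided, off_road,
                     w_collision=1.0, w_off_road=0.5):
    d = 1.0 / (1.0 + math.exp(-d_logit))    # discriminator output in (0, 1)
    imitation = -math.log(1.0 - d + 1e-8)   # GAIL-style surrogate reward
    penalty = w_collision * collided + w_off_road * off_road
    return imitation - penalty              # reward augmentation

# E.g., an off-road event reduces the learner's reward:
print(augmented_reward(d_logit=0.3, collided=0, off_road=1))
```

The augmentation term injects prior knowledge the demonstrations alone may not convey, while the imitation term continues to drive the policy toward human-like behavior.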