Wang, Che
Knowledge Graph Construction in Power Distribution Networks
Li, Xiang, Wang, Che, Li, Bing, Chen, Hao, Li, Sizhe
In this paper, we propose a method for knowledge graph construction in power distribution networks. The method leverages entity features, covering their semantic, phonetic, and syntactic characteristics, in both the distribution network knowledge graph and the dispatching texts. An enhanced model based on a Convolutional Neural Network is used to match dispatch text entities with those in the knowledge graph. The effectiveness of this model is evaluated through experiments in real-world power distribution dispatch scenarios. The results indicate that, compared with the baselines, the proposed model excels at linking a variety of entity types and achieves high overall accuracy on the power distribution knowledge graph construction task.
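A minimal sketch of how a convolutional entity matcher of the kind described above might be structured is shown below. The character-level encoding, network sizes, and scoring head are illustrative assumptions, not the paper's implementation; in practice the semantic, phonetic, and syntactic features would be encoded and fused in a similar fashion.

```python
# Illustrative sketch of a CNN-based entity matcher (assumed architecture, not the paper's exact model).
import torch
import torch.nn as nn

class EntityEncoder(nn.Module):
    """Encodes a character-indexed entity name into a fixed-size vector with 1D convolutions."""
    def __init__(self, vocab_size=4000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, hidden_dim, kernel_size=3, padding=1)

    def forward(self, ids):                      # ids: (batch, seq_len)
        x = self.embed(ids).transpose(1, 2)      # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x))             # (batch, hidden_dim, seq_len)
        return x.max(dim=2).values               # max-pool over the sequence

class EntityMatcher(nn.Module):
    """Scores how well a dispatch-text entity matches a knowledge-graph entity."""
    def __init__(self):
        super().__init__()
        self.encoder = EntityEncoder()
        self.scorer = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, text_entity_ids, kg_entity_ids):
        a = self.encoder(text_entity_ids)
        b = self.encoder(kg_entity_ids)
        return torch.sigmoid(self.scorer(torch.cat([a, b], dim=1)))  # match probability
```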
Dynamic Fault Characteristics Evaluation in Power Grid
Pei, Hao, Lin, Si, Li, Chuanfu, Wang, Che, Chen, Haoming, Li, Sizhe
To raise the level of intelligence in operation and maintenance, a novel method for fault detection in power grids is proposed. The proposed GNN-based approach first identifies fault nodes through a specialized feature extraction method coupled with a knowledge graph. By incorporating temporal data, the method leverages the status of nodes from preceding and subsequent time periods to aid current fault detection. To validate the effectiveness of the node features, a correlation analysis of the output features from each node was conducted. Experimental results show that the method locates fault nodes in simulation scenarios with remarkable accuracy. Additionally, the graph-neural-network-based feature modeling allows for a qualitative examination of how faults spread across nodes, providing valuable insights for analyzing fault nodes.
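A minimal sketch of a GNN node classifier that uses the status of nodes from the preceding and subsequent time periods, as described above. The layer choices, dimensions, and the simple normalized-adjacency convolution are illustrative assumptions, not the paper's exact model.

```python
# Illustrative sketch of GNN-based fault-node classification with temporal context
# (assumed architecture; the paper's feature extraction and layer choices may differ).
import torch
import torch.nn as nn

class SimpleGraphConv(nn.Module):
    """One graph convolution: aggregate neighbor features through a normalized adjacency matrix."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj_norm):              # x: (nodes, in_dim), adj_norm: (nodes, nodes)
        return torch.relu(self.linear(adj_norm @ x))

class FaultNodeClassifier(nn.Module):
    """Classifies each node as faulty or healthy, given features from the previous,
    current, and next time steps concatenated per node."""
    def __init__(self, feat_dim, hidden_dim=64):
        super().__init__()
        self.gc1 = SimpleGraphConv(3 * feat_dim, hidden_dim)   # 3 time steps stacked
        self.gc2 = SimpleGraphConv(hidden_dim, hidden_dim)
        self.head = nn.Linear(hidden_dim, 2)                    # fault / no fault

    def forward(self, x_prev, x_curr, x_next, adj_norm):
        x = torch.cat([x_prev, x_curr, x_next], dim=1)
        h = self.gc2(self.gc1(x, adj_norm), adj_norm)
        return self.head(h)                                     # per-node logits
```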
Pre-training with Synthetic Data Helps Offline Reinforcement Learning
Wang, Zecheng, Wang, Che, Dong, Zixuan, Ross, Keith
Recently, it has been shown that for offline deep reinforcement learning (DRL), pre-training Decision Transformer with a large language corpus can improve downstream performance (Reid et al., 2022). A natural question to ask is whether this performance gain can only be achieved with language pre-training, or can be achieved with simpler pre-training schemes which do not involve language. In this paper, we first show that language is not essential for improved performance, and indeed pre-training with synthetic IID data for a small number of updates can match the performance gains from pre-training with a large language corpus; moreover, pre-training with data generated by a one-step Markov chain can further improve the performance. Inspired by these experimental results, we then consider pre-training Conservative Q-Learning (CQL), a popular offline DRL algorithm, which is Q-learning-based and typically employs a Multi-Layer Perceptron (MLP) backbone. Surprisingly, pre-training with simple synthetic data for a small number of updates can also improve CQL, providing consistent performance improvement on D4RL Gym locomotion datasets. The results of this paper not only illustrate the importance of pre-training for offline DRL but also show that the pre-training data can be synthetic and generated with remarkably simple mechanisms.

It is well-known that pre-training can provide significant boosts in performance and robustness for downstream tasks, both for Natural Language Processing (NLP) and Computer Vision (CV). Recently, in the field of Deep Reinforcement Learning (DRL), research on pre-training is also becoming increasingly popular. An important step in the direction of pre-training DRL models is the recent paper by Reid et al. (2022), which showed that for Decision Transformer (Chen et al., 2021), pre-training with the Wikipedia corpus can significantly improve the performance of the downstream offline RL task. Reid et al. (2022) further showed that pre-training on predicting pixel sequences can hurt performance. The authors state that their results indicate "a foreseeable future where everyone should use a pre-trained language model for offline RL".
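The two synthetic pre-training schemes discussed above are simple to reproduce in outline. The sketch below generates IID token sequences and one-step Markov chain sequences; the vocabulary size, sequence length, and transition-matrix construction are illustrative assumptions.

```python
# Illustrative sketch of the two synthetic pre-training data generators described above:
# IID tokens and tokens from a one-step Markov chain (vocabulary size and transition
# matrix construction are assumptions for illustration).
import numpy as np

def generate_iid_sequences(num_seqs, seq_len, vocab_size, rng):
    """Each token drawn independently and uniformly from the vocabulary."""
    return rng.integers(0, vocab_size, size=(num_seqs, seq_len))

def generate_markov_sequences(num_seqs, seq_len, vocab_size, rng):
    """Each token depends only on the previous token through a fixed random transition matrix."""
    transition = rng.dirichlet(np.ones(vocab_size), size=vocab_size)  # row-stochastic matrix
    seqs = np.empty((num_seqs, seq_len), dtype=np.int64)
    seqs[:, 0] = rng.integers(0, vocab_size, size=num_seqs)
    for t in range(1, seq_len):
        for i in range(num_seqs):
            seqs[i, t] = rng.choice(vocab_size, p=transition[seqs[i, t - 1]])
    return seqs

rng = np.random.default_rng(0)
iid_data = generate_iid_sequences(num_seqs=100, seq_len=128, vocab_size=100, rng=rng)
markov_data = generate_markov_sequences(num_seqs=100, seq_len=128, vocab_size=100, rng=rng)
# These sequences would then be used for a small number of next-token-prediction updates
# before fine-tuning on the downstream offline RL task.
```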
VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning
Wang, Che, Luo, Xufang, Ross, Keith, Li, Dongsheng
We propose VRL3, a powerful data-driven framework with a simple design for solving challenging visual deep reinforcement learning (DRL) tasks. We analyze a number of major obstacles in taking a data-driven approach, and present a suite of design principles, novel findings, and critical insights about data-driven visual DRL. Our framework has three stages: in stage 1, we leverage non-RL datasets (e.g. ImageNet) to learn task-agnostic visual representations; in stage 2, we use offline RL data (e.g. a limited number of expert demonstrations) to convert the task-agnostic representations into more powerful task-specific representations; in stage 3, we fine-tune the agent with online RL. On a set of challenging hand manipulation tasks with sparse reward and realistic visual inputs, compared to the previous SOTA, VRL3 achieves an average of 780% better sample efficiency. And on the hardest task, VRL3 is 1220% more sample efficient (2440% when using a wider encoder) and solves the task with only 10% of the computation. These significant results clearly demonstrate the great potential of data-driven deep reinforcement learning.
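A highly simplified, runnable sketch of the three-stage structure described above is given below. The networks, stand-in data, and objectives are placeholders for illustration only, not the actual VRL3 implementation.

```python
# Runnable, high-level sketch of the three-stage pipeline described above. The networks, data,
# and objectives are simplified stand-ins for illustration, not the actual VRL3 implementation.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())     # task-agnostic visual encoder

# Stage 1: supervised pre-training on a non-RL image dataset (ImageNet stand-in: random tensors here).
cls_head = nn.Linear(16, 10)
images, labels = torch.randn(8, 3, 64, 64), torch.randint(0, 10, (8,))
opt = torch.optim.Adam(list(encoder.parameters()) + list(cls_head.parameters()), lr=1e-4)
loss = nn.functional.cross_entropy(cls_head(encoder(images)), labels)
opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: offline RL / behavior cloning on a small set of expert demonstrations (stand-in data),
# turning the task-agnostic features into task-specific ones.
policy = nn.Linear(16, 4)                                          # 4 = assumed action dimension
demo_obs, demo_act = torch.randn(8, 3, 64, 64), torch.randn(8, 4)
opt = torch.optim.Adam(list(encoder.parameters()) + list(policy.parameters()), lr=1e-4)
loss = (policy(encoder(demo_obs)) - demo_act).pow(2).mean()
opt.zero_grad(); loss.backward(); opt.step()

# Stage 3 would continue updating the same encoder and policy with online RL interaction.
```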
Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance
Wu, Yanqiu, Chen, Xinyue, Wang, Che, Zhang, Yiming, Zhou, Zijian, Ross, Keith W.
Recently, Truncated Quantile Critics (TQC), using a distributional representation of critics, was shown to provide state-of-the-art asymptotic training performance on all environments from the MuJoCo continuous control benchmark suite. Also recently, Randomized Ensembled Double Q-Learning (REDQ), using a high update-to-data ratio and target randomization, was shown to achieve high sample efficiency that is competitive with state-of-the-art model-based methods. In this paper, we propose a novel model-free algorithm, Aggressive Q-Learning with Ensembles (AQE), which improves the sample-efficiency performance of REDQ and the asymptotic performance of TQC, thereby providing overall state-of-the-art performance during all stages of training. Moreover, AQE is very simple, requiring neither a distributional representation of critics nor target randomization.

Off-policy Deep Reinforcement Learning algorithms aim to improve sample efficiency by reusing past experience. A number of off-policy Deep RL algorithms have been proposed for control tasks with continuous state and action spaces, including Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3) and Soft Actor Critic (SAC) (Lillicrap et al., 2016; Fujimoto et al., 2018; Haarnoja et al., 2018a;b). TD3 introduced clipped double-Q learning, and was shown to be significantly more sample efficient than popular on-policy methods for a wide range of MuJoCo benchmarks. SAC has a similar off-policy structure with clipped double-Q learning, but it also employs maximum entropy reinforcement learning. SAC was shown to provide excellent sample efficiency and asymptotic performance in a wide range of MuJoCo environments, including the high-dimensional Humanoid environment for which both DDPG and TD3 perform poorly.
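The clipped double-Q learning mentioned above (introduced by TD3 and shared by SAC) can be sketched compactly. The networks and dimensions below are placeholders, and target-policy smoothing noise is omitted; AQE's own ensemble-based target is not detailed in this abstract, so only the clipped double-Q idea is illustrated.

```python
# Minimal sketch of the clipped double-Q target mentioned above (placeholder networks,
# simplified deterministic target policy; target smoothing noise is omitted).
import torch
import torch.nn as nn

q1, q2 = nn.Linear(6, 1), nn.Linear(6, 1)          # two critics over (state, action); 4+2 dims assumed
target_policy = nn.Linear(4, 2)

def clipped_double_q_target(next_state, reward, done, gamma=0.99):
    next_action = torch.tanh(target_policy(next_state))
    sa = torch.cat([next_state, next_action], dim=1)
    next_q = torch.min(q1(sa), q2(sa))              # take the smaller of the two critic estimates
    return reward + gamma * (1.0 - done) * next_q   # TD target used to update both critics

target = clipped_double_q_target(torch.randn(32, 4), torch.randn(32, 1), torch.zeros(32, 1))
```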
Randomized Ensembled Double Q-Learning: Learning Fast Without a Model
Chen, Xinyue, Wang, Che, Zhou, Zijian, Ross, Keith
Using a high Update-To-Data (UTD) ratio, model-based methods have recently achieved much higher sample efficiency than previous model-free methods for continuous-action DRL benchmarks. In this paper, we introduce a simple model-free algorithm, Randomized Ensembled Double Q-Learning (REDQ), and show that its performance is just as good as, if not better than, a state-of-the-art model-based algorithm for the MuJoCo benchmark. Moreover, REDQ can achieve this performance using fewer parameters than the model-based method, and with less wall-clock run time. REDQ has three carefully integrated ingredients which allow it to achieve its high performance: (i) a UTD ratio >> 1; (ii) an ensemble of Q functions; (iii) in-target minimization across a random subset of Q functions from the ensemble. Through carefully designed experiments, we provide a detailed analysis of REDQ and related model-free algorithms. To our knowledge, REDQ is the first successful model-free DRL algorithm for continuous-action spaces using a UTD ratio >> 1.
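The three ingredients listed above can be sketched as follows; the network sizes, policy, and constants are placeholders chosen for illustration.

```python
# Minimal sketch of the three REDQ ingredients described above: an ensemble of Q functions,
# in-target minimization over a random subset of the ensemble, and a UTD ratio > 1.
import random
import torch
import torch.nn as nn

N, M, UTD = 10, 2, 20                                 # ensemble size, subset size, update-to-data ratio
ensemble = [nn.Linear(6, 1) for _ in range(N)]        # critics over (state, action); 4+2 dims assumed

def redq_target(next_state, next_action, reward, done, gamma=0.99):
    sa = torch.cat([next_state, next_action], dim=1)
    subset = random.sample(ensemble, M)               # in-target minimization over a random subset
    min_q = torch.min(torch.stack([q(sa) for q in subset], dim=0), dim=0).values
    return reward + gamma * (1.0 - done) * min_q

target = redq_target(torch.randn(32, 4), torch.randn(32, 2), torch.randn(32, 1), torch.zeros(32, 1))
# With a UTD ratio of 20, every environment step is followed by 20 gradient updates of all
# N critics toward targets of this form; the policy update typically uses the ensemble mean.
```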
BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning
Chen, Xinyue, Zhou, Zijian, Wang, Zheng, Wang, Che, Wu, Yanqiu, Deng, Qing, Ross, Keith
The field of Deep Reinforcement Learning (DRL) has recently seen a surge in research in batch reinforcement learning, which aims for sample-efficient learning from a given data set without additional interactions with the environment. In the batch DRL setting, commonly employed off-policy DRL algorithms can perform poorly and sometimes even fail to learn altogether. In this paper, we propose a new algorithm, Best-Action Imitation Learning (BAIL), which, unlike many off-policy DRL algorithms, does not involve maximizing Q functions over the action space. Striving for simplicity as well as performance, BAIL first selects from the batch the actions it believes to be high-performing for their corresponding states; it then uses those state-action pairs to train a policy network using imitation learning. Although BAIL is simple, we demonstrate that it achieves state-of-the-art performance on the MuJoCo benchmark.
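The two-phase "select high-performing actions, then imitate" structure described above can be sketched as follows. The selection rule used here (keep the pairs with the highest observed returns) is an illustrative simplification, not BAIL's actual selection mechanism.

```python
# Sketch of the two-phase "select, then imitate" structure described above. The global
# top-k return-based selection below is an illustrative assumption, not BAIL's exact rule.
import torch
import torch.nn as nn

def select_best_pairs(states, actions, returns, keep_ratio=0.25):
    """Phase 1: keep the state-action pairs with the highest observed returns."""
    k = max(1, int(keep_ratio * len(returns)))
    idx = torch.topk(returns, k).indices
    return states[idx], actions[idx]

def imitate(states, actions, epochs=100, lr=1e-3):
    """Phase 2: behavioral cloning on the selected pairs."""
    policy = nn.Sequential(nn.Linear(states.shape[1], 64), nn.ReLU(),
                           nn.Linear(64, actions.shape[1]))
    opt = torch.optim.Adam(policy.parameters(), lr=lr)
    for _ in range(epochs):
        loss = (policy(states) - actions).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return policy

s, a, r = torch.randn(500, 11), torch.randn(500, 3), torch.randn(500)   # stand-in batch data
policy = imitate(*select_best_pairs(s, a, r))
```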
Towards Simplicity in Deep Reinforcement Learning: Streamlined Off-Policy Learning
Wang, Che, Wu, Yanqiu, Vuong, Quan, Ross, Keith
The field of Deep Reinforcement Learning (DRL) has recently seen a surge in the popularity of maximum entropy reinforcement learning algorithms. Their popularity stems from the intuitive interpretation of the maximum entropy objective and their superior sample efficiency on standard benchmarks. In this paper, we seek to understand the primary contribution of the entropy term to the performance of maximum entropy algorithms. For the MuJoCo benchmark, we demonstrate that the entropy term in Soft Actor Critic (SAC) principally addresses the bounded nature of the action spaces. With this insight, we propose a simple normalization scheme which allows a streamlined algorithm without entropy maximization to match the performance of SAC. Our experimental results demonstrate a need to revisit the benefits of entropy regularization in DRL. We also propose a simple nonuniform sampling method for selecting transitions from the replay buffer during training. We further show that the streamlined algorithm with the simple nonuniform sampling scheme outperforms SAC and achieves state-of-the-art performance on challenging continuous control tasks.

Off-policy deep Reinforcement Learning (RL) algorithms aim to improve sample efficiency by reusing past experience. Recently, a number of new off-policy Deep Reinforcement Learning algorithms have been proposed for control tasks with continuous state and action spaces, including Deep Deterministic Policy Gradient (DDPG) and Twin Delayed DDPG (TD3) (Lillicrap et al., 2015; Fujimoto et al., 2018). TD3, in particular, has been shown to be significantly more sample efficient than popular on-policy methods for a wide range of MuJoCo benchmarks. The field of DRL has also recently seen a surge in the popularity of maximum entropy reinforcement learning algorithms, whose popularity stems from the intuitive interpretation of the maximum entropy objective and their superior sample efficiency on standard benchmarks.
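One way the kind of output normalization discussed above can look is sketched below: the pre-tanh action means are rescaled whenever their average magnitude exceeds 1, counteracting saturation of the bounded action space. The exact formula is an assumption for illustration rather than the paper's precise scheme.

```python
# Sketch of a simple output-normalization scheme of the kind discussed above (assumed formula).
import torch

def normalize_pre_tanh(mu):
    """mu: (batch, action_dim) pre-tanh policy outputs."""
    g = mu.abs().mean(dim=1, keepdim=True)          # average magnitude per sample
    scale = torch.where(g > 1.0, g, torch.ones_like(g))
    return mu / scale                                # leave small outputs untouched

mu = torch.tensor([[3.0, -6.0], [0.2, 0.4]])
actions = torch.tanh(normalize_pre_tanh(mu))        # bounded actions after normalization
```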
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past
Wang, Che, Ross, Keith
Soft Actor-Critic (SAC) [10, 11] is an off-policy actor-critic deep reinforcement learning (DRL) algorithm based on maximum entropy reinforcement learning. By combining off-policy updates with an actor-critic formulation, SAC achieves state-of-the-art performance on a range of continuous-action benchmark tasks, outperforming prior on-policy and off-policy methods. The off-policy method employed by SAC samples data uniformly from past experience when performing parameter updates. We propose Emphasizing Recent Experience (ERE), a simple but powerful off-policy sampling technique, which emphasizes recently observed data while not forgetting the past. The ERE algorithm samples more aggressively from recent experience, and also orders the updates to ensure that updates from old data do not overwrite updates from new data. We compare vanilla SAC and SAC+ERE, and show that ERE is more sample efficient than vanilla SAC for continuous-action MuJoCo tasks [31]. We also consider combining SAC with Prioritized Experience Replay (PER) [28], a scheme originally proposed for deep Q-learning which prioritizes the data based on temporal-difference (TD) error. We show that SAC+PER can marginally improve the sample efficiency performance of SAC, but much less so than SAC+ERE. Finally, we propose an algorithm which integrates ERE and PER and show that this hybrid algorithm can give the best results for some of the MuJoCo tasks.
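The ERE sampling idea described above can be sketched as follows: over a sequence of K updates, later updates draw from progressively more recent slices of the replay buffer, so the newest data is emphasized without discarding the old. The decay schedule and constants below are illustrative assumptions.

```python
# Sketch of Emphasizing Recent Experience sampling: the k-th of K updates samples uniformly
# from only the most recent c_k transitions (decay schedule and constants are illustrative).
import random

def ere_sample(buffer, k, K, eta=0.996, c_min=5000, batch_size=256):
    """Sample the k-th (1-indexed) of K updates from only the most recent c_k transitions."""
    N = len(buffer)
    c_k = max(int(N * eta ** (k * 1000 / K)), min(c_min, N))
    recent = buffer[-c_k:]                           # most recent c_k transitions
    return random.choices(recent, k=batch_size)

buffer = list(range(100000))                         # stand-in replay buffer of transition indices
early_batch = ere_sample(buffer, k=1, K=60)          # draws from nearly the whole buffer
late_batch = ere_sample(buffer, k=60, K=60)          # draws only from the newest transitions
```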
Portfolio Online Evolution in StarCraft
Wang, Che, Chen, Pan, Li, Yuanda, Holmgård, Christoffer, Togelius, Julian (all New York University)
Portfolio Online Evolution is a novel method for playing real-time strategy games through evolutionary search in the space of assignments of scripts to individual game units. This method builds on and recombines two recently devised methods for playing multi-action games: (1) Portfolio Greedy Search, which searches in the space of heuristics assigned to units rather than in the space of actions, and (2) Online Evolution, which uses evolution rather than tree search to effectively play games where multiple actions per turn lead to enormous branching factors. The combination of the two ideas leads to using evolution to search over which script/heuristic is assigned to which unit. In this paper, we introduce Portfolio Online Evolution and apply it to StarCraft micro, i.e., individual battles. It is shown to outperform all other tested methods in battles of moderate to large size.
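The core idea described above, evolving assignments of scripts to individual units, can be sketched as follows. The script portfolio, mutation scheme, and fitness function are stand-ins; in the paper, fitness would come from forward-simulating the battle with each unit controlled by its assigned script.

```python
# Sketch of evolving script-to-unit assignments, as described above (stand-in portfolio and fitness).
import random

SCRIPTS = ["attack_weakest", "attack_closest", "kite", "no_overkill"]   # illustrative portfolio
NUM_UNITS = 8

def random_genome():
    return [random.randrange(len(SCRIPTS)) for _ in range(NUM_UNITS)]   # one script per unit

def mutate(genome, rate=0.2):
    return [random.randrange(len(SCRIPTS)) if random.random() < rate else g for g in genome]

def fitness(genome):
    """Stand-in evaluation; a real implementation would forward-simulate the battle
    with each unit controlled by its assigned script and score the resulting state."""
    return -sum(genome)                               # placeholder objective

def evolve(pop_size=20, generations=30):
    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)    # most-fit genomes first
        elite = population[: pop_size // 4]
        population = elite + [mutate(random.choice(elite)) for _ in range(pop_size - len(elite))]
    return max(population, key=fitness)

best_assignment = evolve()                            # script index for each unit this turn
```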