Lai, Kwei-Herng
Policy-GNN: Aggregation Optimization for Graph Neural Networks
Lai, Kwei-Herng, Zha, Daochen, Zhou, Kaixiong, Hu, Xia
Graph data are pervasive in many real-world applications. Recently, increasing attention has been paid to graph neural networks (GNNs), which aim to model local graph structures and capture hierarchical patterns by aggregating information from neighbors with stackable network modules. Motivated by the observation that different nodes often require different numbers of aggregation iterations to fully capture the structural information, in this paper we propose to explicitly sample diverse iterations of aggregation for different nodes to boost the performance of GNNs. Developing an effective aggregation strategy for each node is challenging, given complex graphs and sparse features. Moreover, it is not straightforward to derive an efficient algorithm, since the sampled nodes must be fed into different numbers of network layers. To address these challenges, we propose Policy-GNN, a meta-policy framework that models the sampling procedure and message passing of GNNs as a combined learning process. Specifically, Policy-GNN uses a meta-policy to adaptively determine the number of aggregations for each node. The meta-policy is trained with deep reinforcement learning (RL) by exploiting feedback from the model. We further introduce parameter sharing and a buffer mechanism to boost training efficiency. Experimental results on three real-world benchmark datasets suggest that Policy-GNN significantly outperforms state-of-the-art alternatives, showing the promise of aggregation optimization for GNNs.
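A minimal sketch of the per-node aggregation-depth idea, assuming a hypothetical DepthAdaptiveGNN and a placeholder meta_policy (illustrative names only, not the authors' released implementation):

```python
import torch
import torch.nn as nn

class DepthAdaptiveGNN(nn.Module):
    """Reads each node out after its own number of aggregation iterations.
    A single aggregation layer is reused at every depth, a simple stand-in
    for the parameter sharing mentioned in the abstract."""
    def __init__(self, in_dim, hid_dim, num_classes, max_layers=5):
        super().__init__()
        self.input_proj = nn.Linear(in_dim, hid_dim)
        self.agg = nn.Linear(hid_dim, hid_dim)       # reused at every aggregation step
        self.readout = nn.Linear(hid_dim, num_classes)
        self.max_layers = max_layers

    def forward(self, x, adj, depths):
        # x: [N, in_dim] node features; adj: [N, N] normalized adjacency;
        # depths: [N] per-node aggregation depth chosen by the meta-policy.
        h = torch.relu(self.input_proj(x))
        out = torch.zeros_like(h)
        for k in range(1, self.max_layers + 1):
            h = torch.relu(self.agg(adj @ h))        # one more round of neighbor aggregation
            out[depths == k] = h[depths == k]        # read each node out at its chosen depth
        return self.readout(out)

def meta_policy(x, max_layers=5):
    # Stand-in for the RL-trained meta-policy: here it simply picks random depths.
    return torch.randint(1, max_layers + 1, (x.size(0),))
```

In the paper the meta-policy is trained with deep RL using feedback from the GNN as reward; the random stand-in above only marks where that per-node decision enters the computation.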
Dual Policy Distillation
Lai, Kwei-Herng, Zha, Daochen, Li, Yuening, Hu, Xia
Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging deep reinforcement learning tasks. This teacher-student framework requires a well-trained teacher model, which is computationally expensive to obtain. Moreover, the performance of the student model can be limited by the teacher model if the teacher is not optimal. In light of collaborative learning, we study the feasibility of involving the joint intellectual efforts of student models with diverse perspectives. In this work, we introduce dual policy distillation (DPD), a student-student framework in which two learners operate on the same environment, explore it from different perspectives, and extract knowledge from each other to enhance their learning. The key challenge in developing this dual learning framework is to identify the knowledge from the peer learner that benefits contemporary learning-based reinforcement learning algorithms, since it is unclear whether knowledge distilled from an imperfect and noisy peer learner would be helpful. To address this challenge, we theoretically justify that distilling knowledge from a peer learner leads to policy improvement, and we propose a disadvantageous distillation strategy based on this result. Experiments on several continuous control tasks show that the proposed framework achieves superior performance with a learning-based agent and function approximation, without the use of expensive teacher models.
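A rough sketch of the disadvantageous distillation term for a deterministic continuous-control policy; policy, peer_policy, value_fn, and peer_value_fn are assumed callables, and the squared-action penalty is an illustrative choice rather than the paper's exact objective:

```python
import torch

def disadvantageous_distillation_loss(policy, peer_policy, value_fn, peer_value_fn, states):
    """On states where the peer's value estimate exceeds our own, pull our
    action toward the peer's; elsewhere the peer is ignored."""
    with torch.no_grad():
        peer_is_better = (peer_value_fn(states) > value_fn(states)).float()  # [B, 1]
        peer_actions = peer_policy(states)                                   # [B, act_dim]
    our_actions = policy(states)
    per_state_gap = ((our_actions - peer_actions) ** 2).sum(dim=-1, keepdim=True)
    return (peer_is_better * per_state_gap).mean()
```

Each learner would add such a term to its ordinary actor loss, with the two roles swapped for the peer, so both policies borrow from each other's strengths without any teacher model.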
RLCard: A Toolkit for Reinforcement Learning in Card Games
Zha, Daochen, Lai, Kwei-Herng, Cao, Yuanpu, Huang, Songyi, Wei, Ruzhe, Guo, Junyu, Hu, Xia
RLCard is an open-source toolkit for reinforcement learning research in card games. It supports various card environments with easy-to-use interfaces, including Blackjack, Leduc Hold'em, Texas Hold'em, UNO, Dou Dizhu and Mahjong. The goal of RLCard is to bridge reinforcement learning and imperfect-information games, and to push forward research on reinforcement learning in domains with multiple agents, large state and action spaces, and sparse rewards. In this paper, we provide an overview of the key components in RLCard, a discussion of the design principles, a brief introduction of the interfaces, and comprehensive evaluations of the environments.
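A minimal usage sketch based on the toolkit's documented interface (exact argument names, e.g. of the RandomAgent constructor, have changed across RLCard versions):

```python
import rlcard
from rlcard.agents import RandomAgent

# Create a card-game environment and attach a baseline agent to every seat.
env = rlcard.make('blackjack')
env.set_agents([RandomAgent(num_actions=env.num_actions) for _ in range(env.num_players)])

# Roll out one complete game: trajectories hold each player's transitions,
# payoffs hold the final reward of each player.
trajectories, payoffs = env.run(is_training=False)
print(payoffs)
```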
Experience Replay Optimization
Zha, Daochen, Lai, Kwei-Herng, Zhou, Kaixiong, Hu, Xia
Experience replay enables reinforcement learning agents to memorize and reuse past experiences, just as humans replay memories relevant to the situation at hand. Contemporary off-policy algorithms either replay past experiences uniformly or rely on a rule-based replay strategy, which may be sub-optimal. In this work, we consider learning a replay policy to optimize the cumulative reward. Learning a replay policy is challenging because the replay memory is noisy and large, and the cumulative reward is unstable. To address these issues, we propose a novel experience replay optimization (ERO) framework which alternately updates two policies: the agent policy and the replay policy. The agent is updated to maximize the cumulative reward based on the replayed data, while the replay policy is updated to provide the agent with the most useful experiences. Experiments on various continuous control tasks demonstrate the effectiveness of ERO, empirically showing the promise of learned experience replay for improving the performance of off-policy reinforcement learning algorithms.
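A simplified sketch of a replay policy that scores stored transitions and is nudged by the change in the agent's cumulative reward; the ReplayPolicy class and its feature-based scoring are assumptions for illustration, not the paper's exact parameterization:

```python
import numpy as np

class ReplayPolicy:
    """Scores each transition's feature vector with a replay probability and
    updates those scores with a REINFORCE-style rule."""
    def __init__(self, feature_dim, lr=1e-3):
        self.w = np.zeros(feature_dim)
        self.lr = lr

    def score(self, features):                       # features: [N, feature_dim]
        return 1.0 / (1.0 + np.exp(-features @ self.w))

    def sample(self, features, batch_size):
        probs = self.score(features)
        probs = probs / probs.sum()
        return np.random.choice(len(features), size=batch_size, p=probs)

    def update(self, features, replayed_idx, reward_improvement):
        # Reinforce the selection of the replayed transitions in proportion to how
        # much the agent's cumulative reward improved after training on them.
        probs = self.score(features[replayed_idx])
        grad = ((1.0 - probs)[:, None] * features[replayed_idx]).mean(axis=0)
        self.w += self.lr * reward_improvement * grad
```

In the alternating loop, the agent trains on the sampled batch as usual, and the measured change in its cumulative reward becomes the signal passed to update.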
Superhighway: Bypass Data Sparsity in Cross-Domain CF
Lai, Kwei-Herng, Wang, Ting-Hsiang, Chi, Heng-Yu, Chen, Yian, Tsai, Ming-Feng, Wang, Chuan-Ju
Cross-domain collaborative filtering (CF) aims to alleviate data sparsity in single-domain CF by leveraging knowledge transferred from related domains. Many traditional methods address the sparsity problem by directly enriching the neighborhood relations that CF compares. In this paper, we propose superhighway construction, an alternative explicit relation-enrichment procedure, which improves recommendations by enhancing cross-domain connectivity. Specifically, assuming partially overlapping items (users), superhighway construction bypasses multi-hop inter-domain paths between cross-domain users (respectively, items) with direct paths, enriching the cross-domain connectivity. Experiments conducted on a real-world cross-region music dataset and a cross-platform movie dataset show that the proposed superhighway construction significantly improves recommendation performance in both the target and source domains.
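A coarse sketch of the relation-enrichment idea on a joint user-item graph, assuming a networkx graph and hypothetical source_users / target_users node sets (an approximation for illustration, not the paper's implementation):

```python
import networkx as nx

def build_superhighways(graph, source_users, target_users, max_hops=3):
    """Add a direct 'superhighway' edge between cross-domain user pairs that are
    already connected by a short multi-hop path through overlapping items."""
    highways = []
    for u in source_users:
        # Nodes reachable from u within max_hops hops in the joint user-item graph.
        reachable = nx.single_source_shortest_path_length(graph, u, cutoff=max_hops)
        for v in target_users:
            if v in reachable and not graph.has_edge(u, v):
                highways.append((u, v))
    graph.add_edges_from(highways)
    return graph
```

The enriched graph, with its added direct cross-domain edges, can then be handed to an ordinary recommender in place of the original sparse graph.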