
Collaborating Authors

Guo, Jian


ElegantRL-Podracer: Scalable and Elastic Library for Cloud-Native Deep Reinforcement Learning

arXiv.org Artificial Intelligence

Deep reinforcement learning (DRL) has revolutionized learning and actuation in applications such as game playing and robotic control. The cost of data collection, i.e., generating transitions from agent-environment interactions, remains a major challenge for wider DRL adoption in complex real-world problems. Following a cloud-native paradigm to train DRL agents on a GPU cloud platform is a promising solution. In this paper, we present ElegantRL-podracer, a scalable and elastic library for cloud-native deep reinforcement learning that efficiently supports millions of GPU cores to carry out massively parallel training at multiple levels. At a high level, ElegantRL-podracer employs a tournament-based ensemble scheme to orchestrate the training process on hundreds or even thousands of GPUs, scheduling the interactions between a leaderboard and a training pool with hundreds of pods. At a low level, each pod simulates agent-environment interactions in parallel, fully utilizing the nearly 7,000 CUDA cores of a single GPU. Our ElegantRL-podracer library features high scalability, elasticity, and accessibility by following the development principles of containerization, microservices, and MLOps. Using an NVIDIA DGX SuperPOD cloud, we conduct extensive experiments on various tasks in locomotion and stock trading and show that ElegantRL-podracer substantially outperforms RLlib. Our code is available on GitHub.
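The tournament-based scheme described in the abstract can be pictured as a loop between a leaderboard of agent checkpoints and a pool of training pods. Below is a minimal, illustrative Python sketch of that orchestration pattern; all names (Leaderboard, pod_id, train_and_evaluate) are hypothetical and do not reflect ElegantRL-podracer's actual API, and the per-pod GPU training step is reduced to a stub.

```python
import random
from dataclasses import dataclass, field

@dataclass(order=True)
class Entry:
    score: float
    agent_id: int = field(compare=False)

class Leaderboard:
    """Orchestrator state: the top-k agent checkpoints seen so far."""
    def __init__(self, capacity=10):
        self.capacity = capacity
        self.entries = []

    def submit(self, agent_id, score):
        # Record an evaluated agent, keeping only the best `capacity` entries.
        self.entries.append(Entry(score, agent_id))
        self.entries.sort(reverse=True)
        del self.entries[self.capacity:]

    def sample_parent(self):
        # A pod restarts from a randomly drawn top agent, so strong
        # weights propagate through the training pool over generations.
        return random.choice(self.entries).agent_id

def train_and_evaluate(parent_id):
    # Stand-in for one round of massively parallel rollouts and policy
    # updates inside a pod; returns the refined agent's evaluation score.
    return random.random()

leaderboard = Leaderboard()
leaderboard.submit(agent_id=0, score=0.0)   # seed with an initial agent
for pod_id in range(1, 20):                 # 20 pods, simulated serially here
    parent = leaderboard.sample_parent()
    leaderboard.submit(pod_id, train_and_evaluate(parent))
print([e.agent_id for e in leaderboard.entries])
```

In the real system the loop body would run concurrently across pods, with the leaderboard serving as the single point of coordination.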


A model-based framework for learning transparent swarm behaviors

arXiv.org Artificial Intelligence

This paper proposes a model-based framework to automatically and efficiently design understandable and verifiable behaviors for swarms of robots. The framework is based on the automatic extraction of two distinct models: 1) a neural network model trained to estimate the relationship between the robots' sensor readings and the global performance of the swarm, and 2) a probabilistic state transition model that explicitly models the local state transitions (i.e., transitions in observations from the perspective of a single robot in the swarm) given a policy. Both models can be trained from a data set of simulated runs featuring random policies. The first model is used to automatically extract a set of local states that are expected to maximize the global performance; these are referred to as desired local states. The second model is used to optimize a stochastic policy so as to increase the probability that the robots in the swarm observe one of the desired local states. Following these steps, the proposed framework efficiently yields effective controllers, as demonstrated on four case studies featuring aggregation and foraging tasks. Importantly, thanks to the models, the framework allows us to understand and inspect a swarm's behavior; to this end, we propose verification checks to identify potential issues that may prevent the swarm from achieving the desired global objective. In addition, we explore how the framework can be combined with a "standard" evolutionary robotics strategy (i.e., where performance is measured via simulation) or with online learning.
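The two-model pipeline can be illustrated with a toy, discretized version: rank local states by the average logged performance, then score candidate policies by how much stationary probability their induced transition matrix places on the desired states. This is a hedged sketch on synthetic data, not the paper's implementation; the neural performance model is replaced by a per-state average and the learned transition model by a random stochastic matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STATES = 8                      # discretized local states

# Step 1: rank local states by the mean global performance logged while a
# robot observed that state (stand-in for the neural performance model).
log_states = rng.integers(0, N_STATES, size=5000)
log_perf = rng.random(5000)
avg_perf = np.array([log_perf[log_states == s].mean() for s in range(N_STATES)])
desired = set(np.argsort(avg_perf)[-2:].tolist())   # "desired local states"

def stationary_mass(P, desired):
    # Probability mass the policy's long-run behavior places on the
    # desired local states (leading eigenvector of the transition matrix).
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi = np.abs(pi) / np.abs(pi).sum()
    return sum(pi[s] for s in desired)

def policy_transition_matrix(rng, n):
    # Stand-in: in the framework, this matrix would be predicted by the
    # learned probabilistic transition model for a given stochastic policy.
    P = rng.random((n, n))
    return P / P.sum(axis=1, keepdims=True)

# Step 2: pick the candidate policy whose induced dynamics concentrate
# the most probability on the desired local states.
candidates = [policy_transition_matrix(rng, N_STATES) for _ in range(100)]
best = max(candidates, key=lambda P: stationary_mass(P, desired))
print('best mass on desired states:', stationary_mass(best, desired))
```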


Novel Policy Seeking with Constrained Optimization

arXiv.org Artificial Intelligence

In this work, we address the problem of learning to seek novel policies in reinforcement learning tasks. Instead of following the multi-objective framework used in previous methods, we propose to rethink the problem from the perspective of constrained optimization. We first introduce a new metric to evaluate the difference between policies and then design two practical novel-policy-seeking methods under this perspective, namely the Constrained Task Novel Bisector (CTNB) and the Interior Policy Differentiation (IPD), corresponding respectively to the feasible direction method and the interior point method from constrained optimization. Experimental comparisons on the MuJoCo control suite show that our methods achieve substantial improvements over previous novelty-seeking methods in terms of both novelty and primal task performance.
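To make the two constrained-optimization views concrete, the sketch below contrasts an interior-point-style log barrier (the idea behind IPD) with a feasible-direction-style gradient projection (the idea behind CTNB). Both functions are simplified illustrations with hypothetical signatures, not the paper's algorithms; `novelty` stands for whatever policy-difference metric is in use.

```python
import math
import numpy as np

def barrier_objective(task_reward, novelty, threshold, t=10.0):
    # Interior-point view (IPD-style): a log barrier keeps the policy
    # strictly inside the feasible region novelty > threshold; as t grows,
    # the objective approaches the task reward subject to the constraint.
    margin = novelty - threshold
    if margin <= 0:
        return float('-inf')       # outside the interior: reject
    return task_reward + math.log(margin) / t

def feasible_direction(task_grad, novelty_grad):
    # Feasible-direction view (CTNB-style): if the task gradient would
    # decrease novelty, remove its component against the novelty gradient
    # so the update never leaves the feasible region.
    if task_grad @ novelty_grad >= 0:
        return task_grad
    n = novelty_grad / np.linalg.norm(novelty_grad)
    return task_grad - (task_grad @ n) * n

g = feasible_direction(np.array([1.0, 0.0]), np.array([-1.0, 1.0]))
print(g)   # [0.5 0.5]: the novelty-reducing component has been removed
```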


GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

arXiv.org Machine Learning

We present GluonCV and GluonNLP, deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating). These toolkits provide state-of-the-art pre-trained models, training scripts, and training logs to facilitate rapid prototyping and promote reproducible research. We also provide modular APIs with flexible building blocks to enable efficient customization. Leveraging the MXNet ecosystem, the deep learning models in GluonCV and GluonNLP can be deployed onto a variety of platforms with different programming languages. Released as open source under the Apache 2.0 license, GluonCV and GluonNLP have attracted 100 contributors worldwide on GitHub, and their models have been downloaded more than 1.6 million times in fewer than 10 months.
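A typical model-zoo workflow looks like the following minimal sketch. The specific model name, transform helper, and placeholder image path are assumptions that may vary across GluonCV versions.

```python
import mxnet as mx
from gluoncv import model_zoo
from gluoncv.data.transforms.presets.imagenet import transform_eval

# Fetch a pre-trained ImageNet classifier from the GluonCV model zoo.
net = model_zoo.get_model('resnet50_v1b', pretrained=True)

# Load and preprocess an image ('example.jpg' is a placeholder path).
img = mx.image.imread('example.jpg')
batch = transform_eval(img)        # resize, center-crop, normalize, batchify

# Run inference and report the most likely class.
probs = mx.nd.softmax(net(batch))[0]
top = int(probs.argmax().asscalar())
print('predicted class id: %d (p=%.3f)' % (top, probs[top].asscalar()))
```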