Peng, Yijie
Sample-Efficient Clustering and Conquer Procedures for Parallel Large-Scale Ranking and Selection
Zhang, Zishi, Peng, Yijie
We propose novel "clustering and conquer" procedures for the parallel large-scale ranking and selection (R&S) problem, which leverage correlation information for clustering to break the sample-efficiency bottleneck. In parallel computing environments, correlation-based clustering can achieve an $\mathcal{O}(p)$ reduction in sample complexity, the optimal reduction rate theoretically attainable. Our proposed framework is versatile, allowing seamless integration of various prevalent R&S methods under both fixed-budget and fixed-precision paradigms, and it achieves these improvements without requiring highly accurate correlation estimation or precise clustering. In large-scale AI applications such as neural architecture search, a screening-free version of our procedure surprisingly surpasses fully-sequential benchmarks in sample efficiency. This suggests that leveraging valuable structural information, such as correlation, is a viable path to bypassing the traditional need for screening via pairwise comparison, a step previously deemed essential for high sample efficiency but problematic for parallelization. Additionally, we propose a parallel few-shot clustering algorithm tailored for large-scale problems.
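To make the pipeline concrete, the following is a minimal Python sketch of the three-stage idea as we read it: pilot sampling to estimate correlations, clustering, then "conquering" within clusters before comparing cluster winners. The `sampler` interface, the average-linkage clustering step, and the budget splits are all illustrative assumptions, not the paper's actual procedure.

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage
    from scipy.spatial.distance import squareform

    def cluster_and_conquer(sampler, k, n_pilot=20, n_clusters=10, budget=10_000, seed=0):
        """Illustrative clustering-and-conquer loop for selecting the best of k designs.

        `sampler(i, rng)` returns one noisy observation of design i's performance.
        """
        rng = np.random.default_rng(seed)
        # Stage 1: pilot sampling to estimate the correlation structure across designs.
        pilot = np.array([[sampler(i, rng) for _ in range(n_pilot)] for i in range(k)])
        corr = np.corrcoef(pilot)
        # Stage 2: group highly correlated designs; average-linkage clustering stands
        # in here for the paper's parallel few-shot clustering algorithm.
        dist = 1.0 - corr
        np.fill_diagonal(dist, 0.0)
        labels = fcluster(linkage(squareform(dist, checks=False), method="average"),
                          t=n_clusters, criterion="maxclust")

        means, counts = pilot.mean(axis=1), np.full(k, float(n_pilot))

        def draw(i, n):  # sample design i n more times and update its running mean
            xs = [sampler(i, rng) for _ in range(n)]
            means[i] = (means[i] * counts[i] + sum(xs)) / (counts[i] + n)
            counts[i] += n

        # Stage 3: within each cluster (parallelizable across workers), a small
        # equal allocation picks a provisional cluster winner.
        per_design = (budget // 2) // k
        for i in range(k):
            draw(i, per_design)
        winners = [max(np.flatnonzero(labels == c), key=lambda i: means[i])
                   for c in np.unique(labels)]
        # Stage 4: the remaining budget is spent only on the cluster winners.
        left = max(budget - k * (n_pilot + per_design), 0) // len(winners)
        for i in winners:
            draw(i, left)
        return max(winners, key=lambda i: means[i])

A toy instance such as `sampler = lambda i, rng: mu[i] + rng.normal()` exercises the routine end to end; the sample savings come from Stage 4 concentrating the budget on one representative per cluster.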
AlphaRank: An Artificial Intelligence Approach for Ranking and Selection Problems
Zhou, Ruihan, Hong, L. Jeff, Peng, Yijie
We introduce AlphaRank, an artificial intelligence approach to fixed-budget ranking and selection (R&S) problems. We formulate the sequential sampling decision as a Markov decision process and propose a Monte Carlo simulation-based rollout policy that utilizes classic R&S procedures as base policies for efficiently learning the value function of the stochastic dynamic program. We accelerate online sample allocation by using deep reinforcement learning to pre-train a neural network model offline based on a given prior. We also propose a parallelizable computing framework for large-scale problems, effectively combining "divide and conquer" and "recursion" for enhanced scalability and efficiency. Numerical experiments demonstrate that AlphaRank significantly outperforms the base policies, which can be attributed to its superior handling of the trade-off among mean, variance, and induced correlation, a trade-off overlooked by many existing policies.
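A heavily simplified sketch of the Monte Carlo rollout idea follows. Everything here is a schematic assumption of ours: the base policy is naive equal allocation rather than a classic R&S procedure, the posterior model is Gaussian with known variances, and the offline neural-network pre-training that defines AlphaRank proper is omitted entirely.

    import numpy as np

    def rollout_next_design(mean, var, count, remaining, n_rollouts=200, seed=0):
        """One-step Monte Carlo rollout for fixed-budget R&S (illustrative).

        State: posterior N(mean[i], var[i] / count[i]) for each of k designs.
        For each candidate design i, imagine sampling i once more, let a simple
        base policy (equal allocation) spend the remaining budget, and score the
        rollout by whether the posterior leader matches the sampled "true" best.
        """
        rng = np.random.default_rng(seed)
        k = len(mean)
        scores = np.zeros(k)
        for i in range(k):
            for _ in range(n_rollouts):
                truth = rng.normal(mean, np.sqrt(var / count))  # plausible true means
                m, c = mean.astype(float).copy(), count.astype(float).copy()
                alloc = np.full(k, remaining // k)
                alloc[i] += 1                                   # the candidate action
                for j in range(k):                              # base policy finishes
                    if alloc[j] > 0:
                        xs = rng.normal(truth[j], np.sqrt(var[j]), int(alloc[j]))
                        m[j] = (m[j] * c[j] + xs.sum()) / (c[j] + alloc[j])
                        c[j] += alloc[j]
                scores[i] += float(np.argmax(m) == np.argmax(truth))
        return int(np.argmax(scores))

In the paper's framework, the expensive inner Monte Carlo loop is what the offline-trained network learns to approximate, which is what makes online allocation fast.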
One Forward is Enough for Neural Network Training via Likelihood Ratio Method
Jiang, Jinyang, Zhang, Zeliang, Xu, Chenliang, Yu, Zhaofei, Peng, Yijie
While backpropagation (BP) is the mainstream approach for gradient computation in neural network training, its heavy reliance on the chain rule of differentiation constrains the design flexibility of network architectures and training pipelines. We avoid the recursive computation in BP and develop a unified likelihood ratio (ULR) method for gradient estimation with just one forward propagation. Not only can ULR be extended to train a wide variety of neural network architectures, but the computation flow in BP can also be rearranged by ULR for better device adaptation. Moreover, we propose several variance reduction techniques to further accelerate the training process. Our experiments offer numerical results across diverse aspects, including various neural network training scenarios, computation flow rearrangement, and fine-tuning of pre-trained models. All findings demonstrate that ULR effectively enhances the flexibility of neural network training by permitting localized module training without compromising the global objective, and that it significantly boosts network robustness.

Since backpropagation (BP) (Rumelhart et al., 1986) has greatly facilitated the success of artificial intelligence (AI) in various real-world scenarios (Song et al., 2021a;b; Sung et al., 2021), researchers have been motivated to connect this gradient computation method with human learning behavior (Scellier & Bengio, 2017; Lillicrap et al., 2020). However, there is no evidence that the learning mechanism in biological neurons relies on BP (Hinton, 2022). Pursuing alternatives to BP holds promise not only for advancing our understanding of learning mechanisms but also for developing more robust and interpretable AI systems. Moreover, the significant computational cost associated with BP (Gomez et al., 2017; Zhu et al., 2022) also calls for innovations that simplify and expedite the training process without heavy resource consumption. There have been continuous efforts to substitute BP in neural network training.
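The core mechanism can be illustrated with a single linear layer. The sketch below is our own minimal rendering of a likelihood-ratio gradient estimator with forward passes only; the layer shape, `loss_fn` interface, and mean-subtraction baseline are illustrative choices, not the paper's exact estimator.

    import numpy as np

    def ulr_linear_grad(W, b, x, loss_fn, sigma=0.1, n_copies=64, seed=0):
        """Likelihood-ratio gradient for one linear layer, no backward pass (sketch).

        Perturb the pre-activation z = W @ x + b with Gaussian noise sigma * eps.
        The perturbed output has density N(z, sigma^2 I), whose score w.r.t. z is
        eps / sigma, so grad_W E[loss] = E[loss * (eps / sigma) x^T] for the
        noise-perturbed loss: the gradient is estimated from loss values alone.
        """
        rng = np.random.default_rng(seed)
        z = W @ x + b
        eps = rng.standard_normal((n_copies, z.size))
        losses = np.array([loss_fn(z + sigma * e) for e in eps])  # forward only
        losses -= losses.mean()                 # baseline, a variance reduction step
        g = losses[:, None] * eps / sigma       # per-copy score-function weights
        grad_W = np.einsum('no,i->oi', g, x) / n_copies
        grad_b = g.mean(axis=0)
        return grad_W, grad_b

With a toy loss such as `loss_fn = lambda z: float(((z - t) ** 2).sum())`, the estimate approaches the analytic gradient of the noise-smoothed loss as `n_copies` grows. Because each layer's estimator needs only its own noise and the scalar loss, modules can be trained locally and the computation flow rearranged across devices.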
Efficient Learning for Selecting Top-m Context-Dependent Designs
Zhang, Gongbo, Chen, Sihua, Huang, Kuihua, Peng, Yijie
We consider a simulation optimization problem for context-dependent decision-making, which aims to determine the top-m designs for all contexts. Under a Bayesian framework, we formulate the optimal dynamic sampling decision as a stochastic dynamic programming problem and develop a sequential sampling policy to efficiently learn the performance of each design under each context. The asymptotically optimal sampling ratios are derived to attain the optimal large deviations rate for the worst-case probability of false selection. The proposed sampling policy is proved to be consistent, and its sampling ratios are shown to be asymptotically optimal. Numerical experiments demonstrate that the proposed method improves the efficiency of selecting the top-m context-dependent designs.
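The paper's policy targets the large-deviations-optimal sampling ratios; the sketch below is only a simple myopic heuristic in the same spirit, with all interface names ours: it locates the least-separated top-m boundary across contexts and samples the noisier design on that boundary.

    import numpy as np

    def next_sample(mean, var, count, m):
        """Choose the next (context, design) pair to simulate (illustrative heuristic).

        mean, var, count are (n_contexts, n_designs) posterior statistics. False
        selection happens at the boundary between the m-th and (m+1)-th best
        designs of some context, so we find the context whose boundary is least
        separated and sample the noisier design on that boundary.
        """
        worst_z, pick = np.inf, (0, 0)
        for c in range(mean.shape[0]):
            order = np.argsort(-mean[c])                  # designs, best first
            top, rest = order[:m], order[m:]
            i = top[np.argmin(mean[c, top])]              # weakest design inside top-m
            j = rest[np.argmax(mean[c, rest])]            # strongest design outside
            se = np.sqrt(var[c, i] / count[c, i] + var[c, j] / count[c, j])
            z = (mean[c, i] - mean[c, j]) / se            # standardized boundary gap
            if z < worst_z:
                side = i if var[c, i] / count[c, i] > var[c, j] / count[c, j] else j
                worst_z, pick = z, (c, side)
        return pick

Focusing effort on the binding boundary of the worst context mirrors the worst-case false-selection objective, though the asymptotically optimal ratios in the paper balance all contexts more finely than this greedy rule.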
A Novel Noise Injection-based Training Scheme for Better Model Robustness
Zhang, Zeliang, Jiang, Jinyang, Chen, Minjie, Wang, Zhiyuan, Peng, Yijie, Yu, Zhaofei
Noise injection-based methods have been shown to improve the robustness of artificial neural networks in previous work. In this work, we propose a novel noise injection-based training scheme for better model robustness. Specifically, we first develop a likelihood ratio method to estimate the gradient with respect to both synaptic weights and noise levels for stochastic gradient descent training. We then design an approximation of the vanilla noise injection-based training method to reduce memory usage and improve computational efficiency. Next, we apply the proposed scheme to spiking neural networks and evaluate classification accuracy and robustness on the MNIST and Fashion-MNIST datasets. Experimental results show that our method achieves much better adversarial robustness, and slightly better accuracy on clean data, than conventional gradient-based training.
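The weight gradient works as in the ULR sketch above; the distinctive piece here is treating the noise level itself as a trainable parameter. The snippet below is a hedged illustration for a single shared scalar noise level, with the interface and baseline ours: for z_noisy = z + sigma * eps with eps ~ N(0, 1), the score of the Gaussian density with respect to sigma is (eps^2 - 1) / sigma.

    import numpy as np

    def noise_level_grad(z, loss_fn, sigma, n_copies=64, seed=0):
        """Likelihood-ratio gradient of E[loss] w.r.t. the injected noise level (sketch).

        Since d/d(sigma) log N(z + sigma*eps; z, sigma^2) = (eps**2 - 1) / sigma,
        the noise level can be trained by SGD alongside the synaptic weights,
        again from loss values alone, without backpropagation.
        """
        rng = np.random.default_rng(seed)
        eps = rng.standard_normal((n_copies,) + np.shape(z))
        losses = np.array([loss_fn(z + sigma * e) for e in eps])
        losses -= losses.mean()                       # baseline for variance reduction
        scores = ((eps**2 - 1.0) / sigma).reshape(n_copies, -1).sum(axis=1)
        return float(np.mean(losses * scores))

Learning sigma lets the network settle on how much injected noise best trades clean accuracy against robustness, rather than fixing it as a hyperparameter.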
Quantile-Based Deep Reinforcement Learning using Two-Timescale Policy Gradient Algorithms
Jiang, Jinyang, Hu, Jiaqiao, Peng, Yijie
Classical reinforcement learning (RL) aims to optimize the expected cumulative reward. In this work, we consider the RL setting where the goal is to optimize the quantile of the cumulative reward. We parameterize the policy that controls actions with neural networks, and propose a novel policy gradient algorithm called Quantile-Based Policy Optimization (QPO), along with its variant Quantile-Based Proximal Policy Optimization (QPPO), for solving deep RL problems with quantile objectives. QPO uses two coupled iterations running at different timescales to simultaneously update quantiles and policy parameters, whereas QPPO is an off-policy version of QPO that allows multiple parameter updates during one simulation episode, leading to improved algorithm efficiency. Our numerical results indicate that the proposed algorithms outperform existing baseline algorithms under the quantile criterion.
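The two-timescale coupling can be sketched in a few lines. This is a schematic of our own, with the `episode` interface assumed and the paper's exact estimator, baselines, and step-size conditions simplified away; the quantile tracker runs on the faster timescale (larger steps) so it equilibrates around the current policy's quantile.

    import numpy as np

    def qpo_step(theta, q, episode, tau, alpha, beta):
        """One coupled QPO update (schematic; constants and conditions simplified).

        episode(theta) runs one trajectory under the current policy and returns
        (R, grad_logp): the cumulative reward and the gradient of the trajectory
        log-likelihood w.r.t. theta. The quantile tracker q is a Robbins-Monro
        recursion solving F(q) = tau; the policy moves along an estimate of the
        quantile-ascent direction, which is -1{R <= q} * grad_logp up to a
        positive factor involving the reward density.
        """
        R, grad_logp = episode(theta)
        q_new = q + alpha * (tau - float(R <= q))             # faster timescale
        theta_new = theta - beta * float(R <= q) * grad_logp  # slower timescale
        return theta_new, q_new

QPPO's gain over this vanilla loop comes from reusing each simulated episode for several parameter updates via off-policy corrections, which the single-update step above does not attempt.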
An Efficient Dynamic Sampling Policy For Monte Carlo Tree Search
Zhang, Gongbo, Peng, Yijie, Xu, Yilong
Monte Carlo Tree Search (MCTS) is a popular tree-based search strategy within the framework of reinforcement learning (RL), which estimates the optimal value of a state and action by building a tree with Monte Carlo simulation. It has been widely used in sequential decision making, including scheduling, inventory and production management, and real-world games such as Go, Chess, Tic-tac-toe, and Chinese Checkers; see Browne et al. (2012), Fu (2018), and Świechowski et al. (2021) for thorough overviews. MCTS requires little or no domain knowledge and improves itself by running more simulations. Many variants have been proposed to improve its performance; in particular, deep neural networks have been combined with MCTS to achieve remarkable success in the game of Go (Silver et al. 2016, 2017). A basic MCTS builds a game tree from the root node in an incremental and asymmetric manner, where nodes correspond to states and edges correspond to possible state-action pairs.
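For reference, here is the basic loop that such sampling policies refine: the classic four-phase UCT search. The `env` interface (actions, step, is_terminal, rollout) is a hypothetical stand-in for the problem at hand, not an API from the paper.

    import math, random

    class Node:
        def __init__(self, state, parent=None, action=None):
            self.state, self.parent, self.action = state, parent, action
            self.children, self.visits, self.value = [], 0, 0.0

    def uct_search(root_state, env, n_iters=1000, c=1.4):
        """Basic UCT: select, expand, simulate, backpropagate (sketch)."""
        root = Node(root_state)
        for _ in range(n_iters):
            node = root
            # 1. Selection: descend via the UCB1 rule while fully expanded.
            while node.children and len(node.children) == len(env.actions(node.state)):
                node = max(node.children, key=lambda ch: ch.value / ch.visits
                           + c * math.sqrt(math.log(node.visits) / ch.visits))
            # 2. Expansion: add one untried action if the node is not terminal.
            if not env.is_terminal(node.state):
                tried = {ch.action for ch in node.children}
                a = random.choice([a for a in env.actions(node.state) if a not in tried])
                node.children.append(Node(env.step(node.state, a), node, a))
                node = node.children[-1]
            # 3. Simulation: a random playout estimates the new node's value.
            reward = env.rollout(node.state)
            # 4. Backpropagation: update statistics along the path to the root.
            while node is not None:
                node.visits += 1
                node.value += reward
                node = node.parent
        return max(root.children, key=lambda ch: ch.visits).action

The dynamic sampling policy proposed in the paper replaces the UCB1 selection rule in step 1 with an allocation aimed more directly at selecting the best action efficiently.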
Quantile-Based Policy Optimization for Reinforcement Learning
Jiang, Jinyang, Hu, Jiaqiao, Peng, Yijie
Classical reinforcement learning (RL) aims to optimize the expected cumulative reward. In this work, we consider the RL setting where the goal is to optimize the quantile of the cumulative reward. We parameterize the policy that controls actions with neural networks and propose a novel policy gradient algorithm called Quantile-Based Policy Optimization (QPO), along with its variant Quantile-Based Proximal Policy Optimization (QPPO), to solve deep RL problems with quantile objectives. QPO uses two coupled iterations running at different time scales to simultaneously estimate quantiles and update policy parameters, and it is shown to converge to the globally optimal policy under certain conditions. Our numerical results demonstrate that the proposed algorithms outperform existing baseline algorithms under the quantile criterion.
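The convergence analysis rests on the standard quantile sensitivity identity, which relates the quantile gradient to the distribution function of the reward. A schematic statement of the objective and the coupled recursions, with the exact estimator and step-size conditions left to the paper, is:

    % F(.;theta) is the CDF of the cumulative reward R under policy pi_theta,
    % f its density, tau the target quantile level, and xi_k the k-th trajectory.
    \begin{align*}
      q_\tau(\theta) &= \inf\{\, q : F(q;\theta) \ge \tau \,\}, &
      \nabla_\theta\, q_\tau(\theta) &= -\left.\frac{\nabla_\theta F(q;\theta)}{f(q;\theta)}\right|_{q = q_\tau(\theta)}, \\
      q_{k+1} &= q_k + \alpha_k\bigl(\tau - \mathbf{1}\{R_k \le q_k\}\bigr), &
      \theta_{k+1} &= \theta_k - \beta_k\,\mathbf{1}\{R_k \le q_k\}\,\nabla_\theta \log p_{\theta_k}(\xi_k),
    \end{align*}

with the quantile recursion on the faster timescale ($\alpha_k / \beta_k \to \infty$), so the tracker $q_k$ stays near the current policy's quantile while $\theta_k$ drifts slowly along the ascent direction.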
Training Artificial Neural Networks by Generalized Likelihood Ratio Method: Exploring Brain-like Learning to Improve Adversarial Defensiveness
Xiao, Li, Peng, Yijie, Hong, Jeff, Ke, Zewu
Recent work in deep learning has shown that artificial neural networks are vulnerable to adversarial attacks, where a very small perturbation of the inputs can drastically alter the classification result. In this work, we propose a generalized likelihood ratio method capable of training artificial neural networks with some biological brain-like mechanisms, e.g., (a) learning by the loss value, and (b) learning via neurons with discontinuous activation and loss functions. The traditional backpropagation method cannot train artificial neural networks with the aforementioned brain-like learning mechanisms. Numerical results show that various artificial neural networks trained by the new method significantly improve defense against adversarial attacks. Code is available at \url{https://github.com/LX-doctorAI/GLR_ADV}.
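To see why backpropagation fails here while a likelihood-ratio approach does not, consider a unit with a Heaviside activation: its derivative is zero almost everywhere, so the chain rule carries no signal through it. The single-layer schematic below, with the quadratic loss and all shapes our own illustrative choices, estimates the weight gradient from loss values alone.

    import numpy as np

    def glr_grad_threshold(W, x, y, sigma=0.5, n_copies=256, seed=0):
        """GLR-style gradient for a layer with a discontinuous activation (sketch).

        The unit fires via a Heaviside step, which blocks backpropagation.
        Injecting Gaussian noise into the pre-activation and weighting the
        observed loss value by the noise score eps / sigma still yields an
        unbiased gradient estimate for W w.r.t. the noise-perturbed loss.
        """
        rng = np.random.default_rng(seed)
        z = W @ x                                        # pre-activation
        eps = rng.standard_normal((n_copies, z.size))
        out = (z + sigma * eps > 0).astype(float)        # Heaviside activation
        losses = ((out - y) ** 2).sum(axis=1)            # per-copy squared loss
        losses -= losses.mean()                          # control-variate baseline
        g = losses[:, None] * eps / sigma                # score-function weights
        return np.einsum('no,i->oi', g, x) / n_copies    # gradient estimate for W

Because only the scalar loss value enters the estimate ("learning by the loss value"), the same recipe applies to discontinuous loss functions, which is precisely the regime where backpropagation-based training is unavailable.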