Chen, Jiayu
Threshold Neuron: A Brain-inspired Artificial Neuron for Efficient On-device Inference
Zheng, Zihao, Li, Yuanchun, Chen, Jiayu, Zhou, Peng, Chen, Xiang, Liu, Yunxin
Enhancing the computational efficiency of on-device Deep Neural Networks (DNNs) remains a significant challenge in mobile and edge computing. As increasingly complex tasks must be executed under constrained computational resources, much of the existing research has focused on compressing neural network structures and parameters or on optimizing the underlying systems, while little attention has been paid to optimizing the fundamental building blocks of neural networks: the neurons. In this study, we deliberate on a simple but important research question: Can we design artificial neurons that offer greater efficiency than the traditional neuron paradigm? Inspired by the threshold mechanisms and the excitation-inhibition balance observed in biological neurons, we propose a novel artificial neuron model, Threshold Neurons. Using Threshold Neurons, we can construct neural networks similar to those built from traditional artificial neurons while significantly reducing hardware implementation complexity. Our extensive experiments validate the effectiveness of neural networks utilizing Threshold Neurons, achieving substantial power savings of 7.51x to 8.19x and area savings of 3.89x to 4.33x at the kernel level, with minimal loss in precision. Furthermore, FPGA-based implementations of these networks demonstrate 2.52x power savings and 1.75x speed enhancements at the system level. The source code will be made available upon publication.
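The abstract does not spell out the Threshold Neuron's exact firing rule, so the numpy sketch below is only one plausible illustration of the threshold-plus-excitation/inhibition idea it describes; the index sets, threshold value, and binary output are assumptions, not the paper's design.

```python
import numpy as np

def threshold_neuron(x, exc_idx, inh_idx, theta):
    """Addition-and-comparison-only firing rule: accumulate excitatory
    inputs, subtract inhibitory inputs, and emit a binary output when the
    balance crosses a fixed threshold (no multiply-accumulate units)."""
    drive = x[exc_idx].sum() - x[inh_idx].sum()
    return 1.0 if drive >= theta else 0.0

rng = np.random.default_rng(0)
x = rng.random(8)                          # eight input activations
print(threshold_neuron(x, exc_idx=[0, 2, 5], inh_idx=[1, 7], theta=0.4))
```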
What Matters in Learning A Zero-Shot Sim-to-Real RL Policy for Quadrotor Control? A Comprehensive Study
Chen, Jiayu, Yu, Chao, Xie, Yuqing, Gao, Feng, Chen, Yinuo, Yu, Shu'ang, Tang, Wenhao, Ji, Shilong, Mu, Mo, Wu, Yi, Yang, Huazhong, Wang, Yu
Executing precise and agile flight maneuvers is critical for quadrotors in various applications. Traditional quadrotor control approaches are limited by their reliance on flat trajectories or time-consuming optimization, which restricts their flexibility. Recently, RL-based policies have emerged as a promising alternative due to their ability to directly map observations to actions, reducing the need for detailed system knowledge and actuation constraints. However, a significant challenge remains in bridging the sim-to-real gap, where RL-based policies often become unstable when deployed in the real world. In this paper, we investigate the key factors for learning robust RL-based control policies that are capable of zero-shot deployment on real-world quadrotors. We identify five critical factors and develop a PPO-based training framework named SimpleFlight that integrates these five techniques. We validate the efficacy of SimpleFlight on a Crazyflie quadrotor, demonstrating that it achieves more than a 50% reduction in trajectory tracking error compared to state-of-the-art RL baselines. The policy derived by SimpleFlight consistently excels across both smooth polynomial trajectories and challenging, infeasible zigzag trajectories on quadrotors with small thrust-to-weight ratios, whereas baseline methods struggle with high-speed or infeasible trajectories. To support further research and reproducibility, we integrate SimpleFlight into the GPU-based simulator Omnidrones and provide open-source access to the code and model checkpoints. We hope SimpleFlight will offer valuable insights for advancing RL-based quadrotor control. For more details, visit our project website at https://sites.google.com/view/simpleflight/.
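SimpleFlight is described as a PPO-based framework; the five specific techniques are not detailed in the abstract, so the sketch below only shows the standard clipped PPO policy loss that such a trainer builds on. The tensor shapes and clipping coefficient are illustrative assumptions.

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped PPO surrogate objective (policy term only).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and behavior policies; advantages: estimated advantages."""
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()

# Toy check with random tensors standing in for a rollout batch.
lp_new, lp_old = torch.randn(64), torch.randn(64)
adv = torch.randn(64)
print(ppo_clip_loss(lp_new, lp_old, adv))
```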
Multi-UAV Behavior-based Formation with Static and Dynamic Obstacles Avoidance via Reinforcement Learning
Xie, Yuqing, Yu, Chao, Zang, Hongzhi, Gao, Feng, Tang, Wenhao, Huang, Jingyi, Chen, Jiayu, Xu, Botian, Wu, Yi, Wang, Yu
Formation control of multiple Unmanned Aerial Vehicles (UAVs) is vital for practical applications. This paper tackles behavior-based UAV formation while avoiding static and dynamic obstacles during directed flight. We present a two-stage reinforcement learning (RL) training pipeline to address the challenges of multi-objective optimization, large exploration spaces, and the sim-to-real gap. The first stage searches, in a simplified scenario, for a linear utility function that balances all task objectives simultaneously, whereas the second stage applies this utility function in complex scenarios and uses curriculum learning to navigate the large exploration space. Additionally, we apply an attention-based observation encoder to enhance formation maintenance and handle varying numbers of obstacles. Experiments in simulation and the real world demonstrate that our method outperforms planning-based and RL-based baselines in terms of collision-free rate and formation maintenance in scenarios with static, dynamic, and mixed obstacles.
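The first training stage searches for a linear utility function over the task objectives; the toy sketch below illustrates that weighted-sum form. The objective terms and weight values are hypothetical, not the paper's.

```python
import numpy as np

def linear_utility(objectives, weights):
    """Weighted-sum utility over per-step task objectives (e.g., formation
    error, collision penalty, progress toward the goal). The concrete terms
    and the search procedure for the weights are specific to the paper."""
    return float(np.dot(weights, objectives))

# Example: three objectives combined with weights found in stage one.
step_objectives = np.array([-0.3, -1.0, 0.8])   # [formation, collision, progress]
stage_one_weights = np.array([0.5, 2.0, 1.0])
print(linear_utility(step_objectives, stage_one_weights))
```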
Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning
Chen, Jiayu, Chen, Wentse, Schneider, Jeff
Offline reinforcement learning (RL) is a powerful approach for data-driven decision-making and control. Compared to model-free methods, offline model-based reinforcement learning (MBRL) explicitly learns world models from a static dataset and uses them as surrogate simulators, improving data efficiency and enabling the learned policy to potentially generalize beyond the dataset support. However, many different MDPs can behave identically on the offline dataset, so handling the uncertainty about the true MDP is challenging. In this paper, we propose modeling offline MBRL as a Bayes Adaptive Markov Decision Process (BAMDP), a principled framework for addressing model uncertainty. We further introduce a novel Bayes Adaptive Monte Carlo planning algorithm capable of solving BAMDPs in continuous state and action spaces with stochastic transitions. This planning process is based on Monte Carlo Tree Search and can be integrated into offline MBRL as a policy improvement operator in policy iteration. Our "RL + Search" framework follows in the footsteps of superhuman AIs like AlphaZero, improving on current offline MBRL methods by incorporating additional computation. The proposed algorithm significantly outperforms state-of-the-art model-based and model-free offline RL methods on twelve D4RL MuJoCo benchmark tasks and three target tracking tasks in a challenging, stochastic tokamak control simulator.
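The planner is built on Monte Carlo Tree Search. As a reference point, the sketch below shows the generic UCT selection score that MCTS uses to trade off value estimates against visit counts; the paper's Bayes Adaptive variant for continuous spaces and posterior-sampled world models goes well beyond this, and the numbers here are illustrative.

```python
import math

def uct_score(q_value, visit_count, parent_visits, c_uct=1.0):
    """Standard UCT score: mean return plus an exploration bonus that
    shrinks as a child is visited more often relative to its parent."""
    if visit_count == 0:
        return float("inf")
    return q_value + c_uct * math.sqrt(math.log(parent_visits) / visit_count)

# Choosing among three child actions of a node visited 50 times.
children = [(0.7, 20), (0.5, 5), (0.0, 0)]        # (mean return, visits)
best = max(range(3), key=lambda i: uct_score(*children[i], parent_visits=50))
print(best)
```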
Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning
Chen, Jiayu, Yu, Chao, Li, Guosheng, Tang, Wenhao, Yang, Xinyi, Xu, Botian, Yang, Huazhong, Wang, Yu
Multi-UAV pursuit-evasion, where pursuers aim to capture evaders, poses a key challenge for UAV swarm intelligence. Multi-agent reinforcement learning (MARL) has demonstrated potential in modeling cooperative behaviors, but most RL-based approaches remain constrained to simplified simulations with limited dynamics or fixed scenarios. Previous attempts to deploy RL policies for real-world pursuit-evasion have largely been restricted to two-dimensional scenarios, such as ground vehicles or UAVs at fixed altitudes. In this paper, we address multi-UAV pursuit-evasion while accounting for UAV dynamics and physical constraints. We introduce an evader prediction-enhanced network to tackle partial observability in cooperative strategy learning. Additionally, we propose an adaptive environment generator within MARL training, enabling higher exploration efficiency and better policy generalization across diverse scenarios. Simulations show that our method significantly outperforms all baselines in challenging scenarios and generalizes to unseen scenarios with a 100% capture rate. Finally, we derive a feasible policy via a two-stage reward refinement and deploy it on real quadrotors in a zero-shot manner. To our knowledge, this is the first work to derive and deploy an RL-based policy that uses collective thrust and body rate control commands for multi-UAV pursuit-evasion in unknown environments. The open-source code and videos are available at https://sites.google.com/view/pursuit-evasion-rl.
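The adaptive environment generator is not specified in detail in the abstract; the sketch below illustrates one simple reading of the idea, sampling training scenarios in proportion to how poorly the current policy performs on them. The scenario names, capture rates, and softmax weighting are illustrative assumptions.

```python
import numpy as np

def sample_scenario(capture_rates, configs, temperature=1.0):
    """Sample a training scenario, biased toward configurations on which
    the current policy still has a low capture rate (harder scenarios)."""
    difficulty = 1.0 - np.asarray(capture_rates, dtype=float)
    probs = np.exp(difficulty / temperature)
    probs /= probs.sum()
    idx = np.random.choice(len(configs), p=probs)
    return configs[idx]

configs = ["open_space", "sparse_obstacles", "dense_obstacles"]
print(sample_scenario([0.9, 0.6, 0.2], configs))
```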
Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions
Chen, Jiayu, Ganguly, Bhargav, Xu, Yang, Mei, Yongsheng, Lan, Tian, Aggarwal, Vaneet
Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos with models trained on offline data. Similarly, data-driven decision-making and robotic control also require learning a generator function from offline data to serve as the strategy or policy. Applying deep generative models to offline policy learning therefore exhibits great potential, and numerous studies have explored this direction. However, the field still lacks a comprehensive review, so the developments of its different branches have proceeded relatively independently. In this paper, we provide the first systematic review of the applications of deep generative models to offline policy learning. In particular, we cover five mainstream deep generative models, namely Variational Auto-Encoders, Generative Adversarial Networks, Normalizing Flows, Transformers, and Diffusion Models, and their applications in both offline reinforcement learning (offline RL) and imitation learning (IL). Offline RL and IL are the two main branches of offline policy learning and are widely adopted techniques for sequential decision-making. Notably, for each type of DGM-based offline policy learning, we distill its fundamental scheme, categorize related works based on how the DGM is used, and trace the development of algorithms in that field. Following the main content, we provide in-depth discussions of deep generative models and offline policy learning as a summary, based on which we present our perspectives on future research directions. This work offers a hands-on reference for research progress in deep generative models for offline policy learning and aims to inspire improved DGM-based offline RL or IL algorithms. For convenience, we maintain a paper list at https://github.com/LucasCJYSDL/DGMs-for-Offline-Policy-Learning.
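As a concrete instance of the survey's scope, the sketch below shows one common DGM-based offline policy learning pattern: a conditional VAE that models actions given states, as used, for example, to fit the behavior policy in offline RL. The network sizes and loss weighting are illustrative and not drawn from any particular surveyed method.

```python
import torch
import torch.nn as nn

class CVAEPolicy(nn.Module):
    """Minimal conditional VAE over actions given states."""
    def __init__(self, s_dim=8, a_dim=2, z_dim=4, h=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(s_dim + a_dim, h), nn.ReLU(),
                                 nn.Linear(h, 2 * z_dim))        # -> mu, logvar
        self.dec = nn.Sequential(nn.Linear(s_dim + z_dim, h), nn.ReLU(),
                                 nn.Linear(h, a_dim))

    def forward(self, s, a):
        mu, logvar = self.enc(torch.cat([s, a], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        return self.dec(torch.cat([s, z], -1)), mu, logvar

policy = CVAEPolicy()
s, a = torch.randn(16, 8), torch.randn(16, 2)
a_hat, mu, logvar = policy(s, a)
recon = ((a_hat - a) ** 2).mean()
kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
print(recon + 0.1 * kl)   # ELBO-style training loss
```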
Variational Offline Multi-agent Skill Discovery
Chen, Jiayu, Ganguly, Bhargav, Lan, Tian, Aggarwal, Vaneet
Skills are effective temporal abstractions for sequential decision-making tasks: they enable efficient hierarchical learning for long-horizon problems and facilitate multi-task learning through their transferability. Despite extensive research, gaps remain in multi-agent scenarios, particularly in automatically extracting subgroup coordination patterns within a multi-agent task. To address this, we propose two novel auto-encoder schemes, VO-MASD-3D and VO-MASD-Hier, that simultaneously capture subgroup- and temporal-level abstractions to form multi-agent skills, providing the first solution to this challenge. An essential component of these schemes is a dynamic grouping function that automatically detects latent subgroups based on agent interactions in a task. Notably, our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining. Empirical evaluations on StarCraft tasks indicate that our approach significantly outperforms existing methods when the discovered skills are applied in multi-agent reinforcement learning (MARL). Moreover, skills discovered with our method can effectively reduce the learning difficulty in MARL scenarios with delayed and sparse reward signals.
DSAM: A Deep Learning Framework for Analyzing Temporal and Spatial Dynamics in Brain Networks
Thapaliya, Bishal, Miller, Robyn, Chen, Jiayu, Wang, Yu-Ping, Akbas, Esra, Sapkota, Ram, Ray, Bhaskar, Suresh, Pranav, Ghimire, Santosh, Calhoun, Vince, Liu, Jingyu
Resting-state functional magnetic resonance imaging (rs-fMRI) is a noninvasive technique pivotal for understanding the neural mechanisms of intricate cognitive processes. Most rs-fMRI studies compute either a single static functional connectivity matrix across brain regions of interest or dynamic functional connectivity matrices with a sliding-window approach. These approaches risk oversimplifying brain dynamics and do not properly account for the analysis goal at hand. While deep learning has gained substantial popularity for modeling complex relational data, its application to uncovering the spatiotemporal dynamics of the brain is still limited. We propose a novel interpretable deep learning framework that learns a goal-specific functional connectivity matrix directly from time series and employs a specialized graph neural network for the final classification. Our model, DSAM, leverages temporal causal convolutional networks to capture temporal dynamics in both low- and high-level feature representations, a temporal attention unit to identify important time points, a self-attention unit to construct the goal-specific connectivity matrix, and a novel variant of graph neural network to capture the spatial dynamics for downstream classification. To validate our approach, we conducted experiments on the Human Connectome Project dataset with 1075 samples, building and interpreting the model for the classification of sex group, and on the Adolescent Brain Cognitive Development dataset with 8520 samples for independent testing. Compared with other state-of-the-art models, our framework goes beyond the assumption of a fixed connectivity matrix and provides evidence of goal-specific brain connectivity patterns, opening up the potential to gain deeper insights into how the human brain adapts its functional connectivity to the task at hand.
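The sketch below illustrates the general idea of a self-attention unit that turns per-region time-series features into a learned region-by-region connectivity matrix, in the spirit of DSAM's self-attention component. The dimensions and architectural details are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class AttentionConnectivity(nn.Module):
    """Toy self-attention block mapping ROI features to a learned
    (goal-specific) connectivity matrix over brain regions."""
    def __init__(self, feat_dim=32, attn_dim=16):
        super().__init__()
        self.query = nn.Linear(feat_dim, attn_dim)
        self.key = nn.Linear(feat_dim, attn_dim)

    def forward(self, roi_features):                 # (regions, feat_dim)
        q, k = self.query(roi_features), self.key(roi_features)
        scores = q @ k.t() / (k.shape[-1] ** 0.5)    # (regions, regions)
        return torch.softmax(scores, dim=-1)         # learned connectivity

roi_feats = torch.randn(100, 32)                     # e.g., 100 brain regions
print(AttentionConnectivity()(roi_feats).shape)
```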
Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries
Ganesh, Swetha, Chen, Jiayu, Thoppe, Gugan, Aggarwal, Vaneet
Federated Reinforcement Learning (FRL) allows multiple agents to collaboratively build a decision-making policy without sharing raw trajectories. However, if even a small fraction of these agents are adversarial, the result can be catastrophic. We propose a policy-gradient-based approach that is robust to adversarial agents that can send arbitrary values to the server. Under this setting, our results provide the first global convergence guarantees with general parametrization.
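The abstract does not name the server's aggregation rule, so the sketch below shows one standard choice for tolerating agents that send arbitrary values: a coordinate-wise median over the reported policy gradients. This is illustrative only; the paper's robust aggregator may differ.

```python
import numpy as np

def robust_aggregate(gradients):
    """Coordinate-wise median of the policy gradients reported by all
    agents; a bounded minority of arbitrary (adversarial) reports cannot
    drag the aggregate far from the honest values."""
    return np.median(np.stack(gradients, axis=0), axis=0)

# Nine honest agents plus one adversary sending a huge bogus gradient.
honest = [np.random.normal(1.0, 0.1, size=4) for _ in range(9)]
adversarial = [np.full(4, 1e6)]
print(robust_aggregate(honest + adversarial))   # stays near the honest values
```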
Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer
Tamboli, Dipesh, Chen, Jiayu, Jotheeswaran, Kiran Pranesh, Yu, Denny, Aggarwal, Vaneet
Sepsis, a life-threatening condition triggered by the body's exaggerated response to infection, demands urgent intervention to prevent severe complications. Existing machine learning methods for managing sepsis struggle in offline scenarios, exhibiting suboptimal performance with survival rates below 50%. This paper introduces POSNEGDM ("Reinforcement Learning with Positive and Negative Demonstrations for Sequential Decision-Making"), a framework that utilizes an innovative transformer-based model and a feedback reinforcer to replicate expert actions while accounting for individual patient characteristics. A mortality classifier with 96.7% accuracy guides treatment decisions toward positive outcomes. The POSNEGDM framework significantly improves patient survival, saving 97.39% of patients and outperforming established machine learning algorithms (Decision Transformer and Behavioral Cloning), which achieve survival rates of 33.4% and 43.5%, respectively. Additionally, ablation studies underscore the critical role of the transformer-based decision maker and the integration of the mortality classifier in enhancing overall survival rates. In summary, our proposed approach presents a promising avenue for enhancing sepsis treatment outcomes, contributing to improved patient care and reduced healthcare costs.
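The abstract describes a transformer-based decision maker guided by a mortality classifier, but the exact feedback mechanism is not given. The sketch below only illustrates one simple reading of the positive/negative demonstration idea, weighting an imitation loss by the classifier's predicted survival probability so that demonstrations likely to lead to survival are imitated more strongly; the shapes and weighting scheme are assumptions.

```python
import torch

def survival_weighted_imitation_loss(pred_actions, demo_actions, survival_prob):
    """Imitation loss weighted by the mortality classifier's survival
    probability: 'positive' demonstrations (high predicted survival)
    contribute more than 'negative' ones."""
    per_sample = ((pred_actions - demo_actions) ** 2).mean(dim=-1)
    return (survival_prob * per_sample).mean()

pred = torch.randn(32, 5)      # treatments proposed by the decision maker
demo = torch.randn(32, 5)      # clinician treatments from the offline data
surv = torch.rand(32)          # classifier's survival probability per sample
print(survival_weighted_imitation_loss(pred, demo, surv))
```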