Chen, Jiaxin
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model
Yang, Kai, Tao, Jian, Lyu, Jiafei, Ge, Chunjiang, Chen, Jiaxin, Li, Qimai, Shen, Weihan, Zhu, Xiaolong, Li, Xiu
Reinforcement learning with human feedback (RLHF) has shown significant promise in fine-tuning diffusion models. Previous methods start by training a reward model that aligns with human preferences, then leverage RL techniques to fine-tune the underlying models. However, crafting an efficient reward model demands extensive datasets, a suitable architecture, and manual hyperparameter tuning, making the process both time- and cost-intensive. The direct preference optimization (DPO) method, effective in fine-tuning large language models, eliminates the need for a reward model. However, the extensive GPU memory requirement of the diffusion model's denoising process hinders the direct application of the DPO method. To address this issue, we introduce the Direct Preference for Denoising Diffusion Policy Optimization (D3PO) method to directly fine-tune diffusion models. Our theoretical analysis demonstrates that although D3PO omits training a reward model, it effectively functions as the optimal reward model trained on human feedback data to guide the learning process. Because no reward model needs to be trained, the approach is more direct, more cost-effective, and incurs minimal computational overhead. In experiments, our method uses the relative scale of objectives as a proxy for human preference, delivering results comparable to methods that use ground-truth rewards. Moreover, D3PO reduces image distortion rates and generates safer images, overcoming challenges for which robust reward models are unavailable. Our code is publicly available at https://github.com/yk7333/D3PO/tree/main.
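The core mechanism described above is a DPO-style preference objective applied to individual denoising steps rather than to an entire generation held in memory. The sketch below is a minimal, hypothetical illustration of such a per-step loss in PyTorch; the function name, signature, and hyperparameters are assumptions for illustration and are not taken from the released D3PO code.

```python
# Minimal sketch (not the released D3PO code) of a DPO-style preference loss
# applied at the level of a single denoising step.
import torch.nn.functional as F

def per_step_preference_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """DPO-style loss for one denoising step.

    logp_w / logp_l: log-probabilities under the fine-tuned model of the
    denoising transition taken in the human-preferred (w) and dispreferred (l)
    trajectories; ref_* are the same quantities under the frozen reference
    model. All tensors have shape (batch,).
    """
    # Implicit reward margin: how much the fine-tuned model favors the
    # preferred transition over the dispreferred one, relative to the reference.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # Standard DPO objective: maximize the log-sigmoid of that margin.
    return -F.logsigmoid(margin).mean()
```

Evaluating the loss one denoising step at a time is what keeps GPU memory bounded, since only a single step's activations need to be retained for backpropagation.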
Neural MMO 2.0: A Massively Multi-task Addition to Massively Multi-agent Learning
Suárez, Joseph, Isola, Phillip, Choe, Kyoung Whan, Bloomin, David, Li, Hao Xiang, Pinnaparaju, Nikhil, Kanna, Nishaanth, Scott, Daniel, Sullivan, Ryan, Shuman, Rose S., de Alcântara, Lucas, Bradley, Herbie, Castricato, Louis, You, Kirsty, Jiang, Yuhao, Li, Qimai, Chen, Jiaxin, Zhu, Xiaolong
Neural MMO 2.0 is a massively multi-agent environment for reinforcement learning research. The key feature of this new version is a flexible task system that allows users to define a broad range of objectives and reward signals. We challenge researchers to train agents capable of generalizing to tasks, maps, and opponents never seen during training. Neural MMO features procedurally generated maps with 128 agents in the standard setting and support for up to 1024. Version 2.0 is a complete rewrite of its predecessor with three-fold improved performance and compatibility with CleanRL. We release the platform as free and open-source software with comprehensive documentation available at neuralmmo.github.io and an active community Discord. To spark initial research on this new platform, we are concurrently running a competition at NeurIPS 2023.
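To make the idea of a flexible task system concrete, here is a small, library-agnostic sketch of how an objective can be expressed as a progress function that doubles as a reward signal. This is not the Neural MMO 2.0 API; every name in it is hypothetical.

```python
# Hypothetical task-as-reward-signal pattern; NOT the Neural MMO 2.0 API.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Task:
    # Maps (game_state, agent_id) to progress in [0, 1].
    progress: Callable[[Any, int], float]
    reward_scale: float = 1.0

    def reward(self, game_state: Any, agent_id: int) -> float:
        # Dense reward proportional to progress; a sparse variant could pay
        # out only when progress first reaches 1.0.
        return self.reward_scale * self.progress(game_state, agent_id)

def gather_at_least(resource: str, n: int) -> Task:
    """Example objective: hold at least n units of a resource."""
    def progress(game_state: Any, agent_id: int) -> float:
        held = game_state.inventory[agent_id].get(resource, 0)
        return min(held / n, 1.0)
    return Task(progress=progress)
```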
The NeurIPS 2022 Neural MMO Challenge: A Massively Multiagent Competition with Specialization and Trade
Liu, Enhong, Suarez, Joseph, You, Chenhui, Wu, Bo, Chen, Bingcheng, Hu, Jun, Chen, Jiaxin, Zhu, Xiaolong, Zhu, Clare, Togelius, Julian, Mohanty, Sharada, Hong, Weijun, Du, Rui, Zhang, Yibing, Wang, Qinwen, Li, Xinhang, Yuan, Zheng, Li, Xiang, Huang, Yuejia, Zhang, Kun, Yang, Hanhui, Tang, Shiqi, Isola, Phillip
In this paper, we present the results of the NeurIPS 2022 Neural MMO Challenge, which attracted 500 participants and received over 1,600 submissions. Like the previous IJCAI 2022 Neural MMO Challenge, it involved agents from 16 populations surviving in procedurally generated worlds by collecting resources and defeating opponents. This year's competition ran on the latest v1.6 Neural MMO, which introduces new equipment, combat, and trading systems as well as a better scoring system. These elements combine to pose additional robustness and generalization challenges not present in previous competitions. This paper summarizes the design and results of the challenge, explores the potential of this environment as a benchmark for learning methods, and presents some practical reinforcement learning training approaches for complex tasks with sparse rewards. Additionally, we have open-sourced our baselines, including environment wrappers, benchmarks, and visualization tools for future research.
Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO
Chen, Yangkun, Suarez, Joseph, Zhang, Junjie, Yu, Chenghui, Wu, Bo, Chen, Hanmo, Zhu, Hengman, Du, Rui, Qian, Shanliang, Liu, Shuai, Hong, Weijun, He, Jinke, Zhang, Yibing, Zhao, Liang, Zhu, Clare, Togelius, Julian, Mohanty, Sharada, Chen, Jiaxin, Li, Xiu, Zhu, Xiaolong, Isola, Phillip
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received over 1,600 submissions. This competition targets robustness and generalization in multi-agent systems: participants train teams of agents to complete a multi-task objective against opponents not seen during training. The competition combines relatively complex environment design with large numbers of agents. The top submissions demonstrate strong success on this task using mostly standard reinforcement learning (RL) methods combined with domain-specific engineering. We summarize the competition design and results and suggest that, as an academic community, competitions may be a powerful approach to solving hard problems and establishing solid benchmarks for algorithms. We will open-source our benchmark, including the environment wrapper, baselines, a visualization tool, and selected policies, for further research.
Recon: Reducing Conflicting Gradients from the Root for Multi-Task Learning
Shi, Guangyuan, Li, Qimai, Zhang, Wenlong, Chen, Jiaxin, Wu, Xiao-Ming
A fundamental challenge for multi-task learning is that different tasks may conflict with each other when they are solved jointly, and a cause of this phenomenon is conflicting gradients during optimization. Recent works attempt to mitigate the influence of conflicting gradients by directly altering the gradients based on some criteria. However, our empirical study shows that "gradient surgery" cannot effectively reduce the occurrence of conflicting gradients. In this paper, we take a different approach and reduce conflicting gradients from the root. In essence, we investigate the task gradients w.r.t. each shared network layer, select the layers with high conflict scores, and turn them into task-specific layers. Our experiments show that such a simple approach can greatly reduce the occurrence of conflicting gradients in the remaining shared layers and achieve better performance, with only a slight increase in model parameters in many cases. Our approach can be easily applied to improve various state-of-the-art methods, including gradient manipulation methods and branched architecture search methods. Given a network architecture (e.g., ResNet18), it only needs to search for the conflicting layers once, and the modified network can then be used with different methods on the same or even different datasets to gain performance improvements. Multi-task learning (MTL) is a learning paradigm in which multiple different but correlated tasks are jointly trained with a shared model (Caruana, 1997), in the hope of achieving better performance with an overall smaller model size than learning each task independently. By discovering shared structures across tasks and leveraging domain-specific training signals of related tasks, MTL can achieve both efficiency and effectiveness.
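The layer-selection step described above hinges on measuring how strongly task gradients conflict at each shared layer. The following is a minimal sketch, assuming two tasks and cosine similarity as the conflict measure; the exact conflict score, thresholds, and selection procedure in the paper may differ.

```python
# Minimal sketch of per-layer gradient-conflict measurement between two tasks;
# the paper's exact conflict score and selection criteria may differ.
import torch

def layer_conflict_scores(model, loss_a, loss_b):
    """Return {parameter_name: cosine similarity of the two task gradients}."""
    named = [(n, p) for n, p in model.named_parameters() if p.requires_grad]
    params = [p for _, p in named]
    grads_a = torch.autograd.grad(loss_a, params, retain_graph=True, allow_unused=True)
    grads_b = torch.autograd.grad(loss_b, params, retain_graph=True, allow_unused=True)
    scores = {}
    for (name, _), ga, gb in zip(named, grads_a, grads_b):
        if ga is None or gb is None:
            continue
        # Negative similarity means the two tasks pull this layer in opposing
        # directions, i.e. a conflicting gradient.
        scores[name] = torch.nn.functional.cosine_similarity(
            ga.flatten(), gb.flatten(), dim=0
        ).item()
    return scores
```

Layers whose scores are negative on a large fraction of batches are the natural candidates to be turned into task-specific layers.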
Emergent collective intelligence from massive-agent cooperation and competition
Chen, Hanmo, Tao, Stone, Chen, Jiaxin, Shen, Weihan, Li, Xihui, Yu, Chenghui, Cheng, Sikai, Zhu, Xiaolong, Li, Xiu
Inspired by organisms evolving through cooperation and competition between different populations on Earth, we study the emergence of artificial collective intelligence through massive-agent reinforcement learning. To this end, we propose a new massive-agent reinforcement learning environment, Lux, where dynamic and massive numbers of agents in two teams scramble for limited resources and fight off the darkness. In Lux, we build our agents with standard reinforcement learning algorithms in curriculum learning phases and leverage centralized control via a pixel-to-pixel policy network. As agents co-evolve through self-play, we observe several stages of intelligence, from the acquisition of atomic skills to the development of group strategies. Since these learned group strategies arise from individual decisions without an explicit coordination mechanism, we claim that artificial collective intelligence emerges from massive-agent cooperation and competition. We further analyze the emergence of various learned strategies through metrics and ablation studies, aiming to provide insights for reinforcement learning implementations in massive-agent environments.
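The pixel-to-pixel centralized controller mentioned above can be pictured as a fully convolutional network that consumes the whole map and emits one action distribution per cell. The sketch below is a rough, illustrative architecture under that assumption; channel counts, depth, and heads are not taken from the paper.

```python
# Rough illustration of a pixel-to-pixel policy for centralized control;
# channel sizes and layer choices are illustrative, not the paper's network.
import torch
import torch.nn as nn

class PixelToPixelPolicy(nn.Module):
    def __init__(self, obs_channels: int, num_actions: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(obs_channels, hidden, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1), nn.ReLU(),
        )
        # One set of action logits per map cell; a unit standing on a cell
        # samples its action from that cell's distribution.
        self.action_head = nn.Conv2d(hidden, num_actions, kernel_size=1)
        self.value_head = nn.Conv2d(hidden, 1, kernel_size=1)

    def forward(self, obs: torch.Tensor):
        # obs: (batch, obs_channels, height, width)
        features = self.encoder(obs)
        logits = self.action_head(features)                             # (batch, num_actions, H, W)
        value = self.value_head(features).mean(dim=(2, 3)).squeeze(1)   # (batch,)
        return logits, value
```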
Multi-Agent Path Finding via Tree LSTM
Jiang, Yuhao, Zhang, Kunjie, Li, Qimai, Chen, Jiaxin, Zhu, Xiaolong
In recent years, Multi-Agent Path Finding (MAPF) has attracted attention from the fields of both Operations Research (OR) and Reinforcement Learning (RL). However, in the 2021 Flatland3 Challenge, a competition on MAPF, the best RL method scored only 27.9, far below the best OR method. This paper proposes a new RL solution to the Flatland3 Challenge, which scores 125.3, several times higher than the previous best RL solution. We apply a novel network architecture, TreeLSTM, to MAPF in our solution. Together with several other RL techniques, including reward shaping, multiple-phase training, and centralized control, our solution is comparable to the top 2-3 OR methods.
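For readers unfamiliar with the architecture named above, the following is a minimal Child-Sum TreeLSTM cell in the style of Tai et al. (2015). How the Flatland rail graph is converted into trees and wired into the policy is specific to the authors' solution and is not reproduced here.

```python
# Minimal Child-Sum TreeLSTM cell (Tai et al., 2015); the tree construction
# from the Flatland rail network is specific to the solution and omitted.
import torch
import torch.nn as nn

class ChildSumTreeLSTMCell(nn.Module):
    def __init__(self, in_dim: int, mem_dim: int):
        super().__init__()
        # Joint projection for input, output, and update gates.
        self.iou = nn.Linear(in_dim + mem_dim, 3 * mem_dim)
        # Per-child forget gate, conditioned on the input and that child's state.
        self.f_x = nn.Linear(in_dim, mem_dim)
        self.f_h = nn.Linear(mem_dim, mem_dim)

    def forward(self, x, child_h, child_c):
        # x: (in_dim,); child_h, child_c: (num_children, mem_dim).
        # Leaf nodes pass zero-row tensors of shape (0, mem_dim).
        h_sum = child_h.sum(dim=0)
        i, o, u = self.iou(torch.cat([x, h_sum])).chunk(3)
        i, o, u = torch.sigmoid(i), torch.sigmoid(o), torch.tanh(u)
        f = torch.sigmoid(self.f_x(x).unsqueeze(0) + self.f_h(child_h))
        c = i * u + (f * child_c).sum(dim=0)
        h = o * torch.tanh(c)
        return h, c
```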