AITopics | Xu, Botian

Collaborating Authors

Xu, Botian

ArtFormer: Controllable Generation of Diverse 3D Articulated Objects

Su, Jiayi, Feng, Youhe, Li, Zheng, Song, Jinhua, He, Yangfan, Ren, Botao, Xu, Botian

arXiv.org Artificial Intelligence, Dec-10-2024

This paper presents a novel framework for modeling and conditional generation of 3D articulated objects. Troubled by flexibility-quality tradeoffs, existing methods are often limited to using predefined structures or retrieving shapes from static datasets. To address these challenges, we parameterize an articulated object as a tree of tokens and employ a transformer to generate both the object's high-level geometry code and its kinematic relations. Subsequently, each sub-part's geometry is further decoded using a signed-distance-function (SDF) shape prior, facilitating the synthesis of high-quality 3D shapes. Our approach enables the generation of diverse objects with high-quality geometry and a varying number of parts. Comprehensive experiments on conditional generation from text descriptions demonstrate the effectiveness and flexibility of our method.
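The "tree of tokens" idea in the abstract can be pictured with a small data-structure sketch. Everything here is an assumption made for illustration: the `PartToken` class, its field names, and the depth-first `flatten` order are hypothetical and are not taken from the ArtFormer paper or its code.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PartToken:
    """One node of a part tree (hypothetical layout, not the paper's)."""
    name: str
    geometry_code: List[float]        # latent code a shape decoder would consume
    joint_type: Optional[str] = None  # e.g. "revolute"; None for the root part
    children: List["PartToken"] = field(default_factory=list)


def flatten(root: PartToken) -> List[Tuple[str, int, Optional[str]]]:
    """Serialize the tree depth-first as (name, parent_index, joint_type)."""
    sequence: List[Tuple[str, int, Optional[str]]] = []

    def visit(node: PartToken, parent_index: int) -> None:
        index = len(sequence)  # position of this node in the output sequence
        sequence.append((node.name, parent_index, node.joint_type))
        for child in node.children:
            visit(child, index)

    visit(root, -1)  # the root has no parent, marked with -1
    return sequence


# Illustrative object: a cabinet body with a door and a drawer with a handle.
cabinet = PartToken("body", [0.1, 0.2], children=[
    PartToken("door", [0.3, 0.4], joint_type="revolute"),
    PartToken("drawer", [0.5, 0.6], joint_type="prismatic", children=[
        PartToken("handle", [0.7, 0.8], joint_type="fixed"),
    ]),
])
```

Calling `flatten(cabinet)` yields one entry per part, each pointing back to its parent's position, which is one simple way a tree could be fed to a sequence model.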

large language model, machine learning, natural language, (19 more...)

arXiv:2412.07237

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Multi-UAV Behavior-based Formation with Static and Dynamic Obstacles Avoidance via Reinforcement Learning

Xie, Yuqing, Yu, Chao, Zang, Hongzhi, Gao, Feng, Tang, Wenhao, Huang, Jingyi, Chen, Jiayu, Xu, Botian, Wu, Yi, Wang, Yu

arXiv.org Artificial Intelligence, Oct-24-2024

Formation control of multiple Unmanned Aerial Vehicles (UAVs) is vital for practical applications. This paper tackles the task of behavior-based UAV formation while avoiding static and dynamic obstacles during directed flight. We present a two-stage reinforcement learning (RL) training pipeline to tackle the challenges of multi-objective optimization, large exploration spaces, and the sim-to-real gap. The first stage searches in a simplified scenario for a linear utility function that balances all task objectives simultaneously, whereas the second stage applies the utility function in complex scenarios, utilizing curriculum learning to navigate large exploration spaces. Additionally, we apply an attention-based observation encoder to enhance formation maintenance and manage varying obstacle quantities. Experiments in simulation and the real world demonstrate that our method outperforms planning-based and RL-based baselines regarding collision-free rate and formation maintenance in scenarios with static, dynamic, and mixed obstacles.
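A linear utility function of the kind mentioned above is a weighted sum of per-objective rewards. The sketch below shows only that combination step; the objective names and weight values are made up for illustration, and the search over weights described in the abstract is not reproduced here.

```python
from typing import Dict


def linear_utility(rewards: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine per-objective rewards into one scalar: sum_i w_i * r_i."""
    missing = set(rewards) - set(weights)
    if missing:
        raise KeyError(f"no weight given for objectives: {sorted(missing)}")
    return sum(weights[name] * value for name, value in rewards.items())


# Hypothetical objectives and weights, chosen only to show the arithmetic.
step_rewards = {"formation": 0.8, "collision": -1.0, "progress": 0.5}
candidate_weights = {"formation": 1.0, "collision": 2.0, "progress": 0.5}
utility = linear_utility(step_rewards, candidate_weights)
```

With these numbers the collision penalty dominates (0.8 - 2.0 + 0.25), so the combined utility is negative; changing the weights changes which objective the policy is pushed to prioritize.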

machine learning, obstacle, reinforcement learning, (18 more...)

arXiv:2410.18495

Country: Asia > China (0.29)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

On the Evaluation of Generative Robotic Simulations

Chen, Feng, Xu, Botian, Hua, Pu, Duan, Peiqi, Yang, Yanchao, Ma, Yi, Xu, Huazhe

arXiv.org Artificial Intelligence, Oct-10-2024

Due to the difficulty of acquiring extensive real-world data, robot simulation has become crucial for parallel training and sim-to-real transfer, highlighting the importance of scalable simulated robotic tasks. Foundation models have demonstrated impressive capacities in autonomously generating feasible robotic tasks. However, this new paradigm underscores the challenge of adequately evaluating these autonomously generated tasks. To address this, we propose a comprehensive evaluation framework tailored to generative simulations. For single-task quality, we evaluate the realism of the generated task and the completeness of the generated trajectories using large language models and vision-language models. In terms of diversity, we measure both task and data diversity through text similarity of task descriptions and world model loss trained on collected task trajectories. For task-level generalization, we assess the zero-shot generalization ability on unseen tasks of a policy trained with multiple generated tasks. Experiments conducted on three representative task generation pipelines demonstrate that the results from our framework are highly consistent with human evaluations, confirming the feasibility and validity of our approach. The findings reveal that while metrics of quality and diversity can be achieved through certain methods, no single approach excels across all metrics, suggesting a need for greater focus on balancing these different metrics. Additionally, our analysis further highlights the common challenge of low generalization capability faced by current works.
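To make "task diversity through text similarity of task descriptions" concrete, here is a rough sketch that scores a set of descriptions as one minus their mean pairwise similarity. The similarity used is plain word-set Jaccard overlap, a deliberately simple stand-in chosen for this example; the abstract does not say which text similarity the authors use, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two descriptions, in [0, 1]."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty descriptions are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def task_diversity(descriptions: Sequence[str]) -> float:
    """1 - mean pairwise similarity; 0.0 when fewer than two descriptions."""
    pairs = list(combinations(descriptions, 2))
    if not pairs:
        return 0.0
    mean_similarity = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_similarity
```

Identical descriptions give a diversity of 0.0 and descriptions with no shared words give 1.0; an embedding-based similarity could be swapped in for `jaccard` without changing `task_diversity`.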

large language model, machine learning, natural language, (20 more...)

arXiv:2410.08172

Country:

Asia > China (0.14)
Asia > Middle East (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Multi-UAV Pursuit-Evasion with Online Planning in Unknown Environments by Deep Reinforcement Learning

Chen, Jiayu, Yu, Chao, Li, Guosheng, Tang, Wenhao, Yang, Xinyi, Xu, Botian, Yang, Huazhong, Wang, Yu

arXiv.org Artificial Intelligence, Sep-25-2024

Multi-UAV pursuit-evasion, where pursuers aim to capture evaders, poses a key challenge for UAV swarm intelligence. Multi-agent reinforcement learning (MARL) has demonstrated potential in modeling cooperative behaviors, but most RL-based approaches remain constrained to simplified simulations with limited dynamics or fixed scenarios. Previous attempts to deploy RL policy to real-world pursuit-evasion are largely restricted to two-dimensional scenarios, such as ground vehicles or UAVs at fixed altitudes. In this paper, we address multi-UAV pursuit-evasion by considering UAV dynamics and physical constraints. We introduce an evader prediction-enhanced network to tackle partial observability in cooperative strategy learning. Additionally, we propose an adaptive environment generator within MARL training, enabling higher exploration efficiency and better policy generalization across diverse scenarios. Simulations show our method significantly outperforms all baselines in challenging scenarios, generalizing to unseen scenarios with a 100% capture rate. Finally, we derive a feasible policy via a two-stage reward refinement and deploy the policy on real quadrotors in a zero-shot manner. To our knowledge, this is the first work to derive and deploy an RL-based policy using collective thrust and body rates control commands for multi-UAV pursuit-evasion in unknown environments. The open-source code and videos are available at https://sites.google.com/view/pursuit-evasion-rl.

evader, machine learning, reinforcement learning, (15 more...)

arXiv:2409.15866

Country:

Asia > China (0.14)
South America > Brazil (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Improving Detection in Aerial Images by Capturing Inter-Object Relationships

Ren, Botao, Xu, Botian, Pu, Yifan, Wang, Jingyi, Deng, Zhidong

arXiv.org Artificial Intelligence, Apr-5-2024

In many image domains, the spatial distribution of objects in a scene exhibits meaningful patterns governed by their semantic relationships. In most modern detection pipelines, however, the detection proposals are processed independently, overlooking the underlying relationships between objects. In this work, we introduce a transformer-based approach to capture these inter-object relationships to refine classification and regression outcomes for detected objects. Building on two-stage detectors, we tokenize the region of interest (RoI) proposals to be processed by a transformer encoder. Specific spatial and geometric relations are incorporated into the attention weights and adaptively modulated and regularized. Experimental results demonstrate that the proposed method achieves consistent performance improvement on three benchmarks including DOTA-v1.0, DOTA-v1.5, and HRSC 2016, especially ranking first on both DOTA-v1.5 and HRSC 2016. Specifically, our new method has an increase of 1.59 mAP on DOTA-v1.0, 4.88 mAP on DOTA-v1.5, and 2.1 mAP on HRSC 2016, respectively, compared to the baselines.

artificial intelligence, detection, machine learning, (16 more...)

arXiv:2404.0414

Country:

Asia > China (0.14)
Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

TaskFlex Solver for Multi-Agent Pursuit via Automatic Curriculum Learning

Chen, Jiayu, Li, Guosheng, Yu, Chao, Yang, Xinyi, Xu, Botian, Yang, Huazhong, Wang, Yu

arXiv.org Artificial Intelligence, Dec-19-2023

This paper addresses the problem of multi-agent pursuit, where slow pursuers cooperate to capture fast evaders in a confined environment with obstacles. Existing heuristic algorithms often lack expressive coordination strategies and are highly sensitive to task conditions, requiring extensive hyperparameter tuning. In contrast, reinforcement learning (RL) has been applied to this problem and is capable of obtaining cooperative pursuit strategies. However, RL-based methods face challenges in training for complex scenarios due to the vast amount of training data required and their limited adaptability to varying task conditions, such as different scene sizes, varying numbers and speeds of obstacles, and flexible speed ratios of the evader to the pursuer. In this work, we combine RL and curriculum learning to introduce a flexible solver for multi-agent pursuit problems, named TaskFlex Solver (TFS), which is capable of solving multi-agent pursuit problems with diverse and dynamically changing task conditions in both 2-dimensional and 3-dimensional scenarios. TFS utilizes a curriculum learning method that constructs task distributions based on training progress, enhancing training efficiency and final performance. Our algorithm consists of two main components: the Task Evaluator, which evaluates task success rates and selects tasks of moderate difficulty to maintain a curriculum archive, and the Task Sampler, which constructs training distributions by sampling tasks from the curriculum archive to maximize policy improvement. Experiments show that TFS produces much stronger performance than baselines and achieves close to 100% capture rates in both 2-dimensional and 3-dimensional multi-agent pursuit problems with diverse and dynamically changing scenes. The project website is at https://sites.google.com/view/tfs-2023.
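The evaluator/sampler split described above can be sketched as a small archive that keeps tasks whose measured success rate is neither too low nor too high, and then draws training tasks from it. This is a simplified illustration, not the TFS implementation: the class name, the 0.3 to 0.7 thresholds, the task fields, and the uniform sampling rule are all assumptions.

```python
import random
from typing import Dict, List

Task = Dict[str, float]  # hypothetical task description, e.g. speed ratio


class CurriculumArchive:
    """Keeps tasks of moderate difficulty and samples training tasks from them."""

    def __init__(self, low: float = 0.3, high: float = 0.7) -> None:
        self.low, self.high = low, high  # assumed "moderate difficulty" band
        self.tasks: List[Task] = []

    def evaluate(self, task: Task, success_rate: float) -> bool:
        """Evaluator step: add or evict the task; return True if it is kept."""
        keep = self.low <= success_rate <= self.high
        if keep and task not in self.tasks:
            self.tasks.append(task)
        elif not keep and task in self.tasks:
            self.tasks.remove(task)  # task became too easy or too hard
        return keep

    def sample(self, batch_size: int, rng: random.Random) -> List[Task]:
        """Sampler step: uniform draw with replacement (a simplification)."""
        if not self.tasks:
            return []
        return [rng.choice(self.tasks) for _ in range(batch_size)]


archive = CurriculumArchive()
easy = {"speed_ratio": 0.5, "num_obstacles": 0.0}
moderate = {"speed_ratio": 1.2, "num_obstacles": 3.0}
archive.evaluate(easy, success_rate=0.95)      # rejected: already solved
archive.evaluate(moderate, success_rate=0.5)   # kept: moderate difficulty
batch = archive.sample(4, random.Random(0))
```

As the policy improves and a task's success rate rises above the band, re-evaluating it evicts it from the archive, so the training distribution shifts toward harder tasks over time.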

artificial intelligence, machine learning, pursuer, (15 more...)

arXiv:2312.12255

Country:

Asia > China (0.14)
South America > Brazil (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Feedback RoI Features Improve Aerial Object Detection

Ren, Botao, Xu, Botian, Liu, Tengyu, Wang, Jingyi, Deng, Zhidong

arXiv.org Artificial Intelligence, Nov-28-2023

Neuroscience studies have shown that the human visual system utilizes high-level feedback information to guide lower-level perception, enabling adaptation to signals of different characteristics. In light of this, we propose Feedback multi-Level feature Extractor (Flex) to incorporate a similar mechanism for object detection. Flex refines feature selection based on image-wise and instance-level feedback information in response to image quality variation and classification uncertainty. Experimental results show that Flex offers consistent improvement to a range of existing SOTA methods on the challenging aerial object detection datasets including DOTA-v1.0, DOTA-v1.5, and HRSC2016. Although the design originates in aerial image detection, further experiments on MS COCO also reveal our module's efficacy in general detection models. Quantitative and qualitative analyses indicate that the improvements are closely related to image qualities, which match our motivation.

artificial intelligence, detection, machine learning, (16 more...)

arXiv:2311.17129

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control

Xu, Botian, Gao, Feng, Yu, Chao, Zhang, Ruize, Wu, Yi, Wang, Yu

arXiv.org Artificial Intelligence, Sep-22-2023

In this work, we introduce OmniDrones, an efficient and flexible platform tailored for reinforcement learning in drone control, built on Nvidia's Omniverse Isaac Sim. It employs a bottom-up design approach that allows users to easily design and experiment with various application scenarios on top of GPU-parallelized simulations. It also offers a range of benchmark tasks, presenting challenges ranging from single-drone hovering to over-actuated system tracking. In summary, we propose an open-sourced drone simulation platform, equipped with an extensive suite of tools for drone learning. It includes 4 drone models, 5 sensor modalities, 4 control modes, over 10 benchmark tasks, and a selection of widely used RL baselines. To showcase the capabilities of OmniDrones and to support future research, we also provide preliminary results on these benchmark tasks. We hope this platform will encourage further studies on applying RL to practical drone systems.

artificial intelligence, efficient and flexible platform, reinforcement learning, (2 more...)

arXiv:2309.12825

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.60)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.53)

Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased

Yu, Chao, Gao, Jiaxuan, Liu, Weilin, Xu, Botian, Tang, Hao, Yang, Jiaqi, Wang, Yu, Wu, Yi

arXiv.org Artificial Intelligence, Feb-3-2023

There is a recent trend of applying multi-agent reinforcement learning (MARL) to train an agent that can cooperate with humans in a zero-shot fashion without using any human data. The typical workflow is to first repeatedly run self-play (SP) to build a policy pool and then train the final adaptive policy against this pool. A crucial limitation of this framework is that every policy in the pool is optimized w.r.t. the environment reward function, which implicitly assumes that the testing partners of the adaptive policy will be precisely optimizing the same reward function as well. However, human objectives are often substantially biased according to their own preferences, which can differ greatly from the environment reward. We propose a more general framework, Hidden-Utility Self-Play (HSP), which explicitly models human biases as hidden reward functions in the self-play objective. By approximating the reward space as linear functions, HSP adopts an effective technique to generate an augmented policy pool with biased policies. We evaluate HSP on the Overcooked benchmark. Empirical results show that our HSP method produces higher rewards than baselines when cooperating with learned human models, manually scripted policies, and real humans. The HSP policy is also rated as the most assistive policy based on human feedback.
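Modeling a bias as a hidden linear reward can be illustrated as a weight vector applied to counts of in-game events. The sketch below only shows that representation and a random way to draw weight vectors; the event list, the weight choices, and both function names are hypothetical, and the way HSP actually builds its policy pool is not reproduced.

```python
import random
from typing import List, Sequence


def sample_hidden_weights(
    num_events: int,
    rng: random.Random,
    choices: Sequence[float] = (-1.0, 0.0, 1.0),  # assumed weight values
) -> List[float]:
    """Draw one hidden preference vector, one weight per event type."""
    return [rng.choice(choices) for _ in range(num_events)]


def hidden_reward(event_counts: Sequence[float], weights: Sequence[float]) -> float:
    """Linear hidden reward: dot product of weights and event counts."""
    if len(event_counts) != len(weights):
        raise ValueError("event_counts and weights must have the same length")
    return sum(w * c for w, c in zip(weights, event_counts))


# Hypothetical events in a cooking game: [onions picked, dishes served, idle steps]
episode_events = [2.0, 0.0, 1.0]
likes_onions = [1.0, -1.0, 0.5]
reward = hidden_reward(episode_events, likes_onions)  # 2.0 + 0.0 + 0.5
```

Each sampled weight vector defines a differently biased partner objective; training one self-play policy per vector would give a pool whose members do not all optimize the environment reward.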

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv:2302.01605

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.87)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)
