AITopics | Wu, Xiaofei

Collaborating Authors

Wu, Xiaofei

FastGrasp: Efficient Grasp Synthesis with Diffusion

Wu, Xiaofei, Liu, Tao, Li, Caoji, Ma, Yuexin, Shi, Yujiao, He, Xuming

arXiv.org Artificial Intelligence, Nov-22-2024

Effectively modeling the interaction between human hands and objects is challenging due to the complex physical constraints and the requirement for high generation efficiency in applications. Prior work often employs computationally intensive two-stage approaches, which first generate an intermediate representation, such as contact maps, and then run an iterative optimization procedure that updates hand meshes to capture the hand-object relation. However, due to the high computational complexity of the optimization stage, such strategies often suffer from low inference efficiency. To address this limitation, this work introduces a novel diffusion-model-based approach that generates the grasping pose in a one-stage manner. This allows us to significantly improve generation speed and the diversity of generated hand poses. In particular, we develop a Latent Diffusion Model with an Adaptation Module for object-conditioned hand pose generation and a contact-aware loss to enforce the physical constraints between hands and objects. Extensive experiments demonstrate that our method achieves faster inference, higher diversity, and superior pose quality than state-of-the-art approaches. Code is available at https://github.com/wuxiaofei01/FastGrasp.

artificial intelligence, diffusion model, machine learning, (17 more...)

2411.14786

Country: Europe (0.46)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

RealDex: Towards Human-like Grasping for Robotic Dexterous Hand

Liu, Yumeng, Yang, Yaxun, Wang, Youzhuo, Wu, Xiaofei, Wang, Jiamin, Yao, Yichen, Schwertfeger, Sören, Yang, Sibei, Wang, Wenping, Yu, Jingyi, He, Xuming, Ma, Yuexin

arXiv.org Artificial Intelligence, Feb-21-2024

In this paper, we introduce RealDex, a pioneering dataset capturing authentic dexterous hand grasping motions infused with human behavioral patterns, enriched by multi-view and multimodal visual data. Utilizing a teleoperation system, we seamlessly synchronize human-robot hand poses in real time. This collection of human-like motions is crucial for training dexterous hands to mimic human movements more naturally and precisely. RealDex holds immense promise in advancing humanoid robots for automated perception, cognition, and manipulation in real-world scenarios. Moreover, we introduce a cutting-edge dexterous grasping motion generation framework, which aligns with human experience and enhances real-world applicability by effectively utilizing Multimodal Large Language Models. Extensive experiments have demonstrated the superior performance of our method on RealDex and other open datasets. The complete dataset and code will be made available upon the publication of this work.

large language model, machine learning, natural language, (20 more...)

2402.13853

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.90)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

YAYI 2: Multilingual Open-Source Large Language Models

Luo, Yin, Kong, Qingchao, Xu, Nan, Cao, Jia, Hao, Bao, Qu, Baoyu, Chen, Bo, Zhu, Chao, Zhao, Chenyang, Zhang, Donglei, Feng, Fan, Zhao, Feifei, Sun, Hailong, Yang, Hanxuan, Pan, Haojun, Liu, Hongyu, Guo, Jianbin, Du, Jiangtao, Wang, Jingyi, Li, Junfeng, Sun, Lei, Liu, Liduo, Dong, Lifeng, Liu, Lili, Wang, Lin, Zhang, Liwen, Wang, Minzheng, Wang, Pin, Yu, Ping, Li, Qingxiao, Yan, Rui, Zou, Rui, Li, Ruiqun, Huang, Taiwen, Wang, Xiaodong, Wu, Xiaofei, Peng, Xin, Zhang, Xina, Fang, Xing, Xiao, Xinglin, Hao, Yanni, Dong, Yao, Wang, Yigang, Liu, Ying, Jiang, Yongyu, Wang, Yungan, Wang, Yuqi, Wang, Zhangsheng, Yu, Zhaoxin, Luo, Zhen, Mao, Wenji, Wang, Lei, Zeng, Dajun

arXiv.org Artificial Intelligence, Dec-22-2023

As the latest advancement in natural language processing, large language models (LLMs) have achieved human-level language understanding and generation abilities in many real-world tasks, and have even been regarded as a potential path to artificial general intelligence. To better facilitate research on LLMs, many open-source LLMs, such as Llama 2 and Falcon, have recently been proposed and have achieved performance comparable to proprietary models. However, these models are primarily designed for English scenarios and exhibit poor performance in Chinese contexts. In this technical report, we propose YAYI 2, including both base and chat models, with 30 billion parameters. YAYI 2 is pre-trained from scratch on a multilingual corpus which contains 2.65 trillion tokens filtered by our pre-training data processing pipeline. The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback. Extensive experiments on multiple benchmarks, such as MMLU and CMMLU, consistently demonstrate that the proposed YAYI 2 outperforms other similarly sized open-source models.

large language model, machine learning, natural language, (20 more...)

2312.14862

Country:

Asia > China (0.14)
North America > United States (0.14)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.93)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A unified consensus-based parallel ADMM algorithm for high-dimensional regression with combined regularizations

Wu, Xiaofei, Zhang, Zhimin, Cui, Zhenyu

arXiv.org Machine Learning, Nov-20-2023

The parallel alternating direction method of multipliers (ADMM) algorithm is widely recognized for its effectiveness in handling large-scale datasets stored in a distributed manner, making it a popular choice for solving statistical learning models. However, there is currently limited research on parallel algorithms specifically designed for high-dimensional regression with combined (composite) regularization terms. These terms, such as elastic-net, sparse group lasso, sparse fused lasso, and their nonconvex variants, have gained significant attention in various fields due to their ability to incorporate prior information and promote sparsity within specific groups or fused variables. The scarcity of parallel algorithms for combined regularizations can be attributed to the inherent nonsmoothness and complexity of these terms, as well as the absence of closed-form solutions for certain proximal operators associated with them. In this paper, we propose a unified constrained optimization formulation based on the consensus problem for these types of convex and nonconvex regression problems and derive the corresponding parallel ADMM algorithms. Furthermore, we prove that the proposed algorithm not only converges globally but also exhibits a linear convergence rate. Extensive simulation experiments, along with a financial example, serve to demonstrate the reliability, stability, and scalability of our algorithm. The R package for implementing the proposed algorithms can be obtained at https://github.com/xfwu1016/CPADMM.
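
The abstract points to proximal operators that lack closed-form solutions as one obstacle. For contrast, the elastic-net penalty is a combined regularizer whose proximal operator does have a closed form: soft-thresholding followed by a shrinkage. The sketch below illustrates that standard formula only. It is not taken from the paper or the CPADMM package, and the names `soft_threshold`, `prox_elastic_net`, `lam1`, `lam2`, and `step` are hypothetical.

```python
def soft_threshold(v, tau):
    """Proximal operator of tau * |x| for a scalar v (the lasso part)."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def prox_elastic_net(v, lam1, lam2, step=1.0):
    """Elementwise prox of step * (lam1 * |x| + 0.5 * lam2 * x**2).

    Minimizing 0.5 * (x - v)**2 + step * lam1 * |x| + 0.5 * step * lam2 * x**2
    gives soft-thresholding by step * lam1, then division by 1 + step * lam2.
    """
    scale = 1.0 + step * lam2
    return [soft_threshold(vi, step * lam1) / scale for vi in v]


if __name__ == "__main__":
    # Illustrative input only; the values are not from the paper.
    print(prox_elastic_net([3.0, -0.5, -2.0], lam1=1.0, lam2=1.0))
```

Penalties such as the sparse fused lasso do not generally reduce to an elementwise formula like this one, which is the difficulty the abstract refers to.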

artificial intelligence, machine learning, survey article, (21 more...)

2311.12319

Country:

Asia > China (0.46)
North America > United States (0.45)

Genre: Research Report > New Finding (0.67)

Industry:

Banking & Finance > Trading (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons

Yang, Sicheng, Wang, Zilin, Wu, Zhiyong, Li, Minglei, Zhang, Zhensong, Huang, Qiaochu, Hao, Lei, Xu, Songcen, Wu, Xiaofei, Yang, Changpeng, Dai, Zonghong

arXiv.org Artificial Intelligence, Sep-13-2023

Automatic co-speech gesture generation has drawn much attention in computer animation. Previous work designed network structures on individual datasets, which resulted in a lack of data volume and generalizability across different motion capture standards. In addition, it is a challenging task due to the weak correlation between speech and gestures. To address these problems, we present UnifiedGesture, a novel diffusion model-based speech-driven gesture synthesis approach, trained on multiple gesture datasets with different skeletons. Specifically, we first present a retargeting network to learn latent homeomorphic graphs for different motion capture standards, unifying the representations of various gestures while extending the dataset. We then capture the correlation between speech and gestures based on a diffusion model architecture using cross-local attention and self-attention to generate better speech-matched and realistic gestures. To further align speech and gesture and increase diversity, we incorporate reinforcement learning on the discrete gesture units with a learned reward function. Extensive experiments show that UnifiedGesture outperforms recent approaches on speech-driven gesture generation in terms of CCA, FGD, and human-likeness. All code, pre-trained models, databases, and demos are available to the public at https://github.com/YoungSeng/UnifiedGesture.

animation, artificial intelligence, machine learning, (3 more...)

doi: 10.1145/3581783.3612503

2309.07051

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.73)
Information Technology > Graphics > Animation (0.53)

The DiffuseStyleGesture+ entry to the GENEA Challenge 2023

Yang, Sicheng, Xue, Haiwei, Zhang, Zhensong, Li, Minglei, Wu, Zhiyong, Wu, Xiaofei, Xu, Songcen, Dai, Zonghong

arXiv.org Artificial Intelligence, Aug-26-2023

In this paper, we introduce DiffuseStyleGesture+, our solution for the Generation and Evaluation of Non-verbal Behavior for Embodied Agents (GENEA) Challenge 2023, which aims to foster the development of realistic, automated systems for generating conversational gestures. Participants are provided with a pre-processed dataset and their systems are evaluated through crowdsourced scoring. Our proposed model, DiffuseStyleGesture+, leverages a diffusion model to generate gestures automatically. It incorporates a variety of modalities, including audio, text, speaker ID, and seed gestures. These diverse modalities are mapped to a hidden space and processed by a modified diffusion model to produce the corresponding gesture for a given speech input. Upon evaluation, DiffuseStyleGesture+ demonstrated performance on par with the top-tier models in the challenge, showing no significant differences from those models in human-likeness and appropriateness for the interlocutor, and achieving competitive performance with the best model on appropriateness for agent speech. This indicates that our model is competitive and effective in generating realistic and appropriate gestures for given speech. The code, pre-trained models, and demos are available at https://github.com/YoungSeng/DiffuseStyleGesture/tree/DiffuseStyleGesturePlus/BEAT-TWH-main.

artificial intelligence, machine learning, natural language, (16 more...)

doi: 10.1145/3577190.3616114

2308.13879

Country:

Europe (0.48)
Asia > China (0.30)
North America > United States > New York (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)
