
Collaborating Authors: Wang, Yingyao


CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games

arXiv.org Artificial Intelligence

Recent advances in Vision-Language-Action models (VLAs) have expanded the capabilities of embodied intelligence. However, significant challenges remain in real-time decision-making in complex 3D environments, which demands second-level responses, high-resolution perception, and tactical reasoning under dynamic conditions. To advance the field, we introduce CombatVLA, an efficient VLA model optimized for combat tasks in 3D action role-playing games (ARPGs). Specifically, CombatVLA is a 3B model trained on video-action pairs collected by an action tracker, where the data is formatted as action-of-thought (AoT) sequences. Thereafter, CombatVLA integrates seamlessly into an action execution framework, allowing efficient inference through our truncated AoT strategy. Experimental results demonstrate that CombatVLA not only outperforms all existing models on the combat understanding benchmark but also achieves a 50-fold acceleration in game combat. Moreover, it achieves a higher task success rate than human players. We will open-source all resources, including the action tracker, dataset, benchmark, model weights, training code, and the implementation of the framework at https://combatvla.github.io/.
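The abstract describes two mechanisms: formatting video-action pairs as action-of-thought (AoT) sequences for training, and a truncated AoT strategy for faster inference. The sketch below is a hypothetical illustration of both ideas, assuming field names, token conventions (`<think>`, `<action>`), and a stop-after-action rule that are not specified in the abstract and are not CombatVLA's released implementation.

```python
# Hypothetical AoT formatting and truncated-AoT decoding; all names and
# token conventions here are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class AoTSample:
    frames: List[str]    # paths to frames captured by the action tracker (assumed)
    thoughts: List[str]  # step-by-step tactical reasoning (assumed format)
    action: str          # final executable game action, e.g. "dodge_left"

def format_aot(sample: AoTSample) -> str:
    """Serialize one video-action pair into an AoT training sequence."""
    reasoning = " ".join(f"<think>{t}</think>" for t in sample.thoughts)
    return f"{reasoning} <action>{sample.action}</action>"

def truncated_aot_decode(generate_step: Callable[[List[str]], str],
                         max_tokens: int = 64) -> str:
    """Greedy decode that stops as soon as the action span closes.

    `generate_step` stands in for any model call returning the next token;
    stopping right after </action> (and skipping the rest of the reasoning)
    is one plausible reading of the 'truncated AoT' strategy.
    """
    out: List[str] = []
    for _ in range(max_tokens):
        tok = generate_step(out)
        out.append(tok)
        if tok == "</action>":
            break
    return "".join(out)
```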


"See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models

arXiv.org Artificial Intelligence

The evaluation of factual accuracy in large vision language models (LVLMs) has lagged behind their rapid development, making it challenging to fully reflect these models' knowledge capacity and reliability. In this paper, we introduce the first factuality-based visual question-answering benchmark in Chinese, named ChineseSimpleVQA, aimed at assessing the visual factuality of LVLMs across 8 major topics and 56 subtopics. The key features of this benchmark include a focus on the Chinese language, diverse knowledge types, multi-hop question construction, high-quality data, static consistency, and ease of evaluation via short answers. Moreover, we contribute a rigorous data construction pipeline and decouple visual factuality into two parts: seeing the world (i.e., object recognition) and discovering knowledge. This decoupling allows us to analyze the capability boundaries and execution mechanisms of LVLMs. Subsequently, we evaluate 34 advanced open-source and closed-source models, revealing critical performance gaps within this field.
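To make the two-part decoupling concrete, the following is a minimal scoring sketch: "seeing the world" is scored on the recognition answer, and "discovering knowledge" is scored on the factual follow-up conditioned on correct recognition. The record fields, normalization, and exact-match metric are assumptions for illustration, not the benchmark's official evaluator.

```python
# Hypothetical two-stage scorer for the recognition/knowledge decoupling.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class VQARecord:
    recognition_pred: str  # model's answer to "what is this object?"
    recognition_gold: str
    knowledge_pred: str    # model's answer to the knowledge follow-up
    knowledge_gold: str

def normalize(ans: str) -> str:
    return ans.strip().lower()

def score(records: List[VQARecord]) -> Dict[str, float]:
    see = sum(normalize(r.recognition_pred) == normalize(r.recognition_gold)
              for r in records)
    # Conditioning the knowledge score on correct recognition separates
    # perception failures from factual-recall failures.
    know = sum(normalize(r.recognition_pred) == normalize(r.recognition_gold)
               and normalize(r.knowledge_pred) == normalize(r.knowledge_gold)
               for r in records)
    n = max(len(records), 1)
    return {"seeing_the_world": see / n, "discovering_knowledge": know / n}
```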


HopPG: Self-Iterative Program Generation for Multi-Hop Question Answering over Heterogeneous Knowledge

arXiv.org Artificial Intelligence

Semantic parsing is an important research branch for knowledge-based question answering. It typically generates executable programs based on the question and then executes them to reason out answers over a knowledge base. Benefiting from this mechanism, it offers advantages in both performance and interpretability. However, traditional semantic parsing methods usually generate a complete program before executing it, which struggles with multi-hop question answering over heterogeneous knowledge. On one hand, generating a complete multi-hop program relies on multiple heterogeneous supporting facts, and it is difficult for generators to understand these facts simultaneously. On the other hand, this approach ignores the semantic information of the intermediate answers at each hop, which is beneficial for subsequent generation. To alleviate these challenges, we propose a self-iterative framework for multi-hop program generation (HopPG) over heterogeneous knowledge, which leverages the previous execution results to retrieve supporting facts and generate subsequent programs hop by hop. We evaluate our model on MMQA-T^2, and the experimental results show that HopPG outperforms existing semantic-parsing-based baselines, especially on the multi-hop questions.
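The hop-by-hop loop the abstract describes (retrieve a supporting fact conditioned on the previous execution result, generate a single-hop program, execute it, repeat) can be sketched as below. The three pluggable components, the `STOP` termination signal, and the hop limit are illustrative assumptions rather than HopPG's actual interfaces.

```python
# Minimal self-iterative, hop-by-hop program-generation loop in the spirit
# of HopPG; component signatures and the stop convention are assumed.
from typing import Callable, Optional

def hop_by_hop_answer(
    question: str,
    retrieve: Callable[[str, Optional[str]], str],  # (question, prev_answer) -> supporting fact
    generate: Callable[[str, str], str],            # (question, fact) -> single-hop program
    execute: Callable[[str], str],                  # program -> intermediate answer
    max_hops: int = 4,
) -> str:
    """Generate and execute one program per hop, feeding each intermediate
    answer back into retrieval for the next hop."""
    answer: Optional[str] = None
    for _ in range(max_hops):
        fact = retrieve(question, answer)   # retrieval conditioned on the previous result
        program = generate(question, fact)  # e.g. a table/text operation or logical form
        if program == "STOP":               # assumed termination signal from the generator
            break
        answer = execute(program)
    return answer or ""
```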