Collaborating Authors: Yang, Junwei


Large Language Model Agent: A Survey on Methodology, Applications and Challenges

arXiv.org Artificial Intelligence

The era of intelligent agents is upon us, driven by revolutionary advancements in large language models. Large Language Model (LLM) agents, with goal-driven behaviors and dynamic adaptation capabilities, potentially represent a critical pathway toward artificial general intelligence. This survey systematically deconstructs LLM agent systems through a methodology-centered taxonomy, linking architectural foundations, collaboration mechanisms, and evolutionary pathways. We unify fragmented research threads by revealing fundamental connections between agent design principles and their emergent behaviors in complex environments. Our work provides a unified architectural perspective, examining how agents are constructed, how they collaborate, and how they evolve over time, while also addressing evaluation methodologies, tool applications, practical challenges, and diverse application domains. By surveying the latest developments in this rapidly evolving field, we offer researchers a structured taxonomy for understanding LLM agents and identify promising directions for future research. The collection is available at https://github.com/luo-junyu/Awesome-Agent-Papers.
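As a rough illustration of the goal-driven perceive-plan-act loop such agent systems share, here is a minimal Python sketch. Everything in it is hypothetical: `call_llm` stands in for any chat-completion API (stubbed here so the snippet runs end to end), and the tool registry is a toy. It is a sketch of the general pattern, not an architecture taken from the survey.

```python
import json

def call_llm(prompt: str) -> str:
    # Toy stand-in for a real LLM call: always decides to finish immediately.
    # Replace with an actual chat-completion client to get real behavior.
    return json.dumps({"action": "finish", "argument": "stub answer"})

# Hypothetical tool registry the agent can act with.
TOOLS = {
    "search": lambda query: f"(stub) results for {query!r}",
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),  # demo only
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        prompt = (
            f"Goal: {goal}\nHistory: {json.dumps(history)}\n"
            'Reply with JSON: {"action": <tool name or "finish">, "argument": ...}'
        )
        step = json.loads(call_llm(prompt))      # plan: LLM picks the next action
        if step["action"] == "finish":           # goal reached, return the answer
            return step["argument"]
        observation = TOOLS[step["action"]](step["argument"])   # act: run a tool
        history.append({"step": step, "observation": observation})  # perceive
    return "max steps reached"

print(run_agent("What is 2 + 2?"))
```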


ExLM: Rethinking the Impact of [MASK] Tokens in Masked Language Models

arXiv.org Artificial Intelligence

Masked Language Models (MLMs) have achieved remarkable success in many self-supervised representation learning tasks. MLMs are trained by randomly masking portions of the input sequences with [MASK] tokens and learning to reconstruct the original content based on the remaining context. This paper explores the impact of [MASK] tokens on MLMs. Analytical studies show that masking tokens can introduce the corrupted semantics problem, wherein the corrupted context may convey multiple, ambiguous meanings. This problem is also a key factor affecting the performance of MLMs on downstream tasks. Based on these findings, we propose a novel enhanced-context MLM, ExLM. Our approach expands [MASK] tokens in the input context and models the dependencies between these expanded states. This enhancement increases context capacity and enables the model to capture richer semantic information, effectively mitigating the corrupted semantics problem during pre-training. Experimental results demonstrate that ExLM achieves significant performance improvements in both text modeling and SMILES modeling tasks. Further analysis confirms that ExLM enriches semantic representations through context enhancement, and effectively reduces the semantic multimodality commonly observed in MLMs.
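To make the corrupted-semantics problem and the expansion idea concrete, a minimal sketch in plain Python (the masking rate and the expansion factor `k` are illustrative choices, not ExLM's actual configuration):

```python
import random

MASK = "[MASK]"

def bert_style_mask(tokens, p=0.15, seed=0):
    """Standard MLM corruption: replace a random ~p fraction of tokens with [MASK]."""
    rng = random.Random(seed)
    corrupted, targets = [], []
    for tok in tokens:
        if rng.random() < p:
            corrupted.append(MASK)
            targets.append(tok)       # the model must reconstruct this token
        else:
            corrupted.append(tok)
            targets.append(None)      # no loss at unmasked positions
    return corrupted, targets

def expand_masks(corrupted, k=3):
    """ExLM-style context enhancement (illustrative): each [MASK] becomes k
    mask states, giving the encoder more slots to represent the ambiguous
    semantics of the missing token."""
    expanded = []
    for tok in corrupted:
        expanded.extend([MASK] * k if tok == MASK else [tok])
    return expanded

tokens = "the bank raised its interest rates".split()
corrupted, targets = bert_style_mask(tokens, p=0.3)
print(corrupted)                 # some tokens replaced by [MASK]; the corrupted
print(expand_masks(corrupted))   # context can be ambiguous, and expansion gives
                                 # the model extra states per masked position
```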


SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision

arXiv.org Artificial Intelligence

SMILES, a crucial textual representation of molecular structures, has garnered significant attention as a foundation for pre-trained language models (LMs). However, most existing pre-trained SMILES LMs rely solely on single-token-level supervision during pre-training, failing to fully leverage the substructural information of molecules. This limitation makes the pre-training task overly simplistic and prevents the models from capturing richer molecular semantic information. Moreover, during pre-training these SMILES LMs only process corrupted SMILES inputs and never encounter any valid SMILES, which leads to a train-inference mismatch. To address these challenges, we propose SMI-Editor, a novel edit-based pre-trained SMILES LM. SMI-Editor disrupts substructures within a molecule at random and feeds the resulting SMILES back into the model, which then attempts to restore the original SMILES through an editing process. This approach not only introduces fragment-level training signals but also enables the use of valid SMILES as inputs, allowing the model to learn how to reconstruct complete molecules from incomplete structures. As a result, the model demonstrates improved scalability and an enhanced ability to capture fragment-level molecular information. Experimental results show that SMI-Editor achieves state-of-the-art performance across multiple downstream molecular tasks, even outperforming several 3D molecular representation models.
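A rough sketch of the fragment-level corruption idea, at the character level for simplicity (unlike SMI-Editor, this naive span deletion does not guarantee a chemically valid SMILES or a meaningful substructure, and the helper name is made up):

```python
import random

def drop_fragment(smiles: str, min_len=2, max_len=4, seed=0):
    """Remove a random contiguous span as a crude stand-in for deleting a
    molecular substructure. An edit-based LM would be trained to restore
    the original string from this corrupted input via edit operations."""
    rng = random.Random(seed)
    n = rng.randint(min_len, max_len)
    start = rng.randrange(0, len(smiles) - n)
    return smiles[:start] + smiles[start + n:], smiles[start:start + n]

smiles = "CC(=O)Oc1ccccc1C(=O)O"            # aspirin
corrupted, removed = drop_fragment(smiles)
print(corrupted, "| removed:", removed)
# Training signal: insert `removed` back at the right position, which supervises
# the model at the fragment level rather than per masked token.
```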


ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling

arXiv.org Artificial Intelligence

Protein language models have demonstrated significant potential in the field of protein engineering. However, current protein language models primarily operate at the residue scale, which limits their ability to provide information at the atom level. This limitation prevents us from fully exploiting the capabilities of protein language models for applications involving both proteins and small molecules. In this paper, we propose ESM-AA (ESM All-Atom), a novel approach that enables unified molecular modeling at both the atom scale and the residue scale. ESM-AA achieves this by pre-training on multi-scale code-switch protein sequences and utilizing a multi-scale position encoding to capture relationships among residues and atoms. Experimental results indicate that ESM-AA surpasses previous methods in protein-molecule tasks, demonstrating that unified modeling more fully exploits the capabilities of protein language models. Further investigations reveal that, through unified molecular modeling, ESM-AA not only gains molecular knowledge but also retains its understanding of proteins. The source code of ESM-AA is publicly released at https://github.com/zhengkangjie/ESM-AA.
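A toy illustration of the multi-scale positions behind a code-switch sequence (the atom list and the unzipping rule are invented for the example; ESM-AA's actual encoding is more elaborate): every element carries a residue-scale index, and atoms additionally carry an atom-scale index within their residue.

```python
# Toy code-switch sequence: one residue is "unzipped" into atoms (hypothetical
# atom list); every element gets a (residue_pos, atom_pos) multi-scale position.
ATOMS = {"A": ["N", "CA", "C", "O", "CB"]}     # illustrative, not real topology

def code_switch(residues, unzip_at):
    seq, positions = [], []
    for r_idx, res in enumerate(residues):
        if r_idx in unzip_at:
            for a_idx, atom in enumerate(ATOMS[res]):
                seq.append(atom)
                positions.append((r_idx, a_idx))   # atom-scale position
        else:
            seq.append(res)
            positions.append((r_idx, -1))          # residue-scale only
    return seq, positions

seq, pos = code_switch(list("MAG"), unzip_at={1})
for s, p in zip(seq, pos):
    print(s, p)   # residues keep coarse positions; atoms get a second index
```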


Towards Graph Contrastive Learning: A Survey and Beyond

arXiv.org Artificial Intelligence

In recent years, deep learning on graphs has achieved remarkable success in various domains. However, the reliance on annotated graph data remains a significant bottleneck due to its prohibitive cost and time-intensive nature. To address this challenge, self-supervised learning (SSL) on graphs has gained increasing attention and has made significant progress. SSL enables machine learning models to produce informative representations from unlabeled graph data, reducing the reliance on expensive labeled data. While SSL on graphs has witnessed widespread adoption, one critical component, Graph Contrastive Learning (GCL), has not been thoroughly investigated in the existing literature. This survey aims to fill that gap with a dedicated treatment of GCL. We provide a comprehensive overview of the fundamental principles of GCL, including data augmentation strategies, contrastive modes, and contrastive optimization objectives. Furthermore, we explore the extensions of GCL to other aspects of data-efficient graph learning, such as weakly supervised learning, transfer learning, and related scenarios. We also discuss practical applications spanning domains such as drug discovery, genomics analysis, and recommender systems, and we conclude by outlining the challenges and potential future directions in this field.
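The basic GCL recipe the survey organizes, two augmented views of a graph scored by a contrastive objective, can be sketched briefly. This assumes PyTorch, uses edge dropping as the augmentation and an NT-Xent-style loss, and replaces the GNN encoder with stand-in embeddings:

```python
import torch
import torch.nn.functional as F

def drop_edges(edge_index: torch.Tensor, p: float = 0.2) -> torch.Tensor:
    """Augmentation: independently keep each edge (a column of the 2 x E
    index tensor) with probability 1 - p."""
    keep = torch.rand(edge_index.size(1)) >= p
    return edge_index[:, keep]

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """NT-Xent contrastive objective: node i in view 1 should be most similar
    to node i in view 2 (positives on the diagonal, all others negatives)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau              # pairwise cosine similarities
    labels = torch.arange(z1.size(0))       # positive pairs sit on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage; a real pipeline would run a shared GNN encoder on each view.
x = torch.randn(4, 8, requires_grad=True)                # stand-in embeddings
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])
view1, view2 = drop_edges(edge_index), drop_edges(edge_index)  # two views
loss = nt_xent(x, x + 0.01 * torch.randn(4, 8))          # per-view encoder outputs
loss.backward()                                          # gradients reach x
print(float(loss))
```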


A Comprehensive Survey on Deep Graph Representation Learning

arXiv.org Artificial Intelligence

Graph representation learning aims to effectively encode high-dimensional sparse graph-structured data into low-dimensional dense vectors, a fundamental task that has been widely studied in a range of fields, including machine learning and data mining. Classic graph embedding methods follow the basic idea that the embedding vectors of interconnected nodes should remain relatively close, thereby preserving the structural information between the nodes in the graph. However, this is sub-optimal because: (i) traditional methods have limited model capacity, which limits learning performance; (ii) existing techniques typically rely on unsupervised learning strategies and fail to couple with the latest learning paradigms; and (iii) representation learning and downstream tasks depend on each other and should be jointly enhanced. With the remarkable success of deep learning, deep graph representation learning has shown great potential and advantages over shallow (traditional) methods, and a large number of deep graph representation learning techniques, especially graph neural networks, have been proposed in the past decade. In this survey, we review current deep graph representation learning algorithms by proposing a new taxonomy of the existing state-of-the-art literature. Specifically, we systematically summarize the essential components of graph representation learning and categorize existing approaches by graph neural network architecture and the most recent advanced learning paradigms. This survey also covers practical and promising applications of deep graph representation learning. Finally, we state new perspectives and suggest challenging directions that deserve further investigation.
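The classic "interconnected nodes stay close" objective that the survey contrasts with deep methods can be written as a tiny force-directed sketch (NumPy; the toy graph, learning rate, and repulsion term are illustrative, not any specific published method):

```python
import numpy as np

# Toy 4-cycle; the two non-adjacent pairs act as "negative samples".
edges = {(0, 1), (1, 2), (2, 3), (3, 0)}
non_edges = {(0, 2), (1, 3)}
rng = np.random.default_rng(0)
Z = rng.normal(size=(4, 2))                 # 4 nodes, 2-d embeddings
lr = 0.05

for _ in range(300):
    for u, v in edges:                      # attraction: linked nodes move closer
        diff = Z[u] - Z[v]
        Z[u] -= lr * diff
        Z[v] += lr * diff
    for u, v in non_edges:                  # repulsion: unlinked nodes pushed apart
        diff = Z[u] - Z[v]
        Z[u] += lr * diff / (diff @ diff + 1e-3)
        Z[v] -= lr * diff / (diff @ diff + 1e-3)

print("edge distance    :", np.linalg.norm(Z[0] - Z[1]))   # small
print("non-edge distance:", np.linalg.norm(Z[0] - Z[2]))   # larger
```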


Deep Local Trajectory Replanning and Control for Robot Navigation

arXiv.org Artificial Intelligence

We present a navigation system that combines ideas from hierarchical planning and machine learning. The system uses a traditional global planner to compute optimal paths towards a goal, and a deep local trajectory planner and velocity controller to compute motion commands. The latter components of the system adjust the behavior of the robot through attention mechanisms such that it moves towards the goal, avoids obstacles, and respects the space of nearby pedestrians. Both the structure of the proposed deep models and the use of attention mechanisms make the system's execution interpretable. Our simulation experiments suggest that the proposed architecture outperforms baselines that try to map global plan information and sensor data directly to velocity commands. In comparison to a hand-designed traditional navigation system, the proposed approach showed more consistent performance.
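A minimal version of the hierarchical split described here, with the learned local trajectory planner replaced by a hand-written proportional controller (the grid, gains, and function names are illustrative, not the paper's system):

```python
from collections import deque

def global_plan(grid, start, goal):
    """Global planner: BFS shortest path on an occupancy grid (0 = free)."""
    prev, frontier = {start: None}, deque([start])
    while frontier:
        cur = frontier.popleft()
        if cur == goal:                      # walk parent links back to start
            path = []
            while cur is not None:
                path.append(cur)
                cur = prev[cur]
            return path[::-1]
        x, y = cur
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if (0 <= nx < len(grid) and 0 <= ny < len(grid[0])
                    and grid[nx][ny] == 0 and nxt not in prev):
                prev[nxt] = cur
                frontier.append(nxt)
    return None                              # goal unreachable

def local_command(pose, waypoint, k=0.5):
    """Local controller stand-in: velocity proportional to the waypoint offset.
    (The paper instead uses a learned trajectory planner and velocity
    controller with attention over obstacles and pedestrians.)"""
    return (k * (waypoint[0] - pose[0]), k * (waypoint[1] - pose[1]))

grid = [[0, 0, 0], [1, 1, 0], [0, 0, 0]]
path = global_plan(grid, (0, 0), (2, 0))
print("waypoints:", path)
print("first velocity command:", local_command((0.0, 0.0), path[1]))
```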