AITopics | Li, Dongyuan

LAG: LLM agents for Leaderboard Auto Generation on Demanding

Wu, Jian, Zhang, Jiayu, Li, Dongyuan, Yang, Linyi, Zhong, Aoxiao, Jiang, Renhe, Wen, Qingsong, Zhang, Yue

arXiv.org Artificial Intelligence, Feb-25-2025

This paper introduces Leaderboard Auto Generation (LAG), a novel framework for automatically generating leaderboards on a given research topic in rapidly evolving fields such as Artificial Intelligence (AI). With a large number of AI papers appearing daily, researchers find it difficult to track every paper's proposed methods, experimental results, and settings, which motivates efficient automatic leaderboard construction. While large language models (LLMs) show promise for automating this process, challenges such as multi-document summarization, leaderboard generation, and fair comparison of experiments remain underexplored. LAG addresses these challenges through a systematic approach comprising paper collection, extraction and integration of experimental results, leaderboard generation, and quality evaluation. Our contributions include a comprehensive solution to the leaderboard construction problem, a reliable evaluation method, and experimental results showing the high quality of the generated leaderboards.
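The integration and leaderboard generation steps named in this abstract can be pictured with a minimal sketch. It assumes result records have already been extracted from papers; the record fields (`paper`, `method`, `dataset`, `metric`, `score`) and the function name `build_leaderboards` are hypothetical and are not LAG's actual interface. Grouping by dataset and metric is only a simple stand-in for the fair-comparison concern the abstract raises.

```python
from collections import defaultdict


def build_leaderboards(records, higher_is_better=True):
    """Group extracted results by (dataset, metric) and rank methods by score.

    `records` is a list of dicts with hypothetical keys:
    paper, method, dataset, metric, score.
    """
    sign = 1 if higher_is_better else -1
    best = defaultdict(dict)
    for r in records:
        board = best[(r["dataset"], r["metric"])]
        prev = board.get(r["method"])
        # Keep only the best reported score for each method on a board.
        if prev is None or sign * r["score"] > sign * prev["score"]:
            board[r["method"]] = r
    # Only results that share a dataset and a metric are ranked together.
    return {
        key: sorted(rows.values(), key=lambda r: sign * r["score"], reverse=True)
        for key, rows in best.items()
    }
```

Calling `build_leaderboards(records)` returns one ranked list per (dataset, metric) pair; a real system would also have to reconcile differing experimental settings before ranking.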

large language model, leaderboard, natural language, (15 more...)

2502.18209

Country:

Asia (0.14)
Europe > Italy (0.14)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Multi-Agent Autonomous Driving Systems with Large Language Models: A Survey of Recent Advances

Wu, Yaozu, Li, Dongyuan, Chen, Yankai, Jiang, Renhe, Zou, Henry Peng, Fang, Liancheng, Wang, Zhen, Yu, Philip S.

arXiv.org Artificial Intelligence, Feb-23-2025

Autonomous Driving Systems (ADSs) are revolutionizing transportation by reducing human intervention, improving operational efficiency, and enhancing safety. Large Language Models (LLMs), known for their exceptional planning and reasoning capabilities, have been integrated into ADSs to assist with driving decision-making. However, LLM-based single-agent ADSs face three major challenges: limited perception, insufficient collaboration, and high computational demands. To address these issues, recent advancements in LLM-based multi-agent ADSs have focused on improving inter-agent communication and cooperation. This paper provides a frontier survey of LLM-based multi-agent ADSs. We begin with a background introduction to related concepts, followed by a categorization of existing LLM-based approaches based on different agent interaction modes. We then discuss agent-human interactions in scenarios where LLM-based agents engage with humans. Finally, we summarize key applications, datasets, and challenges in this field to support future research (https://anonymous.4open.science/r/LLM-based_Multi-agent_ADS-3A5C/README.md).

large language model, machine learning, natural language, (20 more...)

2502.16804

Country:

Asia (0.14)
North America > United States > Illinois (0.14)

Genre: Overview (1.00)

Industry: Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Revisiting Dynamic Graph Clustering via Matrix Factorization

Li, Dongyuan, Kosugi, Satoshi, Zhang, Ying, Okumura, Manabu, Xia, Feng, Jiang, Renhe

arXiv.org Machine Learning, Feb-9-2025

Dynamic graph clustering aims to detect and track time-varying clusters in dynamic graphs, revealing the evolutionary mechanisms of complex real-world dynamic systems. Matrix factorization-based methods are promising approaches for this task; however, they often struggle with scalability and can be time-consuming when applied to large-scale dynamic graphs. Moreover, they tend to lack robustness and are vulnerable to noisy real-world data. To address these issues, we make three key contributions. First, to improve scalability, we propose temporal separated matrix factorization, where a single matrix is divided into multiple smaller matrices for independent factorization, resulting in faster computation. Second, to improve robustness, we introduce bi-clustering regularization, which jointly optimizes graph embedding and clustering, thereby filtering out noisy features from the graph embeddings. Third, to further enhance effectiveness and efficiency, we propose selective embedding updating, where we update only the embeddings of dynamic nodes while keeping the embeddings of static nodes fixed across timestamps. Experimental results on six synthetic and five real-world benchmarks demonstrate the scalability, robustness, and effectiveness of our proposed method. Source code is available at https://github.com/Clearloveyuan/DyG-MF.
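The idea of factorizing smaller matrices independently can be illustrated with a rough sketch. It uses plain non-negative matrix factorization with the standard Lee-Seung multiplicative updates, applied separately to each graph snapshot; this is a textbook technique swapped in for illustration, not the DyG-MF algorithm, and it omits the bi-clustering regularization and selective embedding updating described above. All function names are hypothetical.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(a, w, h):
    """Squared Frobenius norm of A - WH."""
    wh = matmul(w, h)
    return sum((a[i][j] - wh[i][j]) ** 2
               for i in range(len(a)) for j in range(len(a[0])))


def nmf(a, rank, iters=300, seed=0, eps=1e-9):
    """Factorize a non-negative matrix A (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(a), len(a[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T A) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (A H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def factorize_snapshots(snapshots, rank):
    """Factorize each snapshot's adjacency matrix on its own (no shared state)."""
    return [nmf(a, rank) for a in snapshots]
```

Because each call to `nmf` touches only one snapshot, the per-snapshot factorizations could be run in parallel; each row of `W` can be read as a soft cluster assignment for the corresponding node.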

artificial intelligence, graph, machine learning, (15 more...)

doi: 10.1145/3696410.3714646

2502.06117

Country:

Oceania > Australia (0.30)
Asia (0.29)

Genre: Research Report > Experimental Study (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

A Survey on Deep Active Learning: Recent Advances and New Frontiers

Li, Dongyuan, Wang, Zhen, Chen, Yankai, Jiang, Renhe, Ding, Weiping, Okumura, Manabu

arXiv.org Artificial Intelligence, Jul-15-2024

Active learning seeks to achieve strong performance with fewer training samples. It does this by iteratively asking an oracle to label newly selected samples in a human-in-the-loop manner. This technique has gained increasing popularity due to its broad applicability, yet surveys of it, especially of deep learning-based active learning (DAL), remain scarce. Therefore, we conduct an advanced and comprehensive survey on DAL. We first describe how the reviewed papers were collected and filtered. Second, we formally define the DAL task and summarize the most influential baselines and widely used datasets. Third, we systematically provide a taxonomy of DAL methods from five perspectives, namely annotation types, query strategies, deep model architectures, learning paradigms, and training processes, and objectively analyze their strengths and weaknesses. Then, we comprehensively summarize the main applications of DAL in Natural Language Processing (NLP), Computer Vision (CV), Data Mining (DM), and other areas. Finally, we discuss challenges and perspectives after a detailed analysis of current studies. This work aims to serve as a useful and quick guide for researchers in overcoming difficulties in DAL. We hope that this survey will spur further progress in this burgeoning field.
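The iterative query-and-label loop described at the start of this abstract can be sketched in a few lines. This is a toy pool-based example under strong assumptions: the "model" is a one-dimensional threshold rather than a deep network, the oracle is an ordinary function standing in for a human annotator, and the query strategy is simple uncertainty sampling. The names `fit_threshold` and `active_learning` are hypothetical.

```python
def fit_threshold(labeled):
    """Toy model: midpoint between the largest negative and smallest positive."""
    neg = max(x for x, y in labeled if y == 0)
    pos = min(x for x, y in labeled if y == 1)
    return (neg + pos) / 2


def active_learning(pool, oracle, budget):
    """Pool-based active learning with uncertainty sampling on 1-D points."""
    pool = sorted(pool)
    # Seed set: the two extremes, assumed here to carry different labels.
    labeled = [(pool[0], oracle(pool[0])), (pool[-1], oracle(pool[-1]))]
    unlabeled = pool[1:-1]
    for _ in range(budget):
        if not unlabeled:
            break
        boundary = fit_threshold(labeled)
        # The point nearest the current boundary is the one the model
        # is least certain about, so it is sent to the oracle next.
        query = min(unlabeled, key=lambda x: abs(x - boundary))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return fit_threshold(labeled), labeled
```

With a pool of 101 evenly spaced points, about ten oracle queries are enough to locate the boundary to within one grid step, whereas labeling the whole pool would take 101 queries.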

deep active learning, deep learning, machine learning, (2 more...)

2405.00334

Genre:

Overview (0.53)
Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Community-Invariant Graph Contrastive Learning

Tan, Shiyin, Li, Dongyuan, Jiang, Renhe, Zhang, Ying, Okumura, Manabu

arXiv.org Artificial Intelligence, May-2-2024

Graph augmentation has received great attention in recent years for graph contrastive learning (GCL) to learn well-generalized node/graph representations. However, mainstream GCL methods often favor randomly disrupting graphs for augmentation, which shows limited generalization and inevitably leads to the corruption of high-level graph information, i.e., the graph community. Moreover, current knowledge-based graph augmentation methods can only focus on either topology or node features, causing the model to lack robustness against various types of noise. To address these limitations, we investigate the role of the graph community in graph augmentation and identify its crucial advantage for learnable graph augmentation. Based on our observations, we propose a community-invariant GCL framework to maintain graph community structure during learnable graph augmentation. By maximizing the spectral changes, this framework unifies the constraints of both topology and feature augmentation, enhancing the model's robustness. Empirical evidence on 21 benchmark datasets demonstrates the distinctive merits of our framework. Code is released on GitHub (https://github.com/ShiyinTan/CI-GCL.git).

artificial intelligence, augmentation, machine learning, (17 more...)

2405.0135

Country:

North America > United States (0.28)
Asia > Japan > Honshū (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Active Learning with Task Adaptation Pre-training for Speech Emotion Recognition

Li, Dongyuan, Zhang, Ying, Wang, Yusong, Funakoshi, Kotaro, Okumura, Manabu

arXiv.org Artificial Intelligence, May-1-2024

Speech emotion recognition (SER) has garnered increasing attention due to its wide range of applications in various fields, including human-machine interaction, virtual assistants, and mental health assistance. However, existing SER methods often overlook the information gap between the pre-training speech recognition task and the downstream SER task, resulting in sub-optimal performance. Moreover, current methods require much time for fine-tuning on each specific speech dataset, such as IEMOCAP, which limits their effectiveness in real-world scenarios with large-scale noisy data. To address these issues, we propose an active learning (AL)-based fine-tuning framework for SER, called AFTER, that leverages task adaptation pre-training (TAPT) and AL methods to enhance performance and efficiency. Specifically, we first use TAPT to minimize the information gap between the pre-training speech recognition task and the downstream speech emotion recognition task. Then, AL methods are employed to iteratively select a subset of the most informative and diverse samples for fine-tuning, thereby reducing time consumption. Experiments demonstrate that our proposed method AFTER, using only 20% of samples, improves accuracy by 8.45% and reduces time consumption by 79%. The additional extension of AFTER and ablation studies further confirm its effectiveness and applicability to various real-world scenarios. Our source code is available on GitHub for reproducibility (https://github.com/Clearloveyuan/AFTER).
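Selecting samples that are both informative and diverse can be approximated in many ways; the sketch below shows one common combination and is not the selection rule used by AFTER. It assumes each unlabeled sample comes with a feature vector and the model's predicted class probabilities, shortlists the highest-entropy samples, and then spreads the picks out with a greedy farthest-point step. The function names and the `shortlist_size` parameter are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (informativeness)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_batch(features, probs, k, shortlist_size):
    """Return indices of k samples that are uncertain and spread out.

    features[i] is a feature vector and probs[i] the predicted class
    probabilities for unlabeled sample i.
    """
    # Informativeness: keep the samples the model is least sure about.
    order = sorted(range(len(probs)), key=lambda i: entropy(probs[i]), reverse=True)
    shortlist = order[:shortlist_size]
    # Diversity: greedily add the shortlisted sample that is farthest
    # from everything already chosen.
    chosen = [shortlist[0]]
    while len(chosen) < min(k, len(shortlist)):
        best = max(
            (i for i in shortlist if i not in chosen),
            key=lambda i: min(math.dist(features[i], features[j]) for j in chosen),
        )
        chosen.append(best)
    return chosen
```

The shortlist keeps confidently classified samples out of the batch, while the farthest-point step avoids spending the labeling budget on near-duplicates.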

artificial intelligence, deep learning, machine learning, (17 more...)

2405.00307

Country:

Asia > Japan > Honshū (0.14)
North America > United States > Texas (0.14)
Europe > Spain (0.14)
(3 more...)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.66)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Joyful: Joint Modality Fusion and Graph Contrastive Learning for Multimodal Emotion Recognition

Li, Dongyuan, Wang, Yusong, Funakoshi, Kotaro, Okumura, Manabu

arXiv.org Artificial Intelligence, Nov-18-2023

Multimodal emotion recognition aims to recognize the emotion of each utterance from multiple modalities and has received increasing attention for its applications in human-machine interaction. Current graph-based methods fail to simultaneously capture global contextual features and local diverse uni-modal features in a dialogue. Furthermore, as the number of graph layers increases, they are prone to over-smoothing. In this paper, we propose a method for joint modality fusion and graph contrastive learning for multimodal emotion recognition (Joyful), where multimodality fusion, contrastive learning, and emotion recognition are jointly optimized. Specifically, we first design a new multimodal fusion mechanism that can provide deep interaction and fusion between the global contextual and uni-modal specific features. Then, we introduce a graph contrastive learning framework with inter-view and intra-view contrastive losses to learn more distinguishable representations for samples with different sentiments. Extensive experiments on three benchmark datasets indicate that Joyful achieves state-of-the-art (SOTA) performance compared to all baselines.
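To give a feel for what a contrastive loss between two views computes, here is a minimal sketch of the widely used InfoNCE objective. It is an assumption that this resembles the inter-view loss mentioned in the abstract; Joyful's actual inter-view and intra-view losses may be defined differently. The function names and the `temperature` default are illustrative.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss between two views of the same samples.

    Row i of view_a and row i of view_b form the positive pair;
    every other row of view_b acts as a negative for row i.
    """
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        # Negative log-softmax probability assigned to the positive pair.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)
```

The loss is small when each embedding is closest to its own counterpart in the other view and grows when it is closer to another sample's embedding, which is what pushes representations of different samples apart.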

artificial intelligence, machine learning, representation, (18 more...)

2311.11009

Country: Asia > Japan (0.14)

Genre: Research Report > Experimental Study (0.68)

Industry:

Media (0.67)
Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Active Learning Based Fine-Tuning Framework for Speech Emotion Recognition

Li, Dongyuan, Wang, Yusong, Funakoshi, Kotaro, Okumura, Manabu

arXiv.org Artificial Intelligence, Sep-30-2023

Speech emotion recognition (SER) has drawn increasing attention for its applications in human-machine interaction. However, existing SER methods ignore the information gap between the pre-training speech recognition task and the downstream SER task, leading to sub-optimal performance. Moreover, they require much time to fine-tune on each specific speech dataset, restricting their effectiveness in real-world scenarios with large-scale noisy data. To address these issues, we propose an active learning (AL)-based fine-tuning framework for SER that leverages task adaptation pre-training (TAPT) and AL methods to enhance performance and efficiency. Specifically, we first use TAPT to minimize the information gap between the pre-training task and the downstream task. Then, AL methods are used to iteratively select a subset of the most informative and diverse samples for fine-tuning, reducing time consumption. Experiments demonstrate that using only 20% of the samples improves accuracy by 8.45% and reduces time consumption by 79%.

artificial intelligence, deep learning, machine learning, (18 more...)

2310.00283

Country: Asia (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.64)
