Chen, Yuyan
Open-Set Recognition of Novel Species in Biodiversity Monitoring
Chen, Yuyan, Lang, Nico, Schmidt, B. Christian, Jain, Aditya, Basset, Yves, Beery, Sara, Larrivée, Maxim, Rolnick, David
Machine learning is increasingly being applied to facilitate long-term, large-scale biodiversity monitoring. With most species on Earth still undiscovered or poorly documented, species-recognition models are expected to encounter new species during deployment. We introduce Open-Insects, a fine-grained image recognition benchmark dataset for open-set recognition and out-of-distribution detection in biodiversity monitoring. Open-Insects makes it possible to evaluate algorithms for new species detection on several geographical open-set splits with varying difficulty. Furthermore, we present a test set recently collected in the wild with 59 species that are likely new to science. We evaluate a variety of open-set recognition algorithms, including post-hoc methods, training-time regularization, and training with auxiliary data, finding that the simple post-hoc approach of utilizing softmax scores remains a strong baseline. We also demonstrate how to leverage auxiliary data to improve the detection performance when the training dataset is limited. Our results provide timely insights to guide the development of computer vision methods for biodiversity monitoring and species discovery.
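A minimal sketch of the post-hoc softmax baseline mentioned above: the maximum softmax probability (MSP) of a trained classifier is used as a confidence score, and images scoring below a threshold are flagged as candidate novel species. The model, batch shapes, and threshold here are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def msp_scores(logits: torch.Tensor) -> torch.Tensor:
    """Maximum softmax probability per sample (higher = more likely a known species)."""
    return F.softmax(logits, dim=-1).max(dim=-1).values

@torch.no_grad()
def flag_novel(model: torch.nn.Module, images: torch.Tensor, threshold: float = 0.5) -> torch.Tensor:
    """Boolean mask that is True where the classifier is uncertain enough to flag novelty."""
    model.eval()
    logits = model(images)                  # (batch, num_known_species)
    return msp_scores(logits) < threshold   # (batch,)
```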
Do Large Language Models Have Problem-Solving Capability under Incomplete Information Scenarios?
Chen, Yuyan, Yu, Tianhao, Li, Yueze, Yan, Songzhou, Liu, Sijia, Liang, Jiaqing, Xiao, Yanghua
The evaluation of the problem-solving capability of Large Language Models (LLMs) under incomplete information scenarios is increasingly important, encompassing capabilities such as questioning, knowledge search, error detection, and path planning. Current research mainly focuses on LLMs' problem-solving capability in games such as ``Twenty Questions''. However, these games do not require recognizing misleading cues, which is necessary in incomplete information scenarios. Moreover, existing games such as ``Who is undercover'' are highly subjective, making them challenging to evaluate. Therefore, in this paper, we introduce a novel game named BrainKing, based on ``Who is undercover'' and ``Twenty Questions'', for evaluating LLM capabilities under incomplete information scenarios. It requires LLMs to identify target entities with limited yes-or-no questions and potentially misleading answers. By setting up easy, medium, and hard difficulty modes, we comprehensively assess the performance of LLMs across various aspects. Our results reveal the capabilities and limitations of LLMs in BrainKing, providing significant insights into LLMs' problem-solving levels.
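As a rough illustration of the interaction protocol described above, the sketch below plays one BrainKing-style round: the evaluated LLM asks a limited number of yes-or-no questions about a hidden target while the oracle occasionally flips its answer to inject a misleading cue. `ask_llm`, `guess_llm`, and `oracle` are hypothetical callables standing in for the benchmark's actual components.

```python
import random

def play_round(ask_llm, guess_llm, oracle, target: str,
               max_questions: int = 20, mislead_prob: float = 0.2, seed: int = 0) -> bool:
    """Return True if the LLM names the target within the question budget."""
    rng = random.Random(seed)
    history = []
    for _ in range(max_questions):
        question = ask_llm(history)         # e.g. "Is it an animal?"
        answer = oracle(target, question)   # truthful "yes" / "no"
        if rng.random() < mislead_prob:     # inject a misleading answer
            answer = "no" if answer == "yes" else "yes"
        history.append((question, answer))
    guess = guess_llm(history)              # final identification attempt
    return guess.strip().lower() == target.lower()
```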
HOTVCOM: Generating Buzzworthy Comments for Videos
Chen, Yuyan, Qian, Yiwen, Yan, Songzhou, Jia, Jiyuan, Li, Zhixu, Xiao, Yanghua, Li, Xiaobo, Yang, Ming, Guo, Qingpei
In the era of social media video platforms, popular ``hot-comments'' play a crucial role in attracting user impressions of short-form videos, making them vital for marketing and branding purposes. However, existing research predominantly focuses on generating descriptive comments or ``danmaku'' in English, which offer immediate reactions to specific video moments. Addressing this gap, our study introduces \textsc{HotVCom}, the largest Chinese video hot-comment dataset, comprising 94k diverse videos and 137 million comments. We also present the \texttt{ComHeat} framework, which synergistically integrates visual, auditory, and textual data to generate influential hot-comments for Chinese videos. Empirical evaluations highlight the effectiveness of our framework, demonstrating its excellence on both the newly constructed and existing datasets.
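To make the multimodal claim concrete, here is a minimal late-fusion sketch in which visual, auditory, and textual embeddings are concatenated and projected into a shared space that a comment decoder could condition on. The module and dimensions are assumptions for illustration; the actual ComHeat framework is considerably more involved.

```python
import torch
import torch.nn as nn

class SimpleFusion(nn.Module):
    """Concatenate per-modality embeddings and project them to one conditioning vector."""
    def __init__(self, vis_dim: int = 512, aud_dim: int = 256,
                 txt_dim: int = 768, out_dim: int = 768):
        super().__init__()
        self.proj = nn.Linear(vis_dim + aud_dim + txt_dim, out_dim)

    def forward(self, vis: torch.Tensor, aud: torch.Tensor, txt: torch.Tensor) -> torch.Tensor:
        return self.proj(torch.cat([vis, aud, txt], dim=-1))
```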
EmotionQueen: A Benchmark for Evaluating Empathy of Large Language Models
Chen, Yuyan, Wang, Hao, Yan, Songzhou, Liu, Sijia, Li, Yueze, Zhao, Yi, Xiao, Yanghua
Emotional intelligence in large language models (LLMs) is of great importance in Natural Language Processing. However, previous research mainly focuses on basic sentiment analysis tasks, such as emotion recognition, which is not enough to evaluate LLMs' overall emotional intelligence. Therefore, this paper presents a novel framework named EmotionQueen for evaluating the emotional intelligence of LLMs. The framework includes four distinctive tasks: Key Event Recognition, Mixed Event Recognition, Implicit Emotional Recognition, and Intention Recognition. LLMs are requested to recognize important events or implicit emotions and to generate empathetic responses. We also design two metrics to evaluate LLMs' capabilities in recognition and response for emotion-related statements. Experiments yield significant conclusions about LLMs' capabilities and limitations in emotional intelligence.
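A hedged sketch of how the two evaluation dimensions could be computed over a set of emotion-laden statements: one rate for whether the key or implicit emotion was recognized, and one for whether the generated reply is empathetic. The judge functions are placeholders; the paper's actual metric definitions may differ.

```python
from typing import Callable, List, Tuple

def evaluate(items: List[Tuple[str, str]],
             recognized: Callable[[str, str], bool],
             empathetic: Callable[[str, str], bool]) -> Tuple[float, float]:
    """items holds (statement, llm_response) pairs; returns (recognition rate, response rate)."""
    rec_hits = sum(recognized(stmt, resp) for stmt, resp in items)
    emp_hits = sum(empathetic(stmt, resp) for stmt, resp in items)
    n = max(len(items), 1)
    return rec_hits / n, emp_hits / n
```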
Recent Advancement of Emotion Cognition in Large Language Models
Chen, Yuyan, Xiao, Yanghua
Emotion cognition in large language models (LLMs) is crucial for enhancing performance across various applications, such as social media, human-computer interaction, and mental health assessment. In this paper, we present a detailed survey of recent progress in LLMs for emotion cognition. We explore the current landscape of research, which primarily revolves around emotion classification, emotionally rich response generation, and Theory of Mind assessments, while acknowledging challenges such as dependence on annotated data and the complexity of emotion processing. We review key research studies, methodologies, outcomes, and resources, aligning them with Ulric Neisser's cognitive stages. We also discuss advanced methods, such as contrastive learning, used to improve LLMs' emotion cognition capabilities. Additionally, we outline potential future directions for research in this evolving field, including unsupervised learning approaches and the development of more complex and interpretable emotion cognition LLMs.
Can Pre-trained Language Models Understand Chinese Humor?
Chen, Yuyan, Li, Zhixu, Liang, Jiaqing, Xiao, Yanghua, Liu, Bang, Chen, Yunwen
Humor understanding is an important and challenging research problem in natural language processing. With the growing popularity of pre-trained language models (PLMs), some recent work has made preliminary attempts to adopt PLMs for humor recognition and generation. However, these simple attempts do not substantially answer the question: {\em are PLMs capable of humor understanding?} This paper is the first work to systematically investigate the humor understanding ability of PLMs. For this purpose, we design a comprehensive framework with three evaluation steps and four evaluation tasks. We also construct a comprehensive Chinese humor dataset that fully meets all the data requirements of the proposed evaluation framework. Our empirical study on the Chinese humor dataset yields several valuable observations, which provide important guidance for the future optimization of PLMs in humor understanding and generation.
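As a small illustration of adopting a PLM for humor recognition (one of the tasks discussed above), the snippet below scores a Chinese sentence with a sequence classification head. The checkpoint and the binary label convention are assumptions for illustration, not the paper's evaluation setup; in practice the classifier head would first be fine-tuned on labeled humor data.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)

def humor_probability(text: str) -> float:
    """Probability that the input text is humorous (label 1 assumed to mean 'humorous')."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()
```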
Hallucination Detection: Robustly Discerning Reliable Answers in Large Language Models
Chen, Yuyan, Fu, Qiang, Yuan, Yichen, Wen, Zhihao, Fan, Ge, Liu, Dayiheng, Zhang, Dongmei, Li, Zhixu, Xiao, Yanghua
Large Language Models (LLMs) have gained widespread adoption in various natural language processing tasks, including question answering and dialogue systems. However, a major drawback of LLMs is hallucination, where they generate unfaithful or inconsistent content that deviates from the input source, which can lead to severe consequences. In this paper, we propose a robust discriminator named RelD to effectively detect hallucination in LLMs' generated answers. RelD is trained on RelQA, a newly constructed bilingual question-answering dialogue dataset with answers generated by LLMs and a comprehensive set of metrics. Our experimental results demonstrate that RelD successfully detects hallucination in the answers generated by diverse LLMs, and that it performs well in distinguishing hallucination in LLMs' generated answers on both in-distribution and out-of-distribution datasets. Additionally, we conduct a thorough analysis of the types of hallucinations that occur and present valuable insights. This research contributes significantly to the detection of reliable answers generated by LLMs and holds noteworthy implications for mitigating hallucination in future work.
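A hedged sketch of how a trained discriminator of this kind could be applied at inference time: each (question, generated answer) pair is scored, and pairs above a threshold are flagged as likely hallucinations. The checkpoint and label convention are stand-ins, not the released RelD model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
detector = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)  # placeholder for a trained discriminator

@torch.no_grad()
def is_hallucinated(question: str, answer: str, threshold: float = 0.5) -> bool:
    """True if the pair is scored as hallucinated (label 1 assumed to mean 'hallucinated')."""
    inputs = tokenizer(question, answer, return_tensors="pt", truncation=True)
    probs = torch.softmax(detector(**inputs).logits, dim=-1)
    return probs[0, 1].item() > threshold
```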
MAPO: Boosting Large Language Model Performance with Model-Adaptive Prompt Optimization
Chen, Yuyan, Wen, Zhihao, Fan, Ge, Chen, Zhengyu, Wu, Wei, Liu, Dayiheng, Li, Zhixu, Liu, Bang, Xiao, Yanghua
Prompt engineering, as an efficient and effective way to leverage Large Language Models (LLMs), has drawn a lot of attention from the research community. Existing research primarily emphasizes adapting prompts to specific tasks, rather than to specific LLMs. However, a good prompt is not solely defined by its wording; it is also bound to the nature of the LLM in question. In this work, we first quantitatively demonstrate that different prompts should be adapted to different LLMs to enhance their capabilities across various downstream tasks in NLP. We then propose a model-adaptive prompt optimizer (MAPO) that optimizes the original prompts for each specific LLM in downstream tasks. Extensive experiments indicate that the proposed method can effectively refine prompts for an LLM, leading to significant improvements across various downstream tasks.
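A simplified sketch of the model-adaptive idea: the same task prompt is rewritten into several candidates, and the candidate that scores best for a specific LLM on a small development set is kept. The plain argmax search below is an illustrative stand-in for the paper's optimizer.

```python
from typing import Callable, List, Sequence, Tuple

def select_prompt(candidates: Sequence[str],
                  llm: Callable[[str, str], str],          # (prompt, task input) -> output
                  dev_set: List[Tuple[str, str]],          # (task input, reference) pairs
                  metric: Callable[[str, str], float]) -> str:
    """Pick the candidate prompt with the highest average metric for this particular LLM."""
    def score(prompt: str) -> float:
        return sum(metric(llm(prompt, x), y) for x, y in dev_set) / max(len(dev_set), 1)
    return max(candidates, key=score)
```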
AICoderEval: Improving AI Domain Code Generation of Large Language Models
Xia, Yinghui, Chen, Yuyan, Shi, Tianyu, Wang, Jun, Yang, Jinsong
Automated code generation is a pivotal capability of large language models (LLMs). However, assessing this capability in real-world scenarios remains challenging. Previous methods focus more on low-level code generation, such as model loading, rather than on generating high-level code that caters to real-world tasks, such as image-to-text and text classification, across various domains. Therefore, we construct AICoderEval, a dataset of real-world tasks across various domains based on HuggingFace, PyTorch, and TensorFlow, along with comprehensive metrics for evaluating and enhancing LLMs' task-specific code generation capability. AICoderEval contains test cases and complete programs for the automated evaluation of these tasks, covering domains such as natural language processing, computer vision, and multimodal learning. To facilitate research in this area, we open-source the AICoderEval dataset at \url{https://huggingface.co/datasets/vixuowis/AICoderEval}. We then propose CoderGen, an agent-based framework that helps LLMs generate code for the real-world tasks in AICoderEval. Moreover, we train a more powerful task-specific code generation model, named AICoder, which is fine-tuned from Llama-3 on AICoderEval. Our experiments demonstrate the effectiveness of CoderGen in improving LLMs' task-specific code generation capability (by 12.00\% on pass@1 for the original model and 9.50\% on pass@1 for the ReAct agent). AICoder also outperforms current code generation LLMs, indicating the high quality of the AICoderEval benchmark.
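For reference, the pass@1 numbers above follow the usual convention: per task, the fraction of generated programs that pass all test cases, averaged over tasks. The sketch below computes this standard estimator; the benchmark's own harness may differ in details.

```python
from typing import List

def pass_at_1(results: List[List[bool]]) -> float:
    """results[i][j] is True if the j-th sample for task i passed all of its tests."""
    per_task = [sum(samples) / len(samples) for samples in results if samples]
    return sum(per_task) / max(len(per_task), 1)
```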
ProSwitch: Knowledge-Guided Instruction Tuning to Generate Professional and Non-Professional Styled Text
Zong, Chang, Chen, Yuyan, Lu, Weiming, Shao, Jian, Zhuang, Yueting
Large Language Models (LLMs) have demonstrated efficacy in various linguistic applications, including text summarization and controlled text generation. However, their capacity to switch between styles via fine-tuning remains underexplored. This study concentrates on textual professionalism and introduces a novel methodology, named ProSwitch, which equips a language model with the ability to produce both professional and non-professional responses through knowledge-guided instruction tuning. ProSwitch unfolds across three phases: data preparation, for gathering domain knowledge and the training corpus; instruction tuning, for optimizing language models with multiple levels of instruction formats; and comprehensive evaluation, for assessing the professionalism discrimination and reference-based quality of the generated text. Comparative analysis of ProSwitch against both general and specialized language models reveals that our approach outperforms baselines in switching between professional and non-professional text generation.
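A hypothetical example of the kind of style-controlled training instance that knowledge-guided instruction tuning could be built on: the same question paired with a professional-style and a non-professional-style target response. Field names and content are illustrative only, not the paper's released data schema.

```python
example = {
    "question": "What causes type 2 diabetes?",
    "professional_instruction": "Answer in a professional, clinical style.",
    "professional_response": ("Type 2 diabetes arises from insulin resistance combined "
                              "with progressive beta-cell dysfunction."),
    "casual_instruction": "Answer casually, as if talking to a friend.",
    "casual_response": ("Mostly it's your body not using insulin well anymore, "
                        "so blood sugar stays high."),
}
```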