Collaborating Authors: Gao, Xiaoyang


Multimodal Magic: Elevating Depression Detection with a Fusion of Text and Audio Intelligence

arXiv.org Artificial Intelligence

ABSTRACT

This study proposes an innovative multimodal fusion model based on a teacher-student architecture to enhance the accuracy of depression classification. Our model addresses the limitations of traditional methods in feature fusion and modality weight allocation by introducing multi-head attention mechanisms and weighted multimodal transfer learning. Leveraging the DAIC-WOZ dataset, the student fusion model, guided by textual and auditory teacher models, achieves significant improvements in classification accuracy. Ablation experiments show that the proposed model attains an F1 score of 99.1% on the test set, significantly outperforming unimodal and conventional approaches. Our method effectively captures the complementarity between textual and audio features while dynamically adjusting the contributions of the teacher models to improve generalization. The experimental results highlight the robustness and adaptability of the proposed framework in handling complex multimodal data. This research provides a novel technical framework for multimodal large-model learning in depression analysis, offering new insights into addressing the limitations of existing methods in modality fusion and feature extraction.

INTRODUCTION

Depression is a significant global health concern that affects millions of individuals across demographics, with considerable social, economic, and health-related impacts. According to the World Health Organization (WHO), depression is one of the leading causes of disability worldwide, affecting over 264 million people.
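The abstract names the key components — a student that fuses text and audio features with multi-head attention, distilled from two unimodal teachers whose contributions are weighted dynamically — but this excerpt includes no code. A minimal PyTorch sketch of that idea might look as follows; all module names, dimensions, and the sigmoid-based teacher weighting are our assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionStudent(nn.Module):
    """Hypothetical student: fuses text and audio features via multi-head attention."""
    def __init__(self, text_dim=768, audio_dim=128, d_model=256, n_heads=4, n_classes=2):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, d_model)
        self.audio_proj = nn.Linear(audio_dim, d_model)
        # Cross-modal attention: text tokens query the audio frames.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.classifier = nn.Linear(d_model, n_classes)

    def forward(self, text_feats, audio_feats):
        q = self.text_proj(text_feats)             # (B, T_text, d_model)
        kv = self.audio_proj(audio_feats)          # (B, T_audio, d_model)
        fused, _ = self.cross_attn(q, kv, kv)      # text attends over audio
        return self.classifier(fused.mean(dim=1))  # pool over time, then classify

def distillation_loss(student_logits, text_teacher_logits, audio_teacher_logits,
                      labels, alpha, T=2.0):
    """Weighted two-teacher distillation; `alpha` is a learnable scalar tensor
    (e.g. an nn.Parameter) so the teachers' contributions adjust during training."""
    hard = F.cross_entropy(student_logits, labels)
    log_p = F.log_softmax(student_logits / T, dim=-1)
    soft_text = F.kl_div(log_p, F.softmax(text_teacher_logits / T, dim=-1),
                         reduction="batchmean") * T * T
    soft_audio = F.kl_div(log_p, F.softmax(audio_teacher_logits / T, dim=-1),
                          reduction="batchmean") * T * T
    w = torch.sigmoid(alpha)  # keep the mixing weight in (0, 1)
    return hard + w * soft_text + (1.0 - w) * soft_audio
```

In training, one would freeze the two teacher models, register `alpha = nn.Parameter(torch.zeros(()))` alongside the student's parameters, and backpropagate the combined loss so the modality weighting is learned jointly with the fusion network.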


KwaiYiiMath: Technical Report

arXiv.org Artificial Intelligence

Recent advancements in large language models (LLMs) have demonstrated remarkable abilities on a variety of natural language processing (NLP) downstream tasks, including mathematical tasks that require multi-step reasoning. This report introduces KwaiYiiMath, which strengthens the mathematical reasoning abilities of KwaiYiiBase through human alignment techniques. We also constructed a small-scale Chinese primary-school mathematics test set (named KMath), consisting of 188 examples, to evaluate the correctness of the problem-solving process generated by the models. Empirical studies demonstrate that KwaiYiiMath achieves state-of-the-art (SOTA) performance on GSM8k, CMath, and KMath compared with models of similar size.

Recent advances in large language models (LLMs) have revolutionized the natural language processing (NLP) landscape Kenton & Toutanova (2019); Brown et al. (2020), where scaling up model size and the amount of data is one of the key ingredients Rae et al. (2021); Chowdhery et al. (2022); Anil et al. (2023); Touvron et al. (2023a;b). Surprisingly, recent progress suggests that LLMs also have the potential to solve reasoning problems Clark et al. (2020); Talmor et al. (2020); Suzgun et al. (2022); Wei et al. (2022b). In this report, we focus on how to enhance the mathematical reasoning capabilities of LLMs through an alignment process that includes supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). Specifically, we introduce KwaiYiiMath, which is fine-tuned from KwaiYiiBase with human alignment techniques to tackle mathematical problems. Experimental results show that KwaiYiiMath outperforms many open-source models of similar size by a large margin and approaches GPT-4 on three mathematical benchmarks covering both English and Chinese: GSM8k Cobbe et al. (2021), CMath Wei et al. (2023), and the small-scale in-house dataset KMath. KwaiYiiBase is a large language model developed by Kuaishou (https://github.com/kwai/KwaiYii/). Section 3 introduces the methodology of KwaiYiiMath, including the supervised fine-tuning and human-preference alignment processes, and details the effort of collecting large amounts of high-quality mathematical training data.
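The report describes an alignment pipeline of supervised fine-tuning followed by RLHF on mathematical data. As a rough sketch of the SFT stage only, assuming the Hugging Face transformers API, the loop below fine-tunes a causal LM on (question, step-by-step solution) pairs. The checkpoint name, the single GSM8k-style example, and the hyperparameters are placeholders; KwaiYiiBase weights are not assumed to be available, and the RLHF stage is not shown.

```python
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; stands in for a KwaiYiiBase-like checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# One hypothetical (question, chain-of-thought solution) pair in GSM8k style.
pairs = [("Natalia sold clips to 48 friends in April, and half as many in May. "
          "How many clips did she sell in total?",
          "April: 48. May: 48 / 2 = 24. Total: 48 + 24 = 72. The answer is 72.")]

def collate(batch):
    # Concatenate question and solution; the LM learns to continue the prompt.
    texts = [q + "\n" + a + tokenizer.eos_token for q, a in batch]
    enc = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    enc["labels"] = enc["input_ids"].clone()
    enc["labels"][enc["attention_mask"] == 0] = -100  # ignore padding in the loss
    return enc

loader = DataLoader(pairs, batch_size=1, collate_fn=collate)
optim = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for batch in loader:
    loss = model(**batch).loss  # next-token cross-entropy over the full sequence
    loss.backward()
    optim.step()
    optim.zero_grad()
```

In practice the loss is often masked so that only the solution tokens contribute, which trains the model to produce the reasoning steps without also learning to regenerate the question.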