AITopics | Fan, Lu

Collaborating Authors

Fan, Lu

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Benchmarking LLMs in Recommendation Tasks: A Comparative Evaluation with Conventional Recommenders

Liu, Qijiong, Zhu, Jieming, Fan, Lu, Wang, Kun, Hu, Hengchang, Guo, Wei, Liu, Yong, Wu, Xiao-Ming

arXiv.org Artificial IntelligenceMar-7-2025

In recent years, integrating large language models (LLMs) into recommender systems has created new opportunities for improving recommendation quality. However, a comprehensive benchmark is needed to thoroughly evaluate and compare the recommendation capabilities of LLMs with traditional recommender systems. In this paper, we introduce RecBench, which systematically investigates various item representation forms (including unique identifier, text, semantic embedding, and semantic identifier) and evaluates two primary recommendation tasks, i.e., click-through rate prediction (CTR) and sequential recommendation (SeqRec). Our extensive experiments cover up to 17 large models and are conducted across five diverse datasets from fashion, news, video, books, and music domains. Our findings indicate that LLM-based recommenders outperform conventional recommenders, achieving up to a 5% AUC improvement in the CTR scenario and up to a 170% NDCG@10 improvement in the SeqRec scenario. However, these substantial performance gains come at the expense of significantly reduced inference efficiency, rendering the LLM-as-RS paradigm impractical for real-time recommendation environments. We aim for our findings to inspire future research, including recommendation-specific model acceleration methods. We will release our code, data, configurations, and platform to enable other researchers to reproduce and build upon our experimental results.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.05493

Country:

Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.54)

Industry:

Leisure & Entertainment (0.87)
Media > Music (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Do self-supervised speech and language models extract similar representations as human brain?

Chen, Peili, He, Linyang, Fu, Li, Fan, Lu, Chang, Edward F., Li, Yuanning

arXiv.org Artificial IntelligenceJan-31-2024

Speech and language models trained through self-supervised learning (SSL) demonstrate strong alignment with brain activity during speech and language perception. However, given their distinct training modalities, it remains unclear whether they correlate with the same neural aspects. We directly address this question by evaluating the brain prediction performance of two representative SSL models, Wav2Vec2.0 and GPT-2, designed for speech and language tasks. Our findings reveal that both models accurately predict speech responses in the auditory cortex, with a significant correlation between their brain predictions. Notably, shared speech contextual information between Wav2Vec2.0 and GPT-2 accounts for the majority of explained variance in brain activity, surpassing static semantic and lower-level acoustic-phonetic information. These results underscore the convergence of speech contextual representations in SSL models and their alignment with the neural network underlying speech perception, offering valuable insights into both SSL models and the neural basis of speech and language processing.

large language model, machine learning, wav2vec2, (17 more...)

arXiv.org Artificial Intelligence

2310.04645

Country:

North America > United States > Michigan (0.28)
North America > United States > California > San Francisco County > San Francisco (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback

Leveraging Label Information for Multimodal Emotion Recognition

Wang, Peiying, Zeng, Sunlu, Chen, Junqing, Fan, Lu, Chen, Meng, Wu, Youzheng, He, Xiaodong

arXiv.org Artificial IntelligenceSep-5-2023

Multimodal emotion recognition (MER) aims to detect the emotional status of a given expression by combining the speech and text information. Intuitively, label information should be capable of helping the model locate the salient tokens/frames relevant to the specific emotion, which finally facilitates the MER task. Inspired by this, we propose a novel approach for MER by leveraging label information. Specifically, we first obtain the representative label embeddings for both text and speech modalities, then learn the label-enhanced text/speech representations for each utterance via label-token and label-frame interactions. Finally, we devise a novel label-guided attentive fusion module to fuse the label-aware text and speech representations for emotion classification. Extensive experiments were conducted on the public IEMOCAP dataset, and experimental results demonstrate that our proposed approach outperforms existing baselines and achieves new state-of-the-art performance.

machine learning, natural language, recognition, (20 more...)

arXiv.org Artificial Intelligence

2309.02106

Country: Europe (0.28)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.64)
(2 more...)

Add feedback

Multi-modal Pre-training for Medical Vision-language Understanding and Generation: An Empirical Study with A New Benchmark

Xu, Li, Liu, Bo, Khan, Ameer Hamza, Fan, Lu, Wu, Xiao-Ming

arXiv.org Artificial IntelligenceAug-24-2023

With the availability of large-scale, comprehensive, and general-purpose vision-language (VL) datasets such as MSCOCO, vision-language pre-training (VLP) has become an active area of research and proven to be effective for various VL tasks such as visual-question answering. However, studies on VLP in the medical domain have so far been scanty. To provide a comprehensive perspective on VLP for medical VL tasks, we conduct a thorough experimental analysis to study key factors that may affect the performance of VLP with a unified vision-language Transformer. To allow making sound and quick pre-training decisions, we propose RadioGraphy Captions (RGC), a high-quality, multi-modality radiographic dataset containing 18,434 image-caption pairs collected from an open-access online database MedPix. RGC can be used as a pre-training dataset or a new benchmark for medical report generation and medical image-text retrieval. By utilizing RGC and other available datasets for pre-training, we develop several key insights that can guide future medical VLP research and new strong baselines for various medical VL tasks.

machine learning, natural language, question answering, (17 more...)

arXiv.org Artificial Intelligence

2306.06494

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.35)

Add feedback

Neighborhood-based Hard Negative Mining for Sequential Recommendation

Fan, Lu, Pu, Jiashu, Zhang, Rongsheng, Wu, Xiao-Ming

arXiv.org Artificial IntelligenceJun-12-2023

Negative sampling plays a crucial role in training successful sequential recommendation models. Instead of merely employing random negative sample selection, numerous strategies have been proposed to mine informative negative samples to enhance training and performance. However, few of these approaches utilize structural information. In this work, we observe that as training progresses, the distributions of node-pair similarities in different groups with varying degrees of neighborhood overlap change significantly, suggesting that item pairs in distinct groups may possess different negative relationships. Motivated by this observation, we propose a Graph-based Negative sampling approach based on Neighborhood Overlap (GNNO) to exploit structural information hidden in user behaviors for negative mining. GNNO first constructs a global weighted item transition graph using training sequences. Subsequently, it mines hard negative samples based on the degree of overlap with the target item on the graph. Furthermore, GNNO employs curriculum learning to control the hardness of negative samples, progressing from easy to difficult. Extensive experiments on three Amazon benchmarks demonstrate GNNO's effectiveness in consistently enhancing the performance of various state-of-the-art models and surpassing existing negative sampling strategies. The code will be released at \url{https://github.com/floatSDSDS/GNNO}.

data mining, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3539618.3591995

2306.10047

Country:

North America > United States (1.00)
Asia > China > Hong Kong (0.15)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback