Tang, Ruiming
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger
Li, Wenjun, Li, Dexun, Dong, Kuicai, Zhang, Cong, Zhang, Hao, Liu, Weiwen, Wang, Yasheng, Tang, Ruiming, Liu, Yong
Large language models (LLMs) have shown remarkable emergent capabilities, transforming the execution of functional tasks by leveraging external tools for complex problems that require specialized processing or real-time data. While existing research expands LLMs' access to diverse tools (e.g., program interpreters, search engines, weather/map apps), the necessity of using these tools is often overlooked, leading to indiscriminate tool invocation. This naive approach raises two key issues: (1) increased latency due to unnecessary tool calls, and (2) potential errors resulting from faulty interactions with external tools. In this paper, we introduce meta-cognition as a proxy for LLMs' self-assessment of their capabilities, representing the model's awareness of its own limitations. Based on this, we propose MeCo, an adaptive decision-making strategy for external tool use. MeCo quantifies metacognitive scores by capturing high-level cognitive signals in the representation space, guiding the model on when to invoke tools. Notably, MeCo is fine-tuning-free and incurs minimal cost. Our experiments show that MeCo accurately detects LLMs' internal cognitive signals and significantly improves tool-use decision-making across multiple base models and benchmarks.
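To make the idea concrete, here is a minimal sketch of a MeCo-style decision rule, assuming we already have (a) a hidden-state vector for the query from some base LLM and (b) a probe fitted offline in the representation space to separate "model can answer alone" from "model needs a tool". The probe form, names, and threshold below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def metacognitive_score(h: np.ndarray, w: np.ndarray, b: float) -> float:
    """Logistic probe over the representation space: higher = more self-confident."""
    return 1.0 / (1.0 + np.exp(-(h @ w + b)))

def should_invoke_tool(h, w, b, threshold=0.5) -> bool:
    # Invoke the external tool only when self-assessed capability is low.
    return metacognitive_score(h, w, b) < threshold

rng = np.random.default_rng(0)
h = rng.normal(size=4096)         # e.g., last-layer hidden state of the query
w = rng.normal(size=4096) * 0.01  # probe weights (would be fitted, not random)
print(should_invoke_tool(h, w, b=0.0))
```

Because the probe reads existing hidden states, the decision adds only a dot product per query, which is consistent with the abstract's claim that the method is fine-tuning-free and low-cost.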
Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation
Du, Kounianhua, Wang, Hanjing, Liu, Jianxing, Chen, Jizheng, Dai, Xinyi, Wang, Yasheng, Tang, Ruiming, Yu, Yong, Wang, Jun, Zhang, Weinan
Large language models (LLMs) have demonstrated remarkable capabilities in various domains, particularly in System 1 tasks, yet the intricacies of their problem-solving mechanisms in System 2 tasks remain insufficiently explored. Research on System2-to-System1 methods has recently surged, exploring System 2 reasoning knowledge via inference-time computation and compressing the explored knowledge into the System 1 process. In this paper, we focus on code generation, a representative System 2 task, and identify two primary challenges: (1) the complex hidden reasoning processes and (2) the heterogeneous data distributions that complicate the exploration and training of robust LLM solvers. To tackle these issues, we propose a novel BDC framework that explores insightful System 2 knowledge of LLMs using an MC-Tree-Of-Agents algorithm with mutual Boosting, Disentangles the heterogeneous training data into composable LoRA experts, and obtains a Customized problem solver for each data instance with an input-aware hypernetwork that weights the LoRA experts, offering effectiveness, flexibility, and robustness. This framework leverages multiple LLMs through mutual verification and boosting, integrated into a Monte-Carlo Tree Search process enhanced by reflection-based pruning and refinement. Additionally, we introduce the DisenLora algorithm, which clusters heterogeneous data to fine-tune LLMs into composable LoRA experts, enabling the adaptive generation of customized problem solvers through an input-aware hypernetwork. This work lays the groundwork for advancing LLM capabilities in complex reasoning tasks, offering a novel System2-to-System1 solution.
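The "customize" step can be illustrated with a small sketch of input-aware weighting over composable LoRA experts: each expert contributes a low-rank delta, and a small hypernetwork maps the input representation to mixture weights. The shapes, the softmax gating, and all names here are our assumptions for illustration, not the exact DisenLora formulation.

```python
import numpy as np

d, r, K = 64, 4, 3  # hidden size, LoRA rank, number of experts
rng = np.random.default_rng(0)
A = [rng.normal(size=(r, d)) for _ in range(K)]          # down-projections
B = [rng.normal(size=(d, r)) * 0.01 for _ in range(K)]   # up-projections
W_hyper = rng.normal(size=(K, d)) * 0.1                  # hypernetwork weights

def expert_weights(x):
    logits = W_hyper @ x
    e = np.exp(logits - logits.max())
    return e / e.sum()                                   # softmax gating over experts

def customized_forward(x, W0):
    w = expert_weights(x)
    delta = sum(w[k] * (B[k] @ (A[k] @ x)) for k in range(K))
    return W0 @ x + delta                                # base layer + weighted LoRA deltas

x = rng.normal(size=d)
W0 = rng.normal(size=(d, d)) * 0.05
print(customized_forward(x, W0).shape)
```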
Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge
Zhang, Qiyuan, Wang, Yufei, Jiang, Yuxin, Li, Liangyou, Wu, Chuhan, Wang, Yasheng, Jiang, Xin, Shang, Lifeng, Tang, Ruiming, Lyu, Fuyuan, Ma, Chen
LLM-as-a-Judge, which generates chain-of-thought (CoT) judgments, has become a widely adopted auto-evaluation method. However, its reliability is compromised by CoT reasoning's inability to capture comprehensive and deeper details, often leading to incomplete outcomes. Existing methods mainly rely on majority voting or criteria expansion, which are insufficient to address this limitation of CoT. We propose Crowd-based Comparative Evaluation, which introduces additional crowd responses to compare with the candidate responses, thereby exposing deeper and more comprehensive details within the candidate responses. This process effectively guides LLM-as-a-Judge to provide a more detailed CoT judgment. Extensive experiments demonstrate that our approach enhances evaluation reliability, achieving an average accuracy gain of 6.7% across five benchmarks. Moreover, our method produces higher-quality CoTs that facilitate judge distillation and exhibit superior performance in rejection sampling for supervised fine-tuning (SFT), referred to as crowd rejection sampling, thereby enabling more efficient SFT. Our analysis confirms that the CoTs our method generates are more comprehensive and of higher quality, and that evaluation accuracy improves as inference scales.
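One way to picture the mechanics is a judge prompt that folds crowd responses in alongside the two candidates. The template wording below is entirely our own, and the call to an actual judge model is left out; it only illustrates how crowd responses could steer a comparative CoT judgment.

```python
def build_crowd_judge_prompt(query, candidate_a, candidate_b, crowd_responses):
    crowd_block = "\n".join(
        f"[Crowd response {i+1}] {r}" for i, r in enumerate(crowd_responses)
    )
    return (
        f"Question: {query}\n\n"
        f"{crowd_block}\n\n"
        "Compare the two candidate responses below against the question and "
        "the crowd responses above. Use the crowd responses to surface "
        "details a candidate covers or misses, then give a chain-of-thought "
        "judgment and a final verdict (A or B).\n\n"
        f"[Candidate A] {candidate_a}\n"
        f"[Candidate B] {candidate_b}\n"
    )

prompt = build_crowd_judge_prompt(
    "Why is the sky blue?",
    "Rayleigh scattering favors shorter wavelengths.",
    "Because of the ocean's reflection.",
    ["Shorter (blue) wavelengths scatter more in the atmosphere.",
     "Scattering intensity scales roughly with 1/wavelength^4."],
)
print(prompt)
```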
Full-Stack Optimized Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation
Shan, Rong, Zhu, Jiachen, Lin, Jianghao, Zhu, Chenxu, Chen, Bo, Tang, Ruiming, Yu, Yong, Zhang, Weinan
In this paper, we address the lifelong sequential behavior incomprehension problem in large language models (LLMs) for recommendation, where LLMs struggle to extract useful information from long user behavior sequences, even within their context limits. To tackle this, we propose ReLLaX (Retrieval-enhanced Large Language models Plus), a framework offering optimization across the data, prompt, and parameter levels. At the data level, we introduce Semantic User Behavior Retrieval (SUBR) to reduce sequence heterogeneity, making it easier for LLMs to extract key information. For prompt-level enhancement, we employ Soft Prompt Augmentation (SPA) to inject collaborative knowledge, aligning item representations with recommendation tasks and improving LLMs' exploration of item relationships. Finally, at the parameter level, we propose Component Fully-interactive LoRA (CFLoRA), which enhances LoRA's expressiveness by enabling interactions between its components, allowing better capture of sequential information. Moreover, we present new perspectives to compare current LoRA-based LLM4Rec methods, i.e., from both a composite and a decomposed view. We theoretically demonstrate that the ways they employ LoRA for recommendation are degraded versions of our CFLoRA, with different constraints on atom component interactions. Extensive experiments on three public datasets demonstrate ReLLaX's superiority over existing baselines and its ability to mitigate lifelong sequential behavior incomprehension effectively.
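The intuition behind "fully-interactive" components can be sketched in a few lines: standard multi-component LoRA pairs each up-projection B_i with its own down-projection A_i, whereas a fully-interactive variant lets every B_i interact with every A_j through learnable coefficients. This is our reading of the abstract, not the paper's exact CFLoRA parameterization.

```python
import numpy as np

d, r, n = 32, 2, 3  # hidden size, rank per component, number of components
rng = np.random.default_rng(0)
A = [rng.normal(size=(r, d)) for _ in range(n)]
B = [rng.normal(size=(d, r)) * 0.01 for _ in range(n)]
C = rng.normal(size=(n, n)) * 0.1  # cross-component interaction coefficients

def delta_paired():
    # Diagonal interactions only: the standard composite-LoRA special case.
    return sum(B[i] @ A[i] for i in range(n))

def delta_fully_interactive():
    # All pairwise B_i/A_j interactions, weighted by C.
    return sum(C[i, j] * (B[i] @ A[j]) for i in range(n) for j in range(n))

print(delta_paired().shape, delta_fully_interactive().shape)
```

Constraining C to the identity recovers the paired case, which matches the abstract's claim that existing LoRA usages are degraded versions under particular interaction constraints.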
A Survey on Multi-Turn Interaction Capabilities of Large Language Models
Zhang, Chen, Dai, Xinyi, Wu, Yaxiong, Yang, Qu, Wang, Yasheng, Tang, Ruiming, Liu, Yong
Multi-turn interaction in dialogue system research refers to a system's ability to maintain context across multiple dialogue turns, enabling it to generate coherent and contextually relevant responses. Recent advancements in large language models (LLMs) have significantly expanded the scope of multi-turn interaction, moving beyond chatbots to enable more dynamic agentic interactions with users or environments. In this paper, we provide a focused review of the multi-turn capabilities of LLMs, which are critical for a wide range of downstream applications, including conversational search and recommendation, consultation services, and interactive tutoring. This survey explores four key aspects: (1) the core model capabilities that contribute to effective multi-turn interaction, (2) how multi-turn interaction is evaluated in current practice, (3) the general algorithms used to enhance multi-turn interaction, and (4) potential future directions for research in this field.
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents
Dong, Kuicai, Chang, Yujing, Goh, Xin Deik, Li, Dexun, Tang, Ruiming, Liu, Yong
Multi-modal document retrieval is designed to identify and retrieve various forms of multi-modal content, such as figures, tables, charts, and layout information, from extensive documents. Despite its significance, there is a notable lack of a robust benchmark to effectively evaluate the performance of systems in multi-modal document retrieval. To address this gap, this work introduces a new benchmark, named MMDocIR, encompassing two distinct tasks: page-level and layout-level retrieval. The former focuses on localizing the most relevant pages within a long document, while the latter targets the detection of specific layouts, offering finer granularity than whole-page analysis. A layout can refer to a variety of elements such as textual paragraphs, equations, figures, tables, or charts. The MMDocIR benchmark comprises a rich dataset featuring expertly annotated labels for 1,685 questions and bootstrapped labels for 173,843 questions, making it a pivotal resource for advancing multi-modal document retrieval for both training and evaluation. Through rigorous experiments, we reveal that (i) visual retrievers significantly outperform their text counterparts, (ii) the MMDocIR training set can effectively benefit the training of multi-modal document retrieval models, and (iii) text retrievers leveraging VLM-text perform much better than those using OCR-text. These findings underscore the potential advantages of integrating visual elements for multi-modal document retrieval.
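As a small illustration of how the page-level task might be scored, the sketch below computes recall@k over ranked page predictions. The data layout (lists of gold page ids per question paired with ranked predictions) is an assumption for illustration; the benchmark's official evaluation protocol may use additional metrics.

```python
def recall_at_k(gold_pages, ranked_pages, k=5):
    """Fraction of gold pages that appear in the top-k ranked pages."""
    top_k = set(ranked_pages[:k])
    return sum(1 for p in gold_pages if p in top_k) / len(gold_pages)

def mean_recall_at_k(examples, k=5):
    scores = [recall_at_k(gold, ranked, k) for gold, ranked in examples]
    return sum(scores) / len(scores)

examples = [
    ([3], [3, 7, 1, 9, 2]),        # gold page 3 ranked first
    ([4, 12], [12, 5, 4, 8, 0]),   # both gold pages within the top 5
]
print(mean_recall_at_k(examples, k=5))  # -> 1.0
```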
An Automatic Graph Construction Framework based on Large Language Models for Recommendation
Shan, Rong, Lin, Jianghao, Zhu, Chenxu, Chen, Bo, Zhu, Menghui, Zhang, Kangning, Zhu, Jieming, Tang, Ruiming, Yu, Yong, Zhang, Weinan
Graph neural networks (GNNs) have emerged as state-of-the-art methods for learning from graph-structured data for recommendation. However, most existing GNN-based recommendation methods focus on the optimization of model structures and learning strategies based on pre-defined graphs, neglecting the importance of the graph construction stage. Earlier works on graph construction usually rely on specific rules or crowdsourcing, which are either too simplistic or too labor-intensive. Recent works have started to utilize large language models (LLMs) to automate graph construction, in view of their abundant open-world knowledge and remarkable reasoning capabilities. Nevertheless, they generally suffer from two limitations: (1) invisibility of the global view (e.g., overlooking contextual information) and (2) construction inefficiency. To this end, we introduce AutoGraph, an automatic graph construction framework based on LLMs for recommendation. Specifically, we first use LLMs to infer user preferences and item knowledge, which are encoded as semantic vectors. Next, we employ vector quantization to extract latent factors from the semantic vectors. The latent factors are then incorporated as extra nodes to link the user/item nodes, resulting in a graph with in-depth global-view semantics. We further design metapath-based message aggregation to effectively aggregate the semantic and collaborative information. The framework is model-agnostic and compatible with different backbone models. Extensive experiments on three real-world datasets demonstrate the efficacy and efficiency of AutoGraph compared to existing baseline methods. We have deployed AutoGraph on the Huawei advertising platform and gained a 2.69% improvement on RPM and a 7.31% improvement on eCPM in the online A/B test. AutoGraph is currently used as the main traffic model, serving hundreds of millions of users.
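The quantization step can be pictured as nearest-codebook assignment: each LLM-derived semantic vector is mapped to a discrete latent factor, and the factor id becomes an extra graph node linking the users/items that share it. Codebook training (e.g., k-means) is elided, and all names below are illustrative rather than the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 16))   # 8 latent factors, embedding dim 16
item_vecs = rng.normal(size=(5, 16))  # LLM-derived semantic vectors for 5 items

def quantize(v, codebook):
    """Assign a vector to its nearest latent-factor code."""
    dists = np.linalg.norm(codebook - v, axis=1)
    return int(np.argmin(dists))

# Each item is linked to its latent-factor node; items sharing a factor
# become two hops apart, injecting global-view semantics into the graph.
edges = [(f"item_{i}", f"factor_{quantize(v, codebook)}")
         for i, v in enumerate(item_vecs)]
print(edges)
```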
Predictive Models in Sequential Recommendations: Bridging Performance Laws with Data Quality Insights
Shen, Tingjia, Wang, Hao, Wu, Chuhan, Chin, Jin Yao, Guo, Wei, Liu, Yong, Guo, Huifeng, Lian, Defu, Tang, Ruiming, Chen, Enhong
Sequential Recommendation (SR) plays a critical role in predicting users' sequential preferences. Despite its growing prominence in various industries, the increasing scale of SR models incurs substantial computational costs and unpredictability, challenging developers to manage resources efficiently. Against this backdrop, Scaling Laws have achieved significant success by examining the loss as models scale up. However, there remains a disparity between loss and model performance, which is of greater concern in practical applications. Moreover, as datasets continue to expand, they incorporate repetitive and inefficient data. In response, we introduce the Performance Law for SR models, which aims to theoretically investigate and model the relationship between model performance and data quality. Specifically, we first fit the HR and NDCG metrics to transformer-based SR models. Subsequently, we propose Approximate Entropy (ApEn) to assess data quality, presenting a more nuanced approach than traditional data quantity metrics. Our method enables accurate predictions across various dataset scales and model sizes, demonstrating a strong correlation in large SR models and offering insights into achieving optimal performance for any given model configuration.
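Approximate Entropy is a standard regularity statistic (Pincus, 1991), so its computation can be shown directly; applying it to an interaction sequence encoded as integers is our illustrative choice, and the parameters m and r below are conventional defaults rather than the paper's settings.

```python
import numpy as np

def apen(u, m=2, r=0.5):
    """Approximate Entropy: lower values indicate more regular, repetitive data."""
    u = np.asarray(u, dtype=float)
    N = len(u)

    def phi(m):
        # All overlapping windows of length m.
        x = np.array([u[i:i + m] for i in range(N - m + 1)])
        # Chebyshev distance between every pair of windows.
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        C = (d <= r).mean(axis=1)      # fraction of windows within tolerance r
        return np.log(C).mean()

    return phi(m) - phi(m + 1)

regular = [1, 2, 1, 2, 1, 2, 1, 2, 1, 2]
noisy = np.random.default_rng(0).integers(0, 10, size=10)
print(apen(regular), apen(noisy))  # repetitive sequence yields the lower ApEn
```

Under the abstract's framing, a low-ApEn behavior log signals repetitive, low-information data, which the Performance Law can discount when predicting HR/NDCG.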
Bridging Relevance and Reasoning: Rationale Distillation in Retrieval-Augmented Generation
Jia, Pengyue, Xu, Derong, Li, Xiaopeng, Du, Zhaocheng, Li, Xiangyang, Zhao, Xiangyu, Wang, Yichao, Wang, Yuhao, Guo, Huifeng, Tang, Ruiming
The reranker and generator are two critical components in the Retrieval-Augmented Generation (i.e., RAG) pipeline, responsible for ranking relevant documents and generating responses. However, due to differences in pre-training data and objectives, there is an inevitable gap between the documents ranked as relevant by the reranker and those required by the generator to support answering the query. To address this gap, we propose RADIO, a novel and practical preference alignment framework with RAtionale DIstillatiOn. Specifically, we first propose a rationale extraction method that leverages the reasoning capabilities of Large Language Models (LLMs) to extract the rationales necessary for answering the query. Subsequently, a rationale-based alignment process is designed to rerank the documents based on the extracted rationales and fine-tune the reranker to align preferences. We conduct extensive experiments on two tasks across three datasets to demonstrate the effectiveness of our approach compared to baseline methods. Our code is released online to ease reproduction.
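A toy version of the rationale-based reranking step is shown below: score each retrieved document by how well it covers the rationales extracted for the query, then reorder. Token overlap stands in for the paper's actual scoring, and `extract_rationales` is a placeholder for an LLM call.

```python
def extract_rationales(query):
    # Placeholder: in RADIO this would come from an LLM reasoning over the query.
    return ["rayleigh scattering", "shorter wavelengths scatter more"]

def rationale_score(doc, rationales):
    """Sum of per-rationale token coverage within the document."""
    doc_tokens = set(doc.lower().split())
    return sum(
        len(set(r.split()) & doc_tokens) / len(r.split()) for r in rationales
    )

def rerank(query, docs):
    rationales = extract_rationales(query)
    return sorted(docs, key=lambda d: rationale_score(d, rationales), reverse=True)

docs = [
    "the sky appears blue due to rayleigh scattering of sunlight",
    "oceans cover most of the earth's surface",
]
print(rerank("why is the sky blue?", docs)[0])
```

In the full framework, the rationale-induced ordering would then serve as the supervision signal for fine-tuning the reranker, closing the preference gap with the generator.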
LIBER: Lifelong User Behavior Modeling Based on Large Language Models
Zhu, Chenxu, Quan, Shigang, Chen, Bo, Lin, Jianghao, Cai, Xiaoling, Zhu, Hong, Li, Xiangyang, Xi, Yunjia, Zhang, Weinan, Tang, Ruiming
CTR prediction plays a vital role in recommender systems. Recently, large language models (LLMs) have been applied in recommender systems due to their emergent abilities. While leveraging semantic information from LLMs has shown some improvements in the performance of recommender systems, two notable limitations persist in these studies. First, LLM-enhanced recommender systems encounter challenges in extracting valuable information from lifelong user behavior sequences within textual contexts for recommendation tasks. Second, the inherent variability in human behaviors leads to a constant stream of new behaviors and irregularly fluctuating user interests. This characteristic imposes two significant challenges on existing models. On the one hand, it is difficult for LLMs to effectively capture the dynamic shifts in user interests within these sequences; on the other hand, substantial computational overhead arises if the LLM must be called again upon each update to the user sequence. In this work, we propose Lifelong User Behavior Modeling (LIBER) based on large language models, which includes three modules: (1) User Behavior Streaming Partition (UBSP), (2) User Interest Learning (UIL), and (3) User Interest Fusion (UIF). Initially, UBSP is employed to condense lengthy user behavior sequences into shorter partitions in an incremental paradigm, facilitating more efficient processing. Subsequently, UIL leverages LLMs in a cascading way to infer insights from these partitions. Finally, UIF integrates the textual outputs generated by the aforementioned processes to construct a comprehensive representation, which can be incorporated by any recommendation model to enhance performance. LIBER has been deployed on Huawei's music recommendation service, improving users' play count and play time by 3.01% and 7.69%, respectively.
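The incremental partitioning idea can be sketched in a few lines: new behaviors accumulate in the current partition until it reaches a size cap, at which point the partition is closed and (in LIBER) handed to the LLM for interest summarization, so the LLM never reprocesses the whole lifelong sequence. The cap-based closing rule and class names are our assumptions about UBSP, not its published specification.

```python
class BehaviorStreamPartitioner:
    """Incrementally splits a behavior stream into fixed-size partitions."""

    def __init__(self, max_len=5):
        self.max_len = max_len
        self.closed = []   # partitions ready for LLM interest summarization
        self.current = []

    def add(self, behavior):
        self.current.append(behavior)
        if len(self.current) >= self.max_len:
            self.closed.append(self.current)
            self.current = []  # start a fresh partition; old ones need no re-processing

p = BehaviorStreamPartitioner(max_len=3)
for item in ["song_a", "song_b", "song_c", "song_d"]:
    p.add(item)
print(p.closed, p.current)  # [['song_a', 'song_b', 'song_c']] ['song_d']
```

Because each closed partition is summarized once and cached, only the newest partition triggers an LLM call when the sequence grows, which is how the design avoids the recurrent-call overhead described above.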