AURORA: Automated Training Framework of Universal Process Reward Models via Ensemble Prompting and Reverse Verification
Tan, Xiaoyu, Yao, Tianchu, Qu, Chao, Li, Bin, Yang, Minghao, Lu, Dakuan, Wang, Haozhe, Qiu, Xihe, Chu, Wei, Xu, Yinghui, Qi, Yuan
The reasoning capabilities of advanced large language models (LLMs) like o1 have revolutionized artificial intelligence applications. Nevertheless, evaluating and optimizing complex reasoning processes remain significant challenges due to diverse policy distributions and the inherent limitations of human effort and accuracy. In this paper, we present AURORA, a novel automated framework for training universal process reward models (PRMs) via ensemble prompting and reverse verification. The framework employs a two-phase approach: first, it uses diverse prompting strategies and ensemble methods to perform automated annotation and evaluation of reasoning processes, ensuring robust assessments for reward learning; second, it leverages practical reference answers for reverse verification, enhancing the model's ability to validate outputs and improving training accuracy. To assess the framework's performance, we extend beyond the existing ProcessBench benchmark by introducing UniversalBench, which evaluates reward predictions across full trajectories under diverse policy distributions with long Chain-of-Thought (CoT) outputs. Experimental results demonstrate that AURORA enhances process evaluation accuracy and improves PRMs' accuracy on diverse policy distributions and long-CoT responses. The project will be open-sourced at https://auroraprm.github.io/. The Universal-PRM-7B model is available at https://huggingface.co/infly/Universal-PRM-7B.
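To make the two-phase approach concrete, here is a minimal sketch based only on the abstract: phase one aggregates per-step verdicts from several prompting strategies by majority vote, and phase two checks the trajectory's final answer against a reference. The function names, the vote aggregation, and the "demote the last step" rule are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of AURORA's two-phase annotation (assumptions noted above).
from collections import Counter
from typing import Callable, List

def ensemble_step_label(
    step: str,
    judges: List[Callable[[str], str]],  # each judge wraps a different prompt strategy
) -> str:
    """Phase 1: aggregate per-step verdicts ('correct'/'incorrect') by majority vote."""
    votes = Counter(judge(step) for judge in judges)
    return votes.most_common(1)[0][0]

def reverse_verify(final_answer: str, reference_answer: str) -> bool:
    """Phase 2: compare the trajectory's final answer with a reference answer."""
    return final_answer.strip() == reference_answer.strip()

def annotate_trajectory(steps, final_answer, reference_answer, judges):
    labels = [ensemble_step_label(s, judges) for s in steps]
    if not reverse_verify(final_answer, reference_answer):
        # Simplifying assumption: a trajectory failing reverse verification
        # has at least one faulty step; mark the final step as incorrect.
        labels[-1] = "incorrect"
    return labels
```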
An Attentive Dual-Encoder Framework Leveraging Multimodal Visual and Semantic Information for Automatic OSAHS Diagnosis
Wei, Yingchen, Qiu, Xihe, Tan, Xiaoyu, Huang, Jingjing, Chu, Wei, Xu, Yinghui, Qi, Yuan
Obstructive sleep apnea-hypopnea syndrome (OSAHS) [1] affects about 27% of adults [2], causing poor sleep, daytime dysfunction, and higher risks of cardiovascular diseases and diabetes [3]. The standard diagnostic method, polysomnography (PSG) [4], is complex, costly, and uncomfortable, requiring multi-channel monitoring (EEG, ECG, heart rate [5]) and trained technicians (Figure 1). Data-driven methods for automated OSAHS diagnosis can improve efficiency and reduce costs. Facial features like a flat nasal bridge, wide jawbone, thick neck, and mandibular retrognathia correlate with OSAHS severity [6], providing visual indicators of airway obstruction and sleep disturbances. Deep learning can analyze these features for early diagnosis and personalized treatment. Our key contributions are as follows: (1) Introducing VTA-OSAHS, a multimodal framework for diagnosing OSAHS severity by combining visual and language data, and using a pre-trained language model to extract key information from basic physiological data for improved classification accuracy; (2) Developing a visual encoder that focuses on specific facial features associated with OSAHS, employing attention mesh and stochastic gates for better clinical decision alignment; (3) Implementing a data pre-processing strategy to handle imbalanced samples and ordinal classification, using RandomOverSampler (ROS) [17] and an ordinal regression loss function [18] to enhance accuracy and robustness; (4) Demonstrating …
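The following PyTorch sketch illustrates two of the ingredients named above: a gated, attention-based visual branch and an ordinal regression loss that respects severity ordering. Layer sizes, the sigmoid-noise form of the stochastic gates, and the cumulative-binary ordinal loss are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of a gated visual encoder + ordinal loss (assumptions noted above).
import torch
import torch.nn as nn

class GatedVisualEncoder(nn.Module):
    def __init__(self, n_features: int, dim: int = 64):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(n_features))  # stochastic-gate means
        self.proj = nn.Linear(n_features, dim)
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, x):                        # x: (batch, n_features) facial features
        noise = torch.randn_like(self.mu) * 0.5
        gates = torch.sigmoid(self.mu + noise)   # soft selection of facial features
        h = self.proj(x * gates).unsqueeze(1)    # (batch, 1, dim)
        h, _ = self.attn(h, h, h)                # attention over gated features
        return h.squeeze(1)

def ordinal_loss(logits, y, n_classes: int):
    """Treat severity as ordered: K-1 cumulative binary targets 'y > k'."""
    # logits: (batch, n_classes - 1); y: (batch,) integer severity levels
    targets = (y.unsqueeze(1) > torch.arange(n_classes - 1)).float()
    return nn.functional.binary_cross_entropy_with_logits(logits, targets)
```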
Adaptive Learning on User Segmentation: Universal to Specific Representation via Bipartite Neural Interaction
Tan, Xiaoyu, Deng, Yongxin, Qu, Chao, Xue, Siqiao, Shi, Xiaoming, Zhang, James, Qiu, Xihe
Recently, models for user representation learning have been widely applied in click-through rate (CTR) and conversion rate (CVR) prediction. Typically, the model learns a universal user representation as the input to subsequent scenario-specific models. However, in numerous industrial applications (e.g., recommendation and marketing), businesses often run these applications as online activities targeting different user segments, which are typically defined by domain experts. Due to differences in user distribution (i.e., user segmentation) and business objectives across downstream tasks, learning solely from the universal representation can harm both model performance and robustness. In this paper, we propose a novel learning framework that first learns a general universal user representation through an information bottleneck, and then merges and learns a segment-specific or task-specific representation through neural interaction. We design the interactive learning process by leveraging a bipartite graph architecture to model representation learning and merging between contextual clusters and each user segment. Our proposed method is evaluated on two open-source benchmarks and two offline business datasets, and is deployed in two online marketing applications to predict users' CVR. The results demonstrate that our method achieves superior performance and surpasses baseline methods.
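A toy PyTorch sketch of the bipartite interaction described above: learned contextual-cluster embeddings exchange one round of messages with a segment embedding, and the result is merged with the universal user representation. The attention-style message function and single passing round are assumptions made for clarity.

```python
# Illustrative bipartite cluster-segment interaction (assumptions noted above).
import torch
import torch.nn as nn

class BipartiteInteraction(nn.Module):
    def __init__(self, n_clusters: int, n_segments: int, dim: int):
        super().__init__()
        self.clusters = nn.Embedding(n_clusters, dim)   # contextual-cluster nodes
        self.segments = nn.Embedding(n_segments, dim)   # user-segment nodes
        self.merge = nn.Linear(2 * dim, dim)

    def forward(self, universal_repr, segment_id):
        # One message-passing round across the bipartite graph:
        # the segment node attends to all cluster nodes.
        seg = self.segments(segment_id)                          # (batch, dim)
        attn = torch.softmax(seg @ self.clusters.weight.T, dim=-1)
        ctx = attn @ self.clusters.weight                        # cluster summary
        seg_specific = seg + ctx
        # Merge universal and segment-specific views for the downstream CVR model.
        return self.merge(torch.cat([universal_repr, seg_specific], dim=-1))
```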
FedSlate: A Federated Deep Reinforcement Learning Recommender System
Deng, Yongxin, Tan, Xiaoyu, Qiu, Xihe, Jin, Yaochu
Reinforcement learning methods have been used to optimize long-term user engagement in recommendation systems. However, existing reinforcement learning-based recommendation systems do not fully exploit the relevance of individual user behavior across different platforms. One potential solution is to aggregate data from various platforms in a centralized location and use the aggregated data for training. However, this approach raises economic and legal concerns, including increased communication costs and potential threats to user privacy. To address these challenges, we propose FedSlate, a federated reinforcement learning recommendation algorithm that effectively utilizes cross-platform information that is legally prohibited from being shared directly. We employ the SlateQ algorithm to assist FedSlate in learning users' long-term behavior and evaluating the value of recommended content. We extend the application scope of recommendation systems from single-user single-platform to single-user multi-platform and address cross-platform learning challenges by introducing federated learning. We use RecSim to construct a simulation environment for evaluating FedSlate and compare its performance with state-of-the-art benchmark recommendation models. Experimental results demonstrate that FedSlate outperforms baseline methods in various environment settings and facilitates the learning of recommendation strategies in scenarios where baseline methods are entirely inapplicable. Code is available at https://github.com/TianYaDY/FedSlate.
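A hedged sketch of the federated setup the abstract describes: each platform trains a local Q-network on its own interaction logs, and only model parameters (never user data) leave a platform. FedAvg-style parameter averaging is used here as a stand-in; the `platforms` objects, `train_locally` hook, and the averaging rule are assumptions, and the paper's actual cross-platform exchange may differ.

```python
# FedAvg-style round over per-platform SlateQ learners (assumptions noted above).
from typing import Dict, List
import torch

def fed_avg(local_states: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    """Average parameter tensors across platforms, key by key.
    Assumes all state-dict entries are float tensors."""
    return {
        key: torch.stack([s[key] for s in local_states]).mean(dim=0)
        for key in local_states[0]
    }

def federated_round(platforms, global_state):
    local_states = []
    for p in platforms:
        p.q_network.load_state_dict(global_state)  # start from the global model
        p.train_locally()                          # SlateQ updates on local logs only
        local_states.append(p.q_network.state_dict())
    return fed_avg(local_states)                   # only parameters leave a platform
```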
Struct-X: Enhancing Large Language Models Reasoning with Structured Data
Tan, Xiaoyu, Wang, Haoyu, Qiu, Xihe, Cheng, Yuan, Xu, Yinghui, Chu, Wei, Qi, Yuan
Structured data, rich in logical and relational information, has the potential to enhance the reasoning abilities of large language models (LLMs). Still, its integration poses a challenge due to the risk of overwhelming LLMs with excessive tokens and irrelevant context information. To address this, we propose Struct-X, a novel framework that operates through five key phases, "read-model-fill-reflect-reason", to efficiently enable LLMs to utilize structured data. It begins by encoding structured data into a topological space using graph embeddings, then fills in missing entity information with knowledge retrieval modules, and filters out irrelevant tokens via a self-supervised module. The final phase constructs a topological network from the selected tokens to further reduce the total token length for more effective LLM inference. Additionally, Struct-X includes an Auxiliary Module trained to generate prompts, aiding LLMs in analyzing structured data. Extensive experiments on benchmarks, including knowledge-graph question answering and long-document reading comprehension, show that Struct-X notably improves LLM reasoning, demonstrating the effectiveness of structured data augmentation in improving LLM inference with complex input context.
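A runnable toy of the "read-model-fill-reflect-reason" flow, using deliberately simplistic placeholders: triples stand in for the graph, a dictionary lookup stands in for knowledge retrieval, and keyword overlap stands in for the learned relevance filter. None of this is the authors' implementation; it only shows how data moves through the five phases.

```python
# Schematic of the Struct-X pipeline phases (placeholder logic, assumptions noted above).
def struct_x_pipeline(triples, question, knowledge, llm):
    # read: load structured data as (head, relation, tail) triples
    # fill: complete missing tails via a retrieval lookup
    filled = [(h, r, t if t is not None else knowledge.get((h, r), "?"))
              for h, r, t in triples]
    # reflect: keep only triples sharing a token with the question
    q_tokens = set(question.lower().split())
    kept = [tr for tr in filled
            if q_tokens & {w.lower() for part in tr for w in part.split()}]
    # reason: serialize the compact subgraph into the LLM prompt
    context = "; ".join(f"{h} {r} {t}" for h, r, t in kept)
    return llm(f"Facts: {context}\nQuestion: {question}")

# Toy usage with a stub "LLM" that just echoes its prompt:
triples = [("Paris", "capital_of", "France"), ("Paris", "population", None)]
knowledge = {("Paris", "population"): "2.1M"}
print(struct_x_pipeline(triples, "What is the population of Paris?",
                        knowledge, llm=lambda p: p))
```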
Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models
Qiu, Xihe, Wang, Haoyu, Tan, Xiaoyu, Qu, Chao, Xiong, Yujie, Cheng, Yuan, Xu, Yinghui, Chu, Wei, Qi, Yuan
Effective collaboration in multi-agent systems requires communicating goals and intentions between agents. Current agent frameworks often suffer from dependencies on single-agent execution and lack robust inter-module communication, frequently leading to suboptimal multi-agent reinforcement learning (MARL) policies and inadequate task coordination. To address these challenges, we present a framework for training large language models (LLMs) as collaborative agents to enable coordinated behaviors in cooperative MARL. Each agent maintains a private intention consisting of its current goal and associated sub-tasks. Agents broadcast their intentions periodically, allowing other agents to infer coordination tasks. A propagation network transforms broadcast intentions into teammate-specific communication messages, sharing relevant goals with designated teammates. The architecture of our framework is structured into planning, grounding, and execution modules. During execution, multiple agents interact in a downstream environment and communicate intentions to enable coordinated behaviors. The grounding module dynamically adapts comprehension strategies based on emerging coordination patterns, while feedback from execution agents influences the planning module, enabling dynamic re-planning of sub-tasks. Results in collaborative environment simulations demonstrate that intention propagation reduces miscoordination errors by aligning sub-task dependencies between agents. Agents learn when to communicate intentions and which teammates require task details, resulting in emergent coordinated behaviors. This demonstrates the efficacy of intention sharing for cooperative multi-agent RL based on LLMs.
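A minimal sketch of the intention-propagation idea: each agent holds a private intention (goal plus sub-tasks), broadcasts it, and a propagation step tailors the broadcast into per-teammate messages. Replacing the learned propagation network with a shared-sub-task overlap rule is an illustrative assumption.

```python
# Toy intention broadcast with teammate-specific messages (assumptions noted above).
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Intention:
    goal: str
    subtasks: List[str] = field(default_factory=list)

def propagate(sender: str, intention: Intention,
              teammates: Dict[str, Intention]) -> Dict[str, str]:
    """Turn one broadcast into teammate-specific messages: a teammate only
    receives the sub-tasks that overlap with its own intention."""
    messages = {}
    for name, their in teammates.items():
        shared = [s for s in intention.subtasks if s in their.subtasks]
        if shared:  # stay silent when nothing is relevant to this teammate
            messages[name] = f"{sender} pursues '{intention.goal}'; shared: {shared}"
    return messages

# Toy usage: agent A shares only the sub-task that B also cares about.
a = Intention("deliver package", ["pick up", "open door"])
b = Intention("secure room", ["open door", "lock window"])
print(propagate("A", a, {"B": b}))
```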
Subequivariant Reinforcement Learning Framework for Coordinated Motion Control
Wang, Haoyu, Tan, Xiaoyu, Qiu, Xihe, Qu, Chao
Effective coordination is crucial for motion control with reinforcement learning, especially as the complexity of agents and their motions increases. However, many existing methods struggle to account for the intricate dependencies between joints. We introduce CoordiGraph, a novel architecture that leverages subequivariant principles from physics to enhance coordination in motion control with reinforcement learning. The method embeds equivariance under the influence of gravity as an inherent inductive bias in the learning process, which aids in modeling the nuanced relationships between joints that are vital for motion control. Through extensive experiments with sophisticated agents in diverse environments, we highlight the merits of our approach. Compared to current leading methods, CoordiGraph notably enhances generalization and sample efficiency.
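To illustrate subequivariance, here is a toy PyTorch layer: full rotational symmetry is relaxed to rotations about the gravity axis, so per-joint features are built only from quantities those rotations preserve. The specific invariant set (vertical component and horizontal norm) is a textbook choice for this symmetry group, not necessarily the CoordiGraph parameterization.

```python
# Toy subequivariant joint feature layer (assumptions noted above).
import torch
import torch.nn as nn

GRAVITY = torch.tensor([0.0, 0.0, -1.0])  # world z-axis, pointing down

def subequivariant_invariants(vec):
    """vec: (..., 3) per-joint vectors (e.g., positions or velocities).
    Rotations about the gravity axis preserve the vertical component and
    the horizontal norm, so features built from them are invariant."""
    vertical = (vec * GRAVITY).sum(-1, keepdim=True)   # signed component along gravity
    horizontal = vec - vertical * GRAVITY              # in-plane remainder
    return torch.cat([vertical, horizontal.norm(dim=-1, keepdim=True)], dim=-1)

class SubequivariantJointLayer(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, joint_vecs):                     # (n_joints, 3)
        return self.mlp(subequivariant_invariants(joint_vecs))
```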