Xiong, Haoyi
Evaluating LLM-based Agents for Multi-Turn Conversations: A Survey
Guan, Shengyue, Xiong, Haoyi, Wang, Jindong, Bian, Jiang, Zhu, Bin, Lou, Jian-guang
This survey examines evaluation methods for large language model (LLM)-based agents in multi-turn conversational settings. Using a PRISMA-inspired framework, we systematically reviewed nearly 250 scholarly sources, capturing the state of the art from various venues of publication, and establishing a solid foundation for our analysis. Our study offers a structured approach by developing two interrelated taxonomy systems: one that defines \emph{what to evaluate} and another that explains \emph{how to evaluate}. The first taxonomy identifies key components of LLM-based agents for multi-turn conversations and their evaluation dimensions, including task completion, response quality, user experience, memory and context retention, as well as planning and tool integration. These components ensure that the performance of conversational agents is assessed in a holistic and meaningful manner. The second taxonomy system focuses on the evaluation methodologies. It categorizes approaches into annotation-based evaluations, automated metrics, hybrid strategies that combine human assessments with quantitative measures, and self-judging methods utilizing LLMs. This framework not only captures traditional metrics derived from language understanding, such as BLEU and ROUGE scores, but also incorporates advanced techniques that reflect the dynamic, interactive nature of multi-turn dialogues.
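The "automated metrics" category above can be illustrated with a minimal stdlib sketch of unigram-overlap scores in the spirit of BLEU and ROUGE; real evaluations use the full metrics (brevity penalty, higher n-gram orders, stemming), so treat this as a toy illustration only:

```python
from collections import Counter

def ngram_overlap(candidate, reference, n=1):
    """Count clipped n-gram matches between candidate and reference tokens."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    return sum(min(count, ref[g]) for g, count in cand.items())

def bleu1(candidate, reference):
    """Unigram precision: matched candidate tokens / candidate length."""
    if not candidate:
        return 0.0
    return ngram_overlap(candidate, reference) / len(candidate)

def rouge1_recall(candidate, reference):
    """Unigram recall: matched reference tokens / reference length."""
    if not reference:
        return 0.0
    return ngram_overlap(candidate, reference) / len(reference)

ref = "the agent booked a table for two".split()
hyp = "the agent booked a table".split()
print(bleu1(hyp, ref))          # 1.0 -- every candidate token appears in the reference
print(rouge1_recall(hyp, ref))  # ~0.714 -- 5 of 7 reference tokens covered
```

Such surface-overlap scores are exactly what the survey argues must be complemented by interaction-aware methods in multi-turn settings, since a response can score highly against a reference while derailing the dialogue.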
Interpretable Feature Interaction via Statistical Self-supervised Learning on Tabular Data
Zhang, Xiaochen, Xiong, Haoyi
In high-dimensional and high-stakes contexts, ensuring both rigorous statistical guarantees and interpretability in feature extraction from complex tabular data remains a formidable challenge. Traditional methods such as Principal Component Analysis (PCA) reduce dimensionality and identify key features that explain the most variance, but are constrained by their reliance on linear assumptions. In contrast, neural networks offer assumption-free feature extraction through self-supervised learning techniques such as autoencoders, though their interpretability remains a challenge in fields requiring transparency. To address this gap, this paper introduces Spofe, a novel self-supervised machine learning pipeline that marries the power of kernel principal components for capturing nonlinear dependencies with a sparse and principled polynomial representation to achieve clear interpretability with statistical rigor. Underpinning our approach is a robust theoretical framework that delivers precise error bounds and rigorous false discovery rate (FDR) control via a multi-objective knockoff selection procedure; it effectively bridges the gap between data-driven complexity and statistical reliability via three stages: (1) generating self-supervised signals using kernel principal components to model complex patterns, (2) distilling these signals into sparse polynomial functions for improved interpretability, and (3) applying a multi-objective knockoff selection procedure with significance testing to rigorously identify important features. Extensive experiments on diverse real-world datasets demonstrate the effectiveness of Spofe, consistently surpassing KPCA, SKPCA, and other methods in feature selection for regression and classification tasks. Visualization and case studies highlight its ability to uncover key insights, enhancing interpretability and practical utility.
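Stage (3) of the pipeline, knockoff-based selection with FDR control, can be sketched with stdlib Python. This toy uses permutation copies as knockoffs and absolute correlation as the feature statistic, which is far simpler than Spofe's multi-objective procedure; the threshold rule is the standard knockoff-filter form:

```python
import random
import statistics

def knockoff_select(X, y, q=0.2, seed=0):
    """Toy knockoff filter: for each feature, compare its |correlation| with y
    against that of a permuted (knockoff) copy, then pick the smallest
    threshold whose estimated false discovery proportion is below q."""
    rng = random.Random(seed)
    def abs_corr(col):
        mx, my = statistics.fmean(col), statistics.fmean(y)
        num = sum((a - mx) * (b - my) for a, b in zip(col, y))
        dx = sum((a - mx) ** 2 for a in col) ** 0.5
        dy = sum((b - my) ** 2 for b in y) ** 0.5
        return abs(num / (dx * dy)) if dx and dy else 0.0
    W = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        knock = col[:]
        rng.shuffle(knock)                       # permutation breaks the link to y
        W.append(abs_corr(col) - abs_corr(knock))
    for t in sorted({abs(w) for w in W if w}):   # candidate thresholds
        neg = sum(1 for w in W if w <= -t)       # proxy for false discoveries
        pos = sum(1 for w in W if w >= t)
        if pos and neg / pos <= q:
            return [j for j, w in enumerate(W) if w >= t]
    return []

rng = random.Random(1)
y = [rng.gauss(0, 1) for _ in range(200)]
X = [[yi + 0.1 * rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(4)] for yi in y]
picked = knockoff_select(X, y)
print(picked)  # contains feature 0, the truly relevant one
```

The intuition carries over to Spofe: knockoff statistics calibrate how large a feature's importance must be before it is declared significant, which is what delivers the FDR guarantee.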
SOLA-GCL: Subgraph-Oriented Learnable Augmentation Method for Graph Contrastive Learning
Peng, Tianhao, Li, Xuhong, Yuan, Haitao, Li, Yuchen, Xiong, Haoyi
Graph contrastive learning has emerged as a powerful technique for learning graph representations that are robust and discriminative. However, traditional approaches often neglect the critical role of subgraph structures, particularly the intra-subgraph characteristics and inter-subgraph relationships, which are crucial for generating informative and diverse contrastive pairs. These subgraph features matter because they vary significantly across different graph types: in social networks they represent communities, while in biochemical networks they symbolize molecular interactions. To address this issue, our work proposes a novel subgraph-oriented learnable augmentation method for graph contrastive learning, termed SOLA-GCL, which centers around subgraphs and takes full advantage of subgraph information for data augmentation. Specifically, SOLA-GCL initially partitions a graph into multiple densely connected subgraphs based on their intrinsic properties. To preserve and enhance the unique characteristics inherent to subgraphs, a graph view generator optimizes augmentation strategies for each subgraph, thereby generating tailored views for graph contrastive learning. This generator uses a combination of intra-subgraph and inter-subgraph augmentation strategies, including node dropping, feature masking, intra-edge perturbation, inter-edge perturbation, and subgraph swapping. Extensive experiments have been conducted on various graph learning applications, ranging from social networks to molecules, under semi-supervised, unsupervised, and transfer learning settings, demonstrating the superiority of our proposed approach over the state of the art in GCL.
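Two of the named augmentations (node dropping and intra-/inter-edge perturbation) can be sketched over a toy graph; the partition, edge list, and drop ratios below are illustrative assumptions, not SOLA-GCL's learned per-subgraph strategies:

```python
import random

def node_drop(nodes, edges, ratio, rng):
    """Drop a fraction of nodes and every edge touching a dropped node."""
    keep = set(rng.sample(sorted(nodes), k=max(1, int(len(nodes) * (1 - ratio)))))
    return keep, [(u, v) for u, v in edges if u in keep and v in keep]

def edge_perturb(edges, is_intra, ratio, rng):
    """Drop a fraction of edges of one kind (intra- or inter-subgraph),
    leaving edges of the other kind untouched."""
    chosen = [e for e in edges if is_intra(e)]
    others = [e for e in edges if not is_intra(e)]
    kept = rng.sample(chosen, k=int(len(chosen) * (1 - ratio))) if chosen else []
    return others + kept

# Two dense subgraphs {0,1,2} and {3,4,5} joined by one inter-subgraph edge.
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
rng = random.Random(0)
view = edge_perturb(edges, lambda e: part[e[0]] == part[e[1]], ratio=0.3, rng=rng)
# view keeps the inter-subgraph edge (2, 3) and 4 of the 6 intra-subgraph edges
```

In SOLA-GCL the choice of which augmentation to apply, and how strongly, is optimized per subgraph by the view generator rather than fixed as here.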
Knoop: Practical Enhancement of Knockoff with Over-Parameterization for Variable Selection
Zhang, Xiaochen, Cai, Yunfeng, Xiong, Haoyi
Variable selection plays a crucial role in enhancing modeling effectiveness across diverse fields, addressing the challenges posed by high-dimensional datasets of correlated variables. This work introduces a novel approach, namely Knockoff with over-parameterization (Knoop), to enhance Knockoff filters for variable selection. Specifically, Knoop first generates multiple knockoff variables for each original variable and integrates them with the original variables into an over-parameterized Ridgeless regression model. For each original variable, Knoop evaluates the coefficient distribution of its knockoffs and compares these with the original coefficients to conduct an anomaly-based significance test, ensuring robust variable selection. Extensive experiments demonstrate superior performance compared to existing methods on both simulated and real-world datasets. Knoop achieves a notably higher Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve in identifying relevant variables against the ground truth in controlled simulations, while showcasing enhanced predictive accuracy across diverse regression and classification tasks. The analytical results further back up our observations.
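The anomaly-based test can be sketched in stdlib Python. For brevity this toy uses permutation knockoffs and univariate correlation in place of Knoop's multiple knockoffs inside a Ridgeless regression, so the z-scores below only illustrate the "compare against the knockoff distribution" idea:

```python
import random
import statistics

def _abs_corr(a, b):
    """|Pearson correlation| computed with the stdlib only."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = sum((x - ma) ** 2 for x in a) ** 0.5
    db = sum((y - mb) ** 2 for y in b) ** 0.5
    return abs(num / (da * db)) if da and db else 0.0

def knoop_score(x, y, n_knockoffs=50, seed=0):
    """Anomaly-style significance score: compare a variable's association with
    y against the distribution of associations of its permuted knockoff copies."""
    rng = random.Random(seed)
    original = _abs_corr(x, y)
    knock = []
    for _ in range(n_knockoffs):
        k = x[:]
        rng.shuffle(k)                  # knockoff: same marginal, no link to y
        knock.append(_abs_corr(k, y))
    mu, sd = statistics.fmean(knock), statistics.stdev(knock)
    return (original - mu) / (sd or 1e-12)  # large score => likely relevant

rng = random.Random(1)
y = [rng.gauss(0, 1) for _ in range(300)]
signal = [yi + 0.2 * rng.gauss(0, 1) for yi in y]  # truly relevant variable
noise = [rng.gauss(0, 1) for _ in range(300)]      # irrelevant variable
print(knoop_score(signal, y) > knoop_score(noise, y))  # True
```

Generating many knockoffs per variable, as Knoop does, is what makes the per-variable null distribution stable enough for a significance test.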
EICopilot: Search and Explore Enterprise Information over Large-scale Knowledge Graphs with LLM-driven Agents
Yun, Yuhui, Ye, Huilong, Li, Xinru, Li, Ruojia, Deng, Jingfeng, Li, Li, Xiong, Haoyi
The paper introduces EICopilot, a novel agent-based solution enhancing search and exploration of enterprise registration data within extensive online knowledge graphs, such as those detailing legal entities, registered capital, and major shareholders. Traditional methods necessitate text-based queries and manual subgraph explorations, often resulting in time-consuming processes. EICopilot, deployed as a chatbot via Baidu Enterprise Search, improves this landscape by utilizing Large Language Models (LLMs) to interpret natural language queries. This solution automatically generates and executes Gremlin scripts, providing efficient summaries of complex enterprise relationships. Distinctive features include a data pre-processing pipeline that compiles and annotates representative queries into a vector database of examples for in-context learning (ICL), a comprehensive reasoning pipeline combining Chain-of-Thought with ICL to enhance Gremlin script generation for knowledge graph search and exploration, and a novel query masking strategy that improves intent recognition for heightened script accuracy. Empirical evaluations demonstrate the superior performance of EICopilot, including speed and accuracy, over baseline methods, with the \emph{Full Mask} variant achieving a syntax error rate reduction to as low as 10.00% and an execution correctness of up to 82.14%. These components collectively contribute to superior querying capabilities and summarization of intricate datasets, positioning EICopilot as a groundbreaking tool in the exploration and exploitation of large-scale knowledge graphs for enterprise information search.
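The interplay between query masking and ICL example retrieval can be sketched as follows; the example store, Gremlin snippets, and token-overlap "retrieval" are hypothetical stand-ins for EICopilot's annotated vector database and embedding lookup:

```python
# Toy in-context example store: (masked natural-language query, Gremlin script)
# pairs. The scripts are illustrative stand-ins, not EICopilot's templates.
EXAMPLES = [
    ("who are the shareholders of [ENT]",
     "g.V().has('company','name',ENT).in('holds_share')"),
    ("what is the registered capital of [ENT]",
     "g.V().has('company','name',ENT).values('registered_capital')"),
]

def mask_query(query, entities):
    """Query masking: replace entity mentions with a placeholder so retrieval
    and intent recognition depend on query structure, not specific names."""
    for ent in entities:
        query = query.replace(ent, "[ENT]")
    return query

def retrieve_examples(masked, k=1):
    """Rank stored examples by token overlap with the masked query
    (a crude stand-in for the vector-database similarity search)."""
    q = set(masked.split())
    ranked = sorted(EXAMPLES, key=lambda ex: -len(q & set(ex[0].split())))
    return ranked[:k]

masked = mask_query("who are the shareholders of Acme Ltd", ["Acme Ltd"])
print(masked)  # who are the shareholders of [ENT]
print(retrieve_examples(masked)[0][1])
```

The retrieved pairs would then be placed in the LLM prompt alongside Chain-of-Thought instructions, and the real entity name substituted back into the generated script before execution.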
Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection
Bai, Lichen, Shao, Shitong, Zhou, Zikai, Qi, Zipeng, Xu, Zhiqiang, Xiong, Haoyi, Xie, Zeke
Figure 1: The qualitative results of Z-Sampling demonstrate the effectiveness of our method in various aspects, such as style, position, color, counting, text rendering, and object co-occurrence.

Diffusion models, the most popular generative paradigm so far, can inject conditional information into the generation path to guide the latent towards desired directions. However, existing text-to-image diffusion models often fail to maintain high image quality and high prompt-image alignment for challenging prompts. To mitigate this issue and enhance existing pretrained diffusion models, we make three main contributions in this paper. First, we propose diffusion self-reflection, which alternately performs denoising and inversion, and demonstrate with theoretical and empirical evidence that such self-reflection can leverage the guidance gap between denoising and inversion to capture prompt-related semantic information. Second, motivated by this theoretical analysis, we derive Zigzag Diffusion Sampling (Z-Sampling), a novel self-reflection-based diffusion sampling method that leverages the guidance gap between denoising and inversion to accumulate semantic information step by step along the sampling path, leading to improved sampling results. Moreover, as a plug-and-play method, Z-Sampling can be generally applied to various diffusion models (e.g., accelerated ones and Transformer-based ones) with very limited coding and computational costs. Third, our extensive experiments demonstrate that Z-Sampling can generally and significantly enhance generation quality across various benchmark datasets, diffusion models, and performance evaluation metrics. Moreover, Z-Sampling can further enhance existing diffusion models when combined with other orthogonal methods, including Diffusion-DPO.
One key ability of diffusion models is to guide the sampling path based on some conditions (e.g., texts), leading to conditional or controllable generation (Ho & Salimans, 2022). However, while strong guidance may improve semantic alignment with challenging prompts, it often causes a significant decline in image fidelity, leading to mode collapse and resulting in an inevitable accumulation of errors during the sampling process (Chung et al., 2024). To mitigate this issue, some studies apply additional manifold constraints to the sampling paths (Chung et al., 2024; Yang et al.;
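The zigzag schedule of strongly guided denoising, weakly guided inversion, and a second denoising pass can be sketched with a toy one-dimensional "denoiser". The update rules and the target value are illustrative assumptions, not the paper's actual diffusion model:

```python
def z_sampling(x, steps, denoise, invert, s_denoise=5.0, s_invert=0.0):
    """Zigzag schedule sketch: denoise with strong guidance, re-noise (invert)
    with weak guidance, then denoise again, so the guidance gap accumulates
    prompt-related signal along the sampling path."""
    for t in range(steps, 0, -1):
        x = denoise(x, t, s_denoise)   # step toward the data manifold
        x = invert(x, t, s_invert)     # zig back up with weaker guidance
        x = denoise(x, t, s_denoise)   # zag down again, keeping the gap's gain
    return x

TARGET = 1.0  # stands in for the prompt-aligned mode

def denoise(x, t, s):
    # toy guided denoiser: pulls x toward TARGET, harder with larger guidance s
    return x + 0.1 * (1 + 0.1 * s) * (TARGET - x)

def invert(x, t, s):
    # toy inversion: pushes x back toward noise (away from TARGET)
    return x - 0.1 * (1 + 0.1 * s) * (TARGET - x)

def plain_sampling(x, steps):
    for t in range(steps, 0, -1):
        x = denoise(x, t, 5.0)
    return x

z = z_sampling(0.0, 10, denoise, invert)
p = plain_sampling(0.0, 10)
print(abs(TARGET - z) < abs(TARGET - p))  # True: zigzagging converges faster
```

In this caricature each zigzag step shrinks the error by 0.85 * 1.1 * 0.85 ≈ 0.79 versus 0.85 for a plain step, mirroring the claim that the gap between the denoising and inversion guidance scales is what yields the net gain.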
Pre-trained Molecular Language Models with Random Functional Group Masking
Peng, Tianhao, Li, Yuchen, Li, Xuhong, Bian, Jiang, Xie, Zeke, Sui, Ning, Mumtaz, Shahid, Xu, Yanwu, Kong, Linghe, Xiong, Haoyi
Recent advancements in computational chemistry have leveraged the power of transformer-based language models, such as MoLFormer, pre-trained on vast amounts of simplified molecular-input line-entry system (SMILES) sequences, to understand and predict molecular properties and activities, a critical step in fields like drug discovery and materials science. To further improve performance, researchers have introduced graph neural networks with graph-based molecular representations, such as GEM, incorporating the topology, geometry, and 2D or even 3D structures of molecules into pre-training. Since most molecular graphs in existing studies were automatically converted from SMILES sequences, it is reasonable to assume that transformer-based language models might be able to implicitly learn structure-aware representations from SMILES sequences alone. In this paper, we propose \ours{} -- a SMILES-based \underline{\em M}olecular \underline{\em L}anguage \underline{\em M}odel that randomly masks SMILES subsequences corresponding to specific molecular \underline{\em F}unctional \underline{\em G}roups to incorporate structural information about atoms during the pre-training phase. This technique compels the model to better infer molecular structures and properties, thus enhancing its predictive capabilities. Extensive experimental evaluations across 11 benchmark classification and regression tasks in the chemical domain demonstrate the robustness and superiority of \ours{}. Our findings reveal that \ours{} outperforms existing pre-training models, whether based on SMILES or graphs, in 9 of the 11 downstream tasks, ranking a close second in the remaining two.
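The functional-group masking step can be sketched with stdlib string matching. The group patterns below are a tiny illustrative set; real pipelines would match chemically meaningful groups (e.g. via SMARTS patterns in a toolkit like RDKit) rather than raw substrings:

```python
import random
import re

# Illustrative SMILES substrings for a few groups, longest patterns first so a
# carboxyl "C(=O)O" is not shadowed by its constituent "O".
GROUPS = ["C(=O)O", "C(=O)N", "O", "N"]

def mask_functional_groups(smiles, ratio=0.5, seed=0):
    """Randomly mask SMILES subsequences that correspond to functional groups,
    leaving the rest of the string intact."""
    rng = random.Random(seed)
    spans = []
    for g in GROUPS:
        for m in re.finditer(re.escape(g), smiles):
            # skip matches overlapping an already-claimed (longer) group
            if not any(s < m.end() and m.start() < e for s, e in spans):
                spans.append((m.start(), m.end()))
    chosen = [sp for sp in spans if rng.random() < ratio]
    for s, e in sorted(chosen, reverse=True):   # replace right-to-left
        smiles = smiles[:s] + "[MASK]" + smiles[e:]
    return smiles

# Aspirin: both carboxyl-like groups get masked when ratio=1.0.
print(mask_functional_groups("CC(=O)Oc1ccccc1C(=O)O", ratio=1.0))
# C[MASK]c1ccccc1[MASK]
```

Pre-training then asks the model to reconstruct the masked group tokens, which is what pushes it toward structure-aware representations of SMILES.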
IV-Mixed Sampler: Leveraging Image Diffusion Models for Enhanced Video Synthesis
Shao, Shitong, Zhou, Zikai, Bai, Lichen, Xiong, Haoyi, Xie, Zeke
The multi-step sampling mechanism, a key feature of visual diffusion models, has significant potential to replicate the success of OpenAI's Strawberry in enhancing performance by increasing the inference computational cost. Prior studies have demonstrated that correctly scaling up computation in the sampling process can lead to improved generation quality, enhanced image editing, and compositional generalization. While there have been rapid advancements in developing inference-heavy algorithms for improved image generation, relatively little work has explored inference scaling laws in video diffusion models (VDMs). Furthermore, existing research shows only minimal performance gains that are perceptible to the naked eye. To address this, we design a novel training-free algorithm, IV-Mixed Sampler, that leverages the strengths of image diffusion models (IDMs) to help VDMs surpass their current capabilities. The core of IV-Mixed Sampler is to use IDMs to significantly enhance the quality of each video frame while VDMs ensure the temporal coherence of the video during the sampling process. Our experiments demonstrate that IV-Mixed Sampler achieves state-of-the-art performance on 4 benchmarks: UCF-101-FVD, MSR-VTT-FVD, Chronomagic-Bench-150, and Chronomagic-Bench-1649. For example, the open-source Animatediff with IV-Mixed Sampler reduces the UMT-FVD score from 275.2 to 228.6, approaching the 223.1 achieved by the closed-source Pika-2.0.
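The alternation of per-frame image refinement and joint temporal denoising can be sketched with scalar "frames"; the two toy update rules stand in for real IDM and VDM denoising steps and are purely illustrative:

```python
def iv_mixed_step(frames, idm_step, vdm_step):
    """One IV-mixed sampling step: an image model refines each frame
    independently, then a video model smooths the frames jointly to
    restore temporal coherence."""
    frames = [idm_step(f) for f in frames]  # image model: per-frame quality boost
    return vdm_step(frames)                 # video model: joint temporal smoothing

TARGET = 1.0  # stands in for a clean, high-quality frame

def idm_step(f):
    # toy image denoiser: halves each frame's distance to the clean target
    return f + 0.5 * (TARGET - f)

def vdm_step(frames):
    # toy video denoiser: averages each frame with its neighbors for coherence
    n = len(frames)
    return [(frames[max(i - 1, 0)] + frames[i] + frames[min(i + 1, n - 1)]) / 3
            for i in range(n)]

frames = [0.0, 0.4, 0.8]  # three noisy frames at different quality levels
for _ in range(5):
    frames = iv_mixed_step(frames, idm_step, vdm_step)
# all frames end up close to TARGET and close to each other
```

The sketch captures the division of labor: the IDM term drives each frame toward high quality, while the VDM term keeps the frames mutually consistent across time.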
Pre-trained Graphformer-based Ranking at Web-scale Search (Extended Abstract)
Li, Yuchen, Xiong, Haoyi, Kong, Linghe, Sun, Zeyi, Chen, Hongyang, Wang, Shuaiqiang, Yin, Dawei
Both Transformers and Graph Neural Networks (GNNs) have been employed in the domain of learning to rank (LTR). However, these approaches adhere to two distinct yet complementary problem formulations: ranking score regression based on query-webpage pairs, and link prediction within query-webpage bipartite graphs, respectively. While it is possible to pre-train GNNs or Transformers on source datasets and subsequently fine-tune them on sparsely annotated LTR datasets, the distributional shifts between the pair-based and bipartite-graph domains present significant challenges in integrating these heterogeneous models into a unified LTR framework at web scale. To address this, we introduce the novel MPGraf model, which leverages a modular and capsule-based pre-training strategy, aiming to cohesively integrate the regression capabilities of Transformers with the link prediction capabilities of GNNs.

Although Graphformer [Yang et al., 2021] has been proposed to combine the advantages of GNNs and Transformers for representation learning with textual graphs, there is still a lack of joint efforts from the two domains (i.e., query-webpage pairs and graphs) in LTR. To improve the performance of over-parameterized models like Transformers or GNNs, the paradigm of pre-training and fine-tuning has been extensively employed [Liao et al., 2024; Chen et al., 2024g; Chen et al., 2022; Song et al., 2024; Lyu et al., 2023]. This involves first training the models on large-scale source datasets in an unsupervised or self-supervised manner to develop their core representation learning capabilities [Qiang et al., 2023; Xiong et al., 2024a; Xiong et al., 2024b; Lyu et al., 2020]. Subsequently, the pre-trained models can be fine-tuned using a small number of annotated samples from the target datasets [Kirichenko et al., 2022; Huang et al., 2021; Chen et al., 2023e; Chen et al., 2023d; Chen et al., 2023b]. However, such a paradigm cannot be easily followed by LTR models that leverage both query-webpage pairs and graphs together.
Generative Pre-trained Ranking Model with Over-parameterization at Web-Scale (Extended Abstract)
Li, Yuchen, Xiong, Haoyi, Kong, Linghe, Bian, Jiang, Wang, Shuaiqiang, Chen, Guihai, Yin, Dawei
The optimization of the user experience, achieved by catering to information needs, largely depends on the effective sorting of retrieved content. In this realm, Learning to Rank (LTR) becomes instrumental, requiring a considerable amount of query-webpage pairings with relevancy scores for effective supervised LTR [Li et al., 2023b; Qin and Liu, 2013; Li et al., 2023c; Lyu et al., 2020; Peng et al., 2024; Wang et al., 2024b]. Nevertheless, the commonplace scarcity of well-described query-webpage pairings often compels semi-supervised LTR, harnessing both labeled and unlabeled samples for the process [Szummer and Yilmaz, 2011; Zhang et al., 2016; Zhu et al., 2023; Peng et al., 2023].

Learning to rank (LTR) is widely employed in web searches to prioritize pertinent webpages from retrieved content based on input queries. However, traditional LTR models encounter two principal obstacles that lead to suboptimal performance: (1) the lack of well-annotated query-webpage pairs with ranking scores covering a diverse range of search query popularities, which hampers their ability to address queries across the popularity spectrum, and (2) inadequately trained models that fail to induce generalized representations for LTR, resulting in overfitting. To address these challenges, we propose