Ye, Hai
Finding the Sweet Spot: Preference Data Construction for Scaling Preference Optimization
Xiao, Yao, Ye, Hai, Chen, Linyao, Ng, Hwee Tou, Bing, Lidong, Li, Xiaoli, Lee, Roy Ka-wei
Iterative data generation and model retraining are widely used to align large language models (LLMs). This process typically involves a policy model that generates on-policy responses and a reward model that guides training data selection. Direct Preference Optimization (DPO) further enhances this process by constructing preference pairs of chosen and rejected responses. In this work, we aim to \emph{scale up} the number of on-policy samples via repeated random sampling to improve alignment performance. Conventional practice selects the sample with the highest reward as chosen and the one with the lowest reward as rejected for DPO. However, our experiments reveal that this strategy leads to a \emph{decline} in performance as the sample size increases. To address this, we investigate preference data construction through the lens of the underlying normal distribution of sample rewards. We categorize the reward space into seven representative points and systematically explore all 21 ($C_7^2$) pairwise combinations. Through evaluations on four models using AlpacaEval 2, we find that selecting the rejected response at reward position $\mu - 2\sigma$, rather than at the minimum reward, is crucial for optimal performance. Finally, we introduce a scalable preference data construction strategy that consistently enhances model performance as the sample scale increases.
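To make the pairing rule concrete, here is a minimal Python sketch (our own illustration, not code released with the paper): it estimates the mean and standard deviation of the rewards of the sampled responses and selects the rejected response as the sample whose reward is closest to $\mu - 2\sigma$; taking the highest-reward sample as the chosen response is an assumption made only for illustration.

```python
import numpy as np

def build_preference_pair(responses, rewards):
    """Illustrative sketch, not the paper's exact recipe: chosen = highest-reward
    sample (assumption); rejected = sample whose reward is closest to mu - 2*sigma
    of the empirical reward distribution."""
    rewards = np.asarray(rewards, dtype=float)
    mu, sigma = rewards.mean(), rewards.std()
    chosen_idx = int(rewards.argmax())                      # assumed position for "chosen"
    target = mu - 2.0 * sigma                               # rejected position from the abstract
    rejected_idx = int(np.abs(rewards - target).argmin())   # sample nearest to mu - 2*sigma
    return responses[chosen_idx], responses[rejected_idx]

# Example with 8 sampled responses and dummy reward scores
chosen, rejected = build_preference_pair(
    [f"response_{i}" for i in range(8)],
    [0.10, 0.40, 0.90, 0.30, 0.20, 0.70, 0.50, 0.05],
)
```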
Test-time Computing: from System-1 Thinking to System-2 Thinking
Ji, Yixin, Li, Juntao, Ye, Hai, Wu, Kaixin, Xu, Jia, Mo, Linjian, Zhang, Min
The remarkable performance of the o1 model in complex reasoning demonstrates that scaling test-time computing can further unlock a model's potential, enabling powerful System-2 thinking. However, there is still a lack of comprehensive surveys of test-time computing scaling. We trace the concept of test-time computing back to System-1 models. In System-1 models, test-time computing addresses distribution shifts and improves robustness and generalization through parameter updating, input modification, representation editing, and output calibration. In System-2 models, it enhances the model's reasoning ability to solve complex problems through repeated sampling, self-correction, and tree search. We organize this survey according to the trend from System-1 to System-2 thinking, highlighting the key role of test-time computing in the transition from System-1 models to weak System-2 models, and then to strong System-2 models. We also point out a few possible future directions.
Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration
Ye, Hai, Lin, Mingbao, Ng, Hwee Tou, Yan, Shuicheng
Scaling laws for inference compute in multi-agent systems remain under-explored compared to single-agent scenarios. This work aims to bridge this gap by investigating the problem of data synthesis through multi-agent sampling, where synthetic responses are generated by sampling from multiple distinct language models. Effective model coordination is crucial for successful multi-agent collaboration. Unlike previous approaches that rely on fixed workflows, we treat model coordination as a multi-step decision-making process, optimizing generation structures dynamically for each input question. We introduce Tree Search-based Orchestrated Agents~(TOA), where the workflow evolves iteratively during the sequential sampling process. To achieve this, we leverage Monte Carlo Tree Search (MCTS), integrating a reward model to provide real-time feedback and accelerate exploration. Our experiments on alignment, machine translation, and mathematical reasoning demonstrate that multi-agent sampling significantly outperforms single-agent sampling as inference compute scales. TOA is the most compute-efficient approach, achieving SOTA performance on WMT and a 71.8\% LC win rate on AlpacaEval. Moreover, fine-tuning with our synthesized alignment data surpasses strong preference learning methods on challenging benchmarks such as Arena-Hard and AlpacaEval.
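Very loosely, the reward-guided coordination can be sketched as a flat UCT-style selection loop over candidate models; this is an illustration under our own assumptions (placeholder names MODELS, generate, and reward_model), not the paper's actual tree search over evolving workflows.

```python
import math
import random

MODELS = ["model_a", "model_b", "model_c"]       # placeholders for distinct LLMs

def reward_model(response):                      # placeholder: a learned reward model would score text
    return random.random()

def generate(model, question, context):          # placeholder for calling an LLM with prior responses
    return f"{model} answer to '{question}' using {len(context)} earlier responses"

def reward_guided_sampling(question, budget=30, c=1.4):
    """Flat UCT-style sketch: trade off each model's average reward against
    how often it has been tried, then sample from the best-scoring model."""
    stats = {m: {"n": 0, "reward": 0.0} for m in MODELS}
    context = []
    for t in range(1, budget + 1):
        def score(m):
            s = stats[m]
            if s["n"] == 0:
                return float("inf")              # try every model at least once
            return s["reward"] / s["n"] + c * math.sqrt(math.log(t) / s["n"])
        m = max(MODELS, key=score)
        response = generate(m, question, context)
        stats[m]["n"] += 1
        stats[m]["reward"] += reward_model(response)
        context.append(response)
    return max(context, key=reward_model)
```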
On the Robustness of Question Rewriting Systems to Questions of Varying Hardness
Ye, Hai, Ng, Hwee Tou, Han, Wenjuan
In conversational question answering (CQA), the task of question rewriting~(QR) in context aims to rewrite a context-dependent question into an equivalent self-contained question that gives the same answer. In this paper, we are interested in the robustness of a QR system to questions varying in rewriting hardness or difficulty. Since there is a lack of questions classified based on their rewriting hardness, we first propose a heuristic method to automatically classify questions into subsets of varying hardness, by measuring the discrepancy between a question and its rewrite. To find out what makes questions hard or easy for rewriting, we then conduct a human evaluation to annotate the rewriting hardness of questions. Finally, to enhance the robustness of QR systems to questions of varying hardness, we propose a novel learning framework for QR that first trains a QR model independently on each subset of questions of a certain level of hardness, then combines these QR models as one joint model for inference. Experimental results on two datasets show that our framework improves the overall performance compared to the baselines.
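As a purely illustrative sketch of the discrepancy idea, the snippet below buckets questions by the token overlap between a question and its rewrite; the overlap measure and the thresholds are our assumptions, not necessarily the heuristic used in the paper.

```python
def rewriting_hardness(question, rewrite, easy_thresh=0.8, medium_thresh=0.5):
    """Toy discrepancy heuristic: the lower the token overlap between a question
    and its rewrite, the harder the rewriting is assumed to be. The measure and
    thresholds are illustrative assumptions."""
    q_tokens = set(question.lower().split())
    r_tokens = set(rewrite.lower().split())
    overlap = len(q_tokens & r_tokens) / max(len(r_tokens), 1)
    if overlap >= easy_thresh:
        return "easy"
    if overlap >= medium_thresh:
        return "medium"
    return "hard"

print(rewriting_hardness("What about his brother?",
                         "What about Albert Einstein's brother?"))  # -> "medium"
```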
Multi-Source Test-Time Adaptation as Dueling Bandits for Extractive Question Answering
Ye, Hai, Xie, Qizhe, Ng, Hwee Tou
In this work, we study multi-source test-time model adaptation from user feedback, where K distinct models are established for adaptation. To allow efficient adaptation, we cast the problem as a stochastic decision-making process, aiming to determine the best adapted model after adaptation. We discuss two frameworks: multi-armed bandit learning and multi-armed dueling bandits. Compared to multi-armed bandit learning, the dueling framework allows pairwise collaboration among the K models, which we address with Co-UCB, a novel method proposed in this work. Experiments on six datasets of extractive question answering (QA) show that the dueling framework using Co-UCB is more effective than other strong baselines for our studied problem.
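For intuition, here is a generic dueling-bandit loop with UCB-style pair selection and simulated preference feedback; it sketches the general framework, not the paper's Co-UCB algorithm, and user_prefers is a stand-in for real user feedback.

```python
import math
import random

K = 4                                   # number of source models (arms)
wins = [[0] * K for _ in range(K)]      # wins[i][j]: times model i's answer was preferred over j's

def optimistic_winrate(i, j, t, c=1.0):
    n = wins[i][j] + wins[j][i]
    if n == 0:
        return float("inf")             # compare every pair at least once
    return wins[i][j] / n + c * math.sqrt(math.log(t + 1) / n)

def user_prefers(i, j):                 # placeholder for real pairwise user feedback
    return random.random() < 0.5

for t in range(1, 200):
    # pick the pair of models with the largest optimistic win-rate estimate
    i, j = max(((a, b) for a in range(K) for b in range(K) if a != b),
               key=lambda pair: optimistic_winrate(pair[0], pair[1], t))
    if user_prefers(i, j):
        wins[i][j] += 1
    else:
        wins[j][i] += 1
```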
Test-Time Adaptation with Perturbation Consistency Learning
Su, Yi, Ji, Yixin, Li, Juntao, Ye, Hai, Zhang, Min
Currently, pre-trained language models (PLMs) do not cope well with distribution shift, so models trained on the training set often fail in real test scenarios. To address this problem, test-time adaptation (TTA) shows great potential: it updates model parameters to suit the test data at test time. Existing TTA methods rely on well-designed auxiliary tasks or self-training strategies based on pseudo-labels. However, these methods do not achieve a good trade-off between performance gains and computational costs. To gain insight into this dilemma, we take two representative TTA methods, i.e., Tent and OIL, for exploration and find that stable prediction is the key to achieving a good balance. Accordingly, in this paper, we propose perturbation consistency learning (PCL), a simple test-time adaptation method that encourages the model to make stable predictions for samples with distribution shifts. Extensive experiments on adversarial robustness and cross-lingual transfer demonstrate that our method achieves higher or comparable performance with less inference time than previous state-of-the-art TTA methods over strong PLM backbones.
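A minimal sketch of a perturbation-consistency objective, assuming a HuggingFace-style classification model (one that accepts inputs_embeds and attention_mask and returns logits); the Gaussian embedding noise and the KL term below are illustrative assumptions, not necessarily PCL's exact formulation.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, input_ids, attention_mask, noise_std=1e-3):
    """Penalize divergence between predictions on clean inputs and on inputs
    whose embeddings are perturbed with small Gaussian noise (illustrative)."""
    embeds = model.get_input_embeddings()(input_ids)
    clean_logits = model(inputs_embeds=embeds,
                         attention_mask=attention_mask).logits
    noisy_embeds = embeds + noise_std * torch.randn_like(embeds)
    noisy_logits = model(inputs_embeds=noisy_embeds,
                         attention_mask=attention_mask).logits
    return F.kl_div(F.log_softmax(noisy_logits, dim=-1),
                    F.softmax(clean_logits, dim=-1),
                    reduction="batchmean")
```

At test time, such a loss could be minimized for a few gradient steps on each incoming batch, which is the general pattern that TTA methods follow.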
Robust Question Answering against Distribution Shifts with Test-Time Adaptation: An Empirical Study
Ye, Hai, Ding, Yuyang, Li, Juntao, Ng, Hwee Tou
A deployed question answering (QA) model can easily fail when the test data has a distribution shift compared to the training data. Robustness tuning (RT) methods have been widely studied to enhance model robustness against distribution shifts before model deployment. However, can we improve a model after deployment? To answer this question, we evaluate test-time adaptation (TTA) to improve a model after deployment. We first introduce COLDQA, a unified evaluation benchmark for robust QA against text corruption and changes in language and domain. We then evaluate previous TTA methods on COLDQA and compare them to RT methods. We also propose a novel TTA method called online imitation learning (OIL). Through extensive experiments, we find that TTA is comparable to RT methods, and applying TTA after RT can significantly boost the performance on COLDQA. Our proposed OIL makes TTA more robust to variation in hyper-parameters and to test distributions that change over time.
Jointly Extracting Relations with Class Ties via Effective Deep Ranking
Ye, Hai, Chao, Wenhan, Luo, Zhunchen, Li, Zhoujun
Connections between relations in relation extraction, which we call class ties, are common. In the distantly supervised scenario, one entity tuple may have multiple relation facts. Exploiting class ties between the relations of one entity tuple is therefore promising for distantly supervised relation extraction. However, previous models either ignore this property or do not model it effectively. In this work, to effectively leverage class ties, we propose to perform joint relation extraction with a unified model that integrates a convolutional neural network (CNN) with a general pairwise ranking framework, in which three novel ranking loss functions are introduced. Additionally, an effective method is presented to relieve the severe class imbalance problem caused by the NR (not relation) class during model training. Experiments on a widely used dataset show that leveraging class ties enhances extraction and demonstrate the effectiveness of our model in learning class ties. Our model significantly outperforms the baselines, achieving state-of-the-art performance.
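To make the pairwise ranking idea concrete, here is a generic margin-based hinge loss over relation scores for a single entity tuple; it is an illustrative sketch, not necessarily one of the three loss functions introduced in the paper.

```python
import torch
import torch.nn.functional as F

def pairwise_ranking_loss(scores, positive_relations, margin=1.0):
    """Generic sketch: every relation held by the entity tuple should score at
    least `margin` higher than every relation it does not hold.
    `scores` is a 1-D tensor of per-relation scores (e.g., from a CNN encoder)."""
    pos_mask = torch.zeros_like(scores, dtype=torch.bool)
    pos_mask[positive_relations] = True
    pos, neg = scores[pos_mask], scores[~pos_mask]
    # hinge over every (positive, negative) relation pair
    return F.relu(margin - pos.unsqueeze(1) + neg.unsqueeze(0)).mean()

loss = pairwise_ranking_loss(torch.randn(53), positive_relations=[2, 7])
```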