
Collaborating Authors: Ton, Jean-Francois


Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have made substantial strides in structured tasks through Reinforcement Learning (RL), demonstrating proficiency in mathematical reasoning and code generation. However, applying RL in broader domains like chatbots and content generation -- through the process known as Reinforcement Learning from Human Feedback (RLHF) -- presents unique challenges. Reward models in RLHF are critical, acting as proxies that evaluate the alignment of LLM outputs with human intent. Despite advancements, the development of reward models is hindered by computationally heavy training, costly evaluation, and, consequently, poor reproducibility. We advocate for using embedding-based inputs in reward model research as an accelerated solution to these challenges. By leveraging embeddings for reward modeling, we can enhance reproducibility, reduce computational demands on hardware, improve training stability, and significantly reduce training and evaluation costs, thereby facilitating fair and efficient comparisons in this active research area. We then present a case study reproducing existing reward model ensemble research using embedding-based reward models, and we discuss future avenues for research, aiming to contribute to safer and more effective LLM deployments.
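
Below is a minimal sketch of the embedding-based setup the abstract advocates: a small reward head trained on precomputed (frozen) LLM embeddings with a standard Bradley-Terry pairwise loss, so experiments run quickly without a GPU. The embedding dimension, architecture, and training loop are illustrative assumptions, not the paper's reference implementation.

```python
# Embedding-based reward modeling sketch: an MLP head on frozen embeddings
# trained with a Bradley-Terry pairwise loss. Shapes and hyperparameters are
# illustrative assumptions.
import torch
import torch.nn as nn

class EmbeddingRewardModel(nn.Module):
    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, emb: torch.Tensor) -> torch.Tensor:
        return self.net(emb).squeeze(-1)  # scalar reward per embedding

def bt_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry negative log-likelihood of preferring "chosen" over "rejected".
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# Toy data: precomputed embeddings for chosen/rejected responses (dim=768 assumed).
emb_chosen, emb_rejected = torch.randn(64, 768), torch.randn(64, 768)
model = EmbeddingRewardModel(dim=768)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = bt_loss(model(emb_chosen), model(emb_rejected))
    loss.backward()
    opt.step()
```

Because the backbone is never touched, a full sweep of reward-model variants can be rerun deterministically from the same cached embeddings, which is the reproducibility argument made above.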


Understanding Chain-of-Thought in LLMs through Information Theory

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, from complex reasoning to code generation [Chowdhery et al., 2024, OpenAI et al., 2024, Bubeck et al., 2023, Anil et al., 2023]. Many of these advances can be attributed to Chain-of-Thought (CoT) reasoning [Wei et al., 2024, Nye et al., 2021, Li et al., 2024], which involves breaking down complex problems into a series of intermediate steps, mirroring human-like reasoning processes. The success of CoT reasoning, particularly in domains such as mathematics, logic, and multi-step decision-making, has led researchers and developers to incorporate CoT-like features directly into model training, e.g. the FLAN family of models [Chung et al., 2022, Wei et al., 2022]. This paper introduces a new formal framework for analyzing CoT in LLMs. We provide a rigorous method, grounded in information theory, to evaluate the quality of each step in a model's reasoning process, offering insights beyond simple accuracy metrics and identifying areas for improvement.
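
As a toy illustration of step-level, information-theoretic evaluation, the sketch below scores each reasoning step by how much it reduces uncertainty (conditional entropy) about the final answer. The discrete toy chains and the plug-in entropy estimator are assumptions made for clarity; the paper's actual framework is not reproduced here.

```python
# Toy step-quality score: information gain of each CoT step about the answer Y,
# computed with plug-in (empirical) entropies on discrete toy data.
from collections import Counter, defaultdict
import math

def entropy(counts):
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def conditional_entropy(pairs):
    # pairs: list of (context, y); H(Y | context) under the empirical distribution.
    by_ctx = defaultdict(Counter)
    for ctx, y in pairs:
        by_ctx[ctx][y] += 1
    n = len(pairs)
    return sum(sum(cnt.values()) / n * entropy(cnt) for cnt in by_ctx.values())

# Toy chains: each sample is (step1, step2, final_answer).
chains = [("a", "x", 1), ("a", "x", 1), ("a", "y", 0), ("b", "y", 0), ("b", "x", 1)]
h0 = entropy(Counter(y for _, _, y in chains))                     # H(Y)
h1 = conditional_entropy([((s1,), y) for s1, _, y in chains])      # H(Y | step1)
h2 = conditional_entropy([((s1, s2), y) for s1, s2, y in chains])  # H(Y | step1, step2)
print("info gain of step 1:", h0 - h1)
print("info gain of step 2:", h1 - h2)
```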


Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives

arXiv.org Artificial Intelligence

The Bradley-Terry (BT) model is a common and successful practice in reward modeling for Large Language Model (LLM) alignment. However, it remains unclear why this model -- originally developed for multi-player stochastic game matching -- can be adopted to convert pairwise response comparisons into reward values and make predictions, especially given that only a limited number of prompt-response pairs are sparsely compared with others. In this paper, we first revisit the foundations of using BT models in reward modeling and establish the convergence rate of BT reward models based on deep neural networks using embeddings, providing a theoretical foundation for their use. Although the BT model is theoretically sound, we argue that it is not a necessary choice from the perspective of downstream optimization, because a reward model only needs to preserve correct ranking predictions through a monotonic transformation of the true reward. We highlight the critical concept of order consistency in reward modeling and demonstrate that the BT model possesses this property. Consequently, we propose a simple and straightforward upper-bound algorithm, compatible with off-the-shelf binary classifiers, as an alternative order-consistent reward modeling objective. To offer practical insights, we empirically evaluate the performance of these different reward modeling approaches across more than 12,000 experimental setups, using 6 base LLMs, 2 datasets, and diverse annotation designs that vary in quantity, quality, and pairing choices in preference annotations.
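
The order-consistency argument can be made concrete with a short sketch: instead of the pairwise Bradley-Terry objective, an off-the-shelf binary classifier is trained to separate chosen from rejected responses, and its logit is used as the reward, which preserves rankings up to a monotonic transformation. Embedding shapes and the toy data below are illustrative assumptions.

```python
# Order-consistent alternative to the BT loss: a binary classifier whose logit
# serves as the reward. Toy embeddings stand in for precomputed LLM features.
import torch
import torch.nn as nn

dim = 768
emb_chosen, emb_rejected = torch.randn(256, dim), torch.randn(256, dim)

classifier = nn.Linear(dim, 1)              # logit(x) plays the role of the reward
opt = torch.optim.Adam(classifier.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

x = torch.cat([emb_chosen, emb_rejected])
y = torch.cat([torch.ones(256), torch.zeros(256)])  # 1 = chosen, 0 = rejected
for _ in range(200):
    opt.zero_grad()
    loss = bce(classifier(x).squeeze(-1), y)
    loss.backward()
    opt.step()

# Rank candidate responses by the classifier logit (an order-consistent reward).
scores = classifier(emb_chosen[:5]).squeeze(-1)
print(scores.argsort(descending=True))
```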


ACC-Debate: An Actor-Critic Approach to Multi-Agent Debate

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated a remarkable ability to serve as general-purpose tools for various language-based tasks. Recent works have demonstrated that the efficacy of such models can be improved through iterative dialog between multiple models, frequently referred to as multi-agent debate (MAD). While debate shows promise as a means of improving model efficacy, most works in this area treat debate as an emergent behavior rather than a learned one. In doing so, current debate frameworks rely on collaborative behavior having been sufficiently trained into off-the-shelf models. To address this limitation, we propose ACC-Debate, an Actor-Critic based learning framework that produces a two-agent team specialized in debate. We demonstrate that ACC-Debate outperforms SotA debate techniques on a wide array of benchmarks. Recently, large language models (LLMs) have rapidly become a cornerstone in various applications, redefining how we process and generate language at scale (Thirunavukarasu et al., 2023; Hadi et al., 2023; Jiang et al., 2024). Their ability to handle diverse tasks, from translation (Zhu et al., 2024; Otter et al., 2020) to answering complex questions (Zhang et al., 2024; Hao et al., 2024; Havrilla et al., 2024), has attracted the attention of both industry and academia. However, despite these advancements, LLMs still exhibit notable weaknesses, particularly when it comes to answering factual questions and reasoning (Tonmoy et al., 2024; Rawte et al., 2023; Huang et al., 2023). To address these limitations, several techniques have been proposed, such as Chain-of-Thought (CoT) prompting (Wei et al., 2022), self-reflection (Ji et al., 2023; Shinn et al., 2023), and multi-agent debate (MAD) (Du et al., 2023), to name a few. These approaches aim to improve the reasoning abilities of LLMs by guiding them toward more accurate answers through structured thinking or discourse. However, the majority of these techniques do not involve training the model specifically for these tasks but instead rely on zero-shot or few-shot capabilities.
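
For orientation, the sketch below shows only the structural shell of a two-agent debate round (actor proposes, critic critiques, actor revises); the `generate` function is a hypothetical stand-in for any LLM call. ACC-Debate's contribution is training such a team with an actor-critic objective, which this sketch does not attempt to reproduce.

```python
# Structural sketch of a two-agent debate loop. `generate` is a placeholder for
# an actual LLM call; no training is performed here.
def generate(role: str, prompt: str) -> str:
    # Placeholder: swap in a real LLM API or local model call.
    return f"[{role} response to: {prompt[:40]}...]"

def debate(question: str, rounds: int = 2) -> str:
    answer = generate("actor", question)
    for _ in range(rounds):
        critique = generate(
            "critic", f"Question: {question}\nAnswer: {answer}\nCritique this answer."
        )
        answer = generate(
            "actor", f"Question: {question}\nCritique: {critique}\nRevise your answer."
        )
    return answer

print(debate("What is 17 * 24?"))
```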


Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation

arXiv.org Artificial Intelligence

We introduce Adversarial Policy Optimization (AdvPO), a novel solution to the pervasive issue of reward over-optimization in Reinforcement Learning from Human Feedback (RLHF) for Large Language Models (LLMs). Over-optimization occurs when a reward model serves as an imperfect proxy for human preference, and RL-driven policy optimization erroneously exploits reward inaccuracies. In this paper, we begin by introducing a lightweight way to quantify uncertainties in rewards, relying solely on the last layer embeddings of the reward model, without the need for computationally expensive reward ensembles. AdvPO then addresses a distributionally robust optimization problem centred around the confidence interval of the reward model's predictions for policy improvement. Through comprehensive experiments on the Anthropic HH and TL;DR summarization datasets, we illustrate the efficacy of AdvPO in mitigating the over-optimization issue, resulting in enhanced performance as assessed through human-assisted evaluation.
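
The lightweight uncertainty idea can be sketched as follows: treat the reward head as a regularized linear model on last-layer embeddings and use the induced confidence width as the per-sample uncertainty, with no ensemble required. The regularizer, shapes, and synthetic data below are illustrative assumptions.

```python
# Last-layer-embedding reward uncertainty sketch: the width of a ridge-style
# confidence interval grows for embeddings far from the training data.
import numpy as np

rng = np.random.default_rng(0)
Phi = rng.normal(size=(1000, 128))          # last-layer embeddings seen in training
lam = 1.0
A_inv = np.linalg.inv(Phi.T @ Phi + lam * np.eye(Phi.shape[1]))

def reward_uncertainty(phi_new: np.ndarray) -> float:
    # Width of the confidence interval around the predicted reward for phi_new.
    return float(np.sqrt(phi_new @ A_inv @ phi_new))

phi_in = Phi[0]                             # embedding similar to training data
phi_out = rng.normal(size=128) * 5.0        # embedding far from the training data
print(reward_uncertainty(phi_in), "<", reward_uncertainty(phi_out))
```

AdvPO then optimizes the policy pessimistically within this confidence region; that distributionally robust step is not shown here.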


Dataset Fairness: Achievable Fairness on Your Data With Utility Guarantees

arXiv.org Machine Learning

One of the key challenges in fairness for machine learning is to train models that minimize the disparity across various sensitive groups such as race or gender [Caton and Haas, 2020, Ustun et al., 2019, Celis et al., 2019]. This often comes at the cost of reduced model accuracy, a phenomenon termed the accuracy-fairness trade-off in the literature [Valdivia et al., 2021, Martinez et al., 2020]. This trade-off can differ significantly across datasets in practice, depending on factors such as dataset biases and imbalances [Agarwal et al., 2018, Bendekgey and Sudderth, 2021, Celis et al., 2021]. To demonstrate how these trade-offs are inherently dataset-dependent, let us consider a simple example involving two distinct crime datasets. Dataset A has records from a community where crime rates are uniformly distributed across all racial groups, whereas Dataset B comes from a community where historical factors have resulted in a disproportionate crime rate among a specific racial group. Intuitively, training models that are racially agnostic is more challenging for Dataset B, due to the unequal distribution of crime rates across racial groups, and will result in a greater loss in model accuracy compared to Dataset A. This example underscores that setting a uniform fairness requirement across diverse datasets (such as requiring the fairness violation metric to be below 10% for both datasets), while also adhering to essential accuracy benchmarks, is impractical.
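
The dataset-dependence of the trade-off is easy to see numerically. The hedged sketch below applies the same simple classifier to two synthetic datasets mirroring the Dataset A / Dataset B example and reports accuracy alongside the demographic parity gap; the data-generating choices are assumptions, not data from the paper.

```python
# Same classifier, two synthetic datasets: a uniform fairness threshold is much
# harder to meet when base rates differ across groups.
import numpy as np

rng = np.random.default_rng(0)

def demographic_parity_gap(y_pred, group):
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return max(rates) - min(rates)

def make_dataset(base_rate_by_group):
    group = rng.integers(0, 2, size=10_000)
    y = rng.random(10_000) < np.asarray(base_rate_by_group)[group]
    x = y + rng.normal(0, 1.0, size=10_000)       # one noisy but informative feature
    return x, y.astype(int), group

for name, rates in [("Dataset A (uniform rates)", [0.3, 0.3]),
                    ("Dataset B (disparate rates)", [0.1, 0.5])]:
    x, y, g = make_dataset(rates)
    y_pred = (x > 0.5).astype(int)                # identical simple classifier for both
    acc = (y_pred == y).mean()
    print(name, "accuracy:", round(acc, 3),
          "parity gap:", round(demographic_parity_gap(y_pred, g), 3))
```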


Measuring and Reducing LLM Hallucination without Gold-Standard Answers via Expertise-Weighting

arXiv.org Artificial Intelligence

LLMs are known to provide factually inaccurate information that appears confident, i.e., hallucinations. This is currently a major obstacle to the reliability and trustworthiness of LLMs [13, 34, 21]. An essential step towards solving this problem is measuring hallucinations. However, this is challenging from a data perspective, as existing metrics presume that benchmark datasets possess gold-standard answers, i.e., "best" or "correct" answers written by humans [16]. The requirement of such answers imposes two fundamental limitations on hallucination measurement: 1) hiring human annotators to produce gold-standard answers is costly in both time and money [4, 43, 38]; 2) gold-standard answers are prone to natural human errors [7, 6, 49]. To this end, we take a step forward and propose a framework that measures LLM hallucinations without requiring gold-standard answers. Our framework is partially inspired by the literature on learning with noisy labels [23, 18, 19], where there are no ground-truth labels for verifying the quality of imperfect human annotations [43, 38, 20], detecting annotation errors [48, 26, 47], or training models robustly [42, 3, 17, 36, 39]. Our basic idea is simple: leverage off-the-shelf, high-quality LLMs to generate answers that serve as a proxy for gold-standard answers. The primary challenge in such an approach is how to properly weigh the expertise of each LLM for a given question x, without a priori knowledge of the true (i.e., gold-standard) answer.
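
A hedged sketch of the aggregation step: a candidate answer is scored by its expertise-weighted agreement with several reference LLMs, so low agreement signals likely hallucination. The string-similarity function and the fixed expertise weights below are illustrative placeholders; the paper's contribution is estimating those weights without gold-standard answers, which is not reproduced here.

```python
# Expertise-weighted agreement with reference LLM answers as a proxy
# hallucination score. Similarity metric and weights are placeholders.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def hallucination_score(candidate: str, reference_answers: list[str],
                        expertise_weights: list[float]) -> float:
    # Lower agreement with expertise-weighted references => higher hallucination score.
    total = sum(expertise_weights)
    agreement = sum(w * similarity(candidate, ref)
                    for w, ref in zip(expertise_weights, reference_answers)) / total
    return 1.0 - agreement

refs = ["Paris is the capital of France.",
        "The capital of France is Paris.",
        "France's capital city is Lyon."]          # a weaker / wrong reference
weights = [0.5, 0.4, 0.1]                          # assumed per-question expertise
print(hallucination_score("The capital of France is Paris.", refs, weights))
print(hallucination_score("The capital of France is Lyon.", refs, weights))
```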


Invariant Learning via Probability of Sufficient and Necessary Causes

arXiv.org Artificial Intelligence

Out-of-distribution (OOD) generalization is indispensable for learning models in the wild, where the testing distribution is typically unknown and differs from the training distribution. Recent methods derived from causality have shown great potential in achieving OOD generalization. However, existing methods mainly focus on the invariance property of causes, while largely overlooking the sufficiency and necessity conditions. Namely, a necessary but insufficient cause (feature) is invariant to distribution shift, yet it may not achieve the required accuracy. By contrast, a sufficient yet unnecessary cause (feature) tends to fit specific data well but carries a risk of failing to adapt to a new domain. To capture the information of sufficient and necessary causes, we employ a classical concept, the probability of sufficient and necessary causes (PNS), which indicates the probability that a feature is both a necessary and sufficient cause. To associate PNS with OOD generalization, we propose the PNS risk and formulate an algorithm to learn representations with high PNS values. We theoretically analyze and prove the generalizability of the PNS risk. Experiments on both synthetic and real-world benchmarks demonstrate the effectiveness of the proposed method. The implementation details can be found in the GitHub repository: https://github.com/ymy4323460/CaSN.
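
For context, the standard counterfactual definition of PNS and its classical bounds (following Pearl's notation, which may differ from the paper's) can be stated as follows.

```latex
% PNS: probability that X=x is both sufficient and necessary for Y=y.
\[
  \mathrm{PNS} \;=\; P\bigl(Y_{x}=y,\; Y_{x'}=y'\bigr).
\]
% Under exogeneity, PNS is bounded by observable quantities:
\[
  \max\{0,\; P(y\mid x)-P(y\mid x')\}
  \;\le\; \mathrm{PNS} \;\le\;
  \min\{P(y\mid x),\; P(y'\mid x')\},
\]
% and under the additional assumption of monotonicity it is point-identified:
\[
  \mathrm{PNS} \;=\; P(y\mid x)-P(y\mid x').
\]
```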


Marginal Density Ratio for Off-Policy Evaluation in Contextual Bandits

arXiv.org Machine Learning

Off-Policy Evaluation (OPE) in contextual bandits is crucial for assessing new policies using existing data without costly experimentation. However, current OPE methods, such as Inverse Probability Weighting (IPW) and Doubly Robust (DR) estimators, suffer from high variance, particularly in cases of low overlap between target and behavior policies or large action and context spaces. In this paper, we introduce a new OPE estimator for contextual bandits, the Marginal Ratio (MR) estimator, which focuses on the shift in the marginal distribution of outcomes Y instead of the policies themselves. Through rigorous theoretical analysis, we demonstrate the benefits of the MR estimator compared to conventional methods like IPW and DR in terms of variance reduction. Additionally, we establish a connection between the MR estimator and the state-of-the-art Marginalized Inverse Propensity Score (MIPS) estimator, proving that MR achieves lower variance among a generalized family of MIPS estimators. We further illustrate the utility of the MR estimator in causal inference settings, where it exhibits enhanced performance in estimating Average Treatment Effects (ATE). Our experiments on synthetic and real-world datasets corroborate our theoretical findings and highlight the practical advantages of the MR estimator in OPE for contextual bandits.
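
A hedged synthetic comparison of the two weighting schemes is sketched below: IPW weights each sample by the policy ratio pi(A|X)/beta(A|X), while the marginal-ratio idea instead weights by an estimate of E[pi(A|X)/beta(A|X) | Y], obtained here by crudely binning the outcome. The discrete data-generating process is an assumption for illustration, not the paper's experimental setup.

```python
# IPW vs. a marginal-ratio style estimator on a synthetic contextual bandit.
# The outcome weight w(y) is estimated by averaging the policy ratio within
# outcome bins, a simple stand-in for regressing the ratio on Y.
import numpy as np

rng = np.random.default_rng(0)
n, n_actions = 5000, 5
X = rng.integers(0, 3, size=n)                         # discrete contexts

beta = np.full((3, n_actions), 1.0 / n_actions)        # behavior: uniform policy
pi = np.full((3, n_actions), 0.05)                     # target: mostly action 0
pi[:, 0] = 1.0 - 0.05 * (n_actions - 1)

A = np.array([rng.choice(n_actions, p=beta[x]) for x in X])
Y = (A == X).astype(float) + rng.normal(0, 0.1, n)     # noisy reward

rho = pi[X, A] / beta[X, A]                            # per-sample policy ratio
V_ipw = np.mean(rho * Y)

# MR-style: bin Y and use the within-bin mean of rho as the outcome weight.
bins = np.digitize(Y, np.linspace(Y.min(), Y.max(), 20))
w = np.array([rho[bins == b].mean() for b in bins])
V_mr = np.mean(w * Y)

print("IPW estimate:", round(V_ipw, 3), " MR estimate:", round(V_mr, 3))
```

Because the MR weight varies only with the outcome, it averages out the policy-ratio variability that inflates IPW's variance while targeting the same quantity.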


Deep Concept Removal

arXiv.org Artificial Intelligence

We address the problem of concept removal in deep neural networks, aiming to learn representations that do not encode certain specified concepts (e.g., gender). We propose a novel method based on adversarial linear classifiers trained on a concept dataset, which helps remove the targeted attribute while maintaining model performance. Our approach, Deep Concept Removal, incorporates adversarial probing classifiers at various layers of the network, effectively addressing concept entanglement and improving out-of-distribution generalization. We also introduce a technique based on implicit gradients to tackle the challenges associated with adversarial training using linear classifiers. We evaluate the ability to remove a concept on a set of popular distributionally robust optimization (DRO) benchmarks with spurious correlations, as well as on out-of-distribution (OOD) generalization tasks.
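
A simplified sketch of the adversarial probing idea: a linear classifier tries to predict the concept from an intermediate representation, and the encoder is penalized for the probe's success. The gradient-reversal shortcut below is a simplification for illustration; the paper instead differentiates through the trained linear probe via implicit gradients, and all data and dimensions here are assumed.

```python
# Adversarial linear probe for concept removal (gradient-reversal simplification).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad):
        return -grad                      # reverse gradients flowing to the encoder

encoder = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 32))
task_head = nn.Linear(32, 2)              # main prediction task
probe = nn.Linear(32, 2)                  # adversarial concept probe

params = list(encoder.parameters()) + list(task_head.parameters()) + list(probe.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
ce = nn.CrossEntropyLoss()

x = torch.randn(128, 20)
y_task = torch.randint(0, 2, (128,))
y_concept = torch.randint(0, 2, (128,))   # e.g., a sensitive attribute to remove

for _ in range(200):
    opt.zero_grad()
    z = encoder(x)
    # Encoder minimizes task loss while (via reversed gradients) maximizing probe loss.
    loss = ce(task_head(z), y_task) + ce(probe(GradReverse.apply(z)), y_concept)
    loss.backward()
    opt.step()
```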