AITopics

2502.09609

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Generation (0.60)

arXiv.org Artificial Intelligence, Feb-4-2025

Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search

Shen, Maohao, Zeng, Guangtao, Qi, Zhenting, Hong, Zhang-Wei, Chen, Zhenfang, Lu, Wei, Wornell, Gregory, Das, Subhro, Cox, David, Gan, Chuang

Large language models (LLMs) have demonstrated remarkable reasoning capabilities across diverse domains. Recent studies have shown that increasing test-time computation enhances LLMs' reasoning capabilities. This typically involves extensive sampling at inference time guided by an external LLM verifier, resulting in a two-player system. Despite external guidance, the effectiveness of this system demonstrates the potential of a single LLM to tackle complex tasks. Thus, we pose a new research problem: Can we internalize the searching capabilities to fundamentally enhance the reasoning abilities of a single LLM? This work explores an orthogonal direction focusing on post-training LLMs for autoregressive searching (i.e., an extended reasoning process ...

Large language models (LLMs) have demonstrated remarkable performance across a wide range of reasoning tasks, including mathematical problems (Cobbe et al., 2021; Hendrycks et al., 2021a), programming (Chen et al., 2021; Zhuo et al., 2024) and logical reasoning (Han et al., 2024; Liu et al., 2020). One of the key techniques enabling these strong reasoning capabilities is Chain-of-Thought (CoT) prompting (Wei et al., 2022), which allows LLMs to address complex tasks by generating a series of intermediate reasoning steps. As a result, many early efforts focus on finetuning LLMs using large-scale, high-quality CoT reasoning chains, either through human annotation (Hendrycks et al., 2021a; Yue et al., 2024) or by distilling synthetic data from more advanced models (Yu et al., 2024; Toshniwal et al., 2024a; Ding et al., 2024). However, human annotation is extremely labor intensive, and distillation often limits the model's reasoning capabilities to a certain level.

large language model, machine learning, trajectory

2502.02508

Country:

Africa (0.68)
Asia (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine (1.00)
Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Machine Learning, Feb-19-2024

Thermometer: Towards Universal Calibration for Large Language Models

Shen, Maohao, Das, Subhro, Greenewald, Kristjan, Sattigeri, Prasanna, Wornell, Gregory, Ghosh, Soumya

We consider the issue of calibration in large language models (LLMs). Recent studies have found that common interventions such as instruction tuning often result in poorly calibrated LLMs. Although calibration is well-explored in traditional applications, calibrating LLMs is uniquely challenging. These challenges stem as much from the severe computational requirements of LLMs as from their versatility, which allows them to be applied to diverse tasks. Addressing these challenges, we propose THERMOMETER, a calibration approach tailored to LLMs. THERMOMETER learns an auxiliary model, given data from multiple tasks, for calibrating an LLM. It is computationally efficient, preserves the accuracy of the LLM, and produces better-calibrated responses for new tasks. Extensive empirical evaluations across various benchmarks demonstrate the effectiveness of the proposed method.
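To make the notion of calibration concrete, here is a minimal sketch that is not taken from the paper. It assumes plain temperature scaling (a classic calibration technique; the abstract does not specify THERMOMETER's mechanism) and uses expected calibration error (ECE) as the metric, which is also an assumption. The function names `softmax` and `expected_calibration_error` are illustrative.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Softmax over the last axis after dividing the logits by a temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


def expected_calibration_error(probs, labels, num_bins=10):
    """Weighted average gap between accuracy and confidence over confidence bins."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidences = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    correct = (predictions == labels).astype(float)
    # Map each confidence to a bin index; confidence 1.0 goes into the last bin.
    bin_ids = np.minimum((confidences * num_bins).astype(int), num_bins - 1)
    ece = 0.0
    for b in range(num_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece


# Hypothetical logits for three examples and two classes.
logits = np.array([[4.0, 0.0], [3.0, 1.0], [0.5, 2.5]])
sharp = softmax(logits, temperature=1.0)
soft = softmax(logits, temperature=3.0)
```

Dividing the logits by a positive temperature changes the confidence but never the predicted class, which is one simple way a calibration step can leave accuracy untouched.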

large language model, machine learning, natural language

2403.08819

Country:

Asia (0.28)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, May-31-2023

On Balancing Bias and Variance in Unsupervised Multi-Source-Free Domain Adaptation

Shen, Maohao, Bu, Yuheng, Wornell, Gregory

Due to privacy, storage, and other constraints, there is a growing need for unsupervised domain adaptation techniques in machine learning that do not require access to the data used to train a collection of source models. Existing methods for multi-source-free domain adaptation (MSFDA) typically train a target model using pseudo-labeled data produced by the source models, and focus on improving the pseudo-labeling techniques or proposing new training objectives. Instead, we aim to analyze the fundamental limits of MSFDA. In particular, we develop an information-theoretic bound on the generalization error of the resulting target model, which illustrates an inherent bias-variance trade-off. We then provide insights on how to balance this trade-off from three perspectives, including domain aggregation, selective pseudo-labeling, and joint feature alignment, which leads to the design of novel algorithms. Experiments on multiple datasets validate our theoretical analysis and demonstrate the state-of-the-art performance of the proposed algorithm, especially on some of the most challenging datasets, including Office-Home and DomainNet.
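As a rough illustration of two of the ideas named in the abstract (domain aggregation and selective pseudo-labeling), the sketch below averages the class probabilities of several source models with fixed weights and keeps only confident pseudo-labels. It is not the paper's algorithm: the weighting scheme, the confidence threshold, and the function name `selective_pseudo_labels` are all assumptions made for illustration.

```python
import numpy as np


def selective_pseudo_labels(source_probs, weights, threshold=0.8):
    """Aggregate source-model predictions and keep confident pseudo-labels.

    source_probs: array of shape (num_sources, num_samples, num_classes).
    weights: one non-negative weight per source model (hypothetical choice).
    Returns the kept pseudo-labels and the indices of the kept samples.
    """
    source_probs = np.asarray(source_probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    # Weighted average over the source axis -> (num_samples, num_classes).
    aggregated = np.tensordot(weights, source_probs, axes=1)
    confidence = aggregated.max(axis=1)
    labels = aggregated.argmax(axis=1)
    keep = confidence >= threshold
    return labels[keep], np.flatnonzero(keep)


# Two hypothetical source models scoring two unlabeled target samples.
probs = [
    [[0.90, 0.10], [0.50, 0.50]],
    [[0.95, 0.05], [0.40, 0.60]],
]
kept_labels, kept_indices = selective_pseudo_labels(probs, weights=[1.0, 1.0])
```

Raising the threshold keeps fewer but cleaner pseudo-labels, which is one informal way to picture the bias-variance trade-off the abstract refers to.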

adaptation, artificial intelligence, machine learning

2202.00796

Country:

North America > United States > Massachusetts (0.14)
North America > United States > Hawaii (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

arXiv.org Artificial Intelligence, Apr-30-2023

Reliable Gradient-free and Likelihood-free Prompt Tuning

Shen, Maohao, Ghosh, Soumya, Sattigeri, Prasanna, Das, Subhro, Bu, Yuheng, Wornell, Gregory

Due to privacy or commercial constraints, large pre-trained language models (PLMs) are often offered as black-box APIs. Fine-tuning such models for downstream tasks is challenging because one can neither access the model's internal representations nor propagate gradients through it. This paper addresses these challenges by developing techniques for adapting PLMs with only API access. Building on recent work on soft prompt tuning, we develop methods to tune the soft prompts without requiring gradient computation. Further, we develop extensions that, in addition to not requiring gradients, also do not need to access any internal representation of the PLM beyond the input embeddings. Moreover, instead of learning a single prompt, our methods learn a distribution over prompts, allowing us to quantify predictive uncertainty. Ours is the first work to consider uncertainty in prompts when only having API access to the PLM. Finally, through extensive experiments, we carefully vet the proposed methods and find them competitive with (and sometimes even improving on) gradient-based approaches with full access to the PLM.

artificial intelligence, machine learning, natural language

2305.00593

Country: North America > United States (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Apr-27-2023

On the Generalization Error of Meta Learning for the Gibbs Algorithm

Bu, Yuheng, Tetali, Harsha Vardhan, Aminian, Gholamali, Rodrigues, Miguel, Wornell, Gregory

We analyze the generalization ability of joint-training meta learning algorithms via the Gibbs algorithm. Our exact characterization of the expected meta generalization error for the meta Gibbs algorithm is based on symmetrized KL information, which measures the dependence between all meta-training datasets and the output parameters, including task-specific and meta parameters. Additionally, we derive an exact characterization of the meta generalization error for the super-task Gibbs algorithm, in terms of conditional symmetrized KL information within the super-sample and super-task frameworks introduced in Steinke and Zakynthinou (2020) and Hellstrom and Durisi (2022), respectively. Our results also enable us to provide novel distribution-free generalization error upper bounds for these Gibbs algorithms applicable to meta learning.

algorithm, artificial intelligence, machine learning

2304.14332

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Dec-14-2022

Post-hoc Uncertainty Learning using a Dirichlet Meta-Model

Shen, Maohao, Bu, Yuheng, Sattigeri, Prasanna, Ghosh, Soumya, Das, Subhro, Wornell, Gregory

It is known that neural networks have the problem of being over-confident when directly using the output label distribution to generate uncertainty measures. Existing methods mainly resolve this issue by retraining the entire model to impose the uncertainty quantification capability so that the learned model can achieve desired performance in accuracy and uncertainty prediction simultaneously. However, training the model from scratch is computationally expensive and may not be feasible in many situations. In this work, we consider a more practical post-hoc uncertainty learning setting, where a well-trained base model is given, and we focus on the uncertainty quantification task at the second stage of training. We propose a novel Bayesian meta-model to augment pre-trained models with better uncertainty quantification abilities, which is effective and computationally efficient. Our proposed method requires no additional training data and is flexible enough to quantify different uncertainties and easily adapt to different application settings, including out-of-domain data detection, misclassification detection, and trustworthy transfer learning. We demonstrate our proposed meta-model approach's flexibility and superior empirical performance on these applications over multiple representative image classification benchmarks.

artificial intelligence, deep learning, machine learning

2212.07359

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.64)

Industry: Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Machine Learning, Nov-2-2021

Characterizing and Understanding the Generalization Error of Transfer Learning with Gibbs Algorithm

Bu, Yuheng, Aminian, Gholamali, Toni, Laura, Rodrigues, Miguel, Wornell, Gregory

We provide an information-theoretic analysis of the generalization ability of Gibbs-based transfer learning algorithms by focusing on two popular transfer learning approaches, $\alpha$-weighted-ERM and two-stage-ERM. Our key result is an exact characterization of the generalization behaviour using the conditional symmetrized KL information between the output hypothesis and the target training samples given the source samples. Our results can also be applied to provide novel distribution-free generalization error upper bounds on these two aforementioned Gibbs algorithms. Our approach is versatile, as it also characterizes the generalization errors and excess risks of these two Gibbs algorithms in the asymptotic regime, where they converge to the $\alpha$-weighted-ERM and two-stage-ERM, respectively. Based on our theoretical results, we show that the benefits of transfer learning can be viewed as a bias-variance trade-off, with the bias induced by the source distribution and the variance induced by the lack of target samples. We believe this viewpoint can guide the choice of transfer learning algorithms in practice.

algorithm, artificial intelligence, machine learning

2111.01635

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)

arXiv.org Machine Learning, Jul-28-2021

Characterizing the Generalization Error of Gibbs Algorithm with Symmetrized KL information

Aminian, Gholamali, Bu, Yuheng, Toni, Laura, Rodrigues, Miguel R. D., Wornell, Gregory

Bounding the generalization error of a supervised learning algorithm is one of the most important problems in learning theory, and various approaches have been developed. However, existing bounds are often loose and lack guarantees. As a result, they may fail to characterize the exact generalization ability of a learning algorithm. Our main contribution is an exact characterization of the expected generalization error of the well-known Gibbs algorithm in terms of symmetrized KL information between the input training samples and the output hypothesis. Such a result can be applied to tighten existing expected generalization error bounds. Our analysis provides more insight into the fundamental role the symmetrized KL information plays in controlling the generalization error of the Gibbs algorithm.
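The exact characterization can be checked numerically on a toy problem. The sketch below is not the authors' code. It assumes the usual form of the Gibbs algorithm, a posterior proportional to prior(w) * exp(-gamma * empirical_risk(w, s)), and takes symmetrized KL information to be mutual information plus lautum information; under those assumptions the expected generalization error should equal the symmetrized KL information divided by gamma. The loss table, data distribution, and function name are hypothetical.

```python
import itertools

import numpy as np


def gibbs_gen_and_skl(loss, mu, prior, n, gamma):
    """Return (expected generalization error, symmetrized KL information / gamma).

    loss[w, z]: loss of hypothesis w on data point z (finite toy problem).
    mu[z]: data distribution; prior[w]: prior over hypotheses.
    n: number of i.i.d. training samples; gamma: inverse temperature.
    """
    loss = np.asarray(loss, dtype=float)
    mu = np.asarray(mu, dtype=float)
    prior = np.asarray(prior, dtype=float)
    # Enumerate every possible training set of size n.
    datasets = list(itertools.product(range(loss.shape[1]), repeat=n))
    p_s = np.array([np.prod(mu[list(s)]) for s in datasets])
    emp_risk = np.array([loss[:, list(s)].mean(axis=1) for s in datasets])
    # Gibbs posterior P(w | s), one row per training set.
    unnorm = prior * np.exp(-gamma * emp_risk)
    posterior = unnorm / unnorm.sum(axis=1, keepdims=True)
    joint = p_s[:, None] * posterior
    p_w = joint.sum(axis=0)
    product = p_s[:, None] * p_w[None, :]
    pop_risk = loss @ mu
    gen = (joint * (pop_risk[None, :] - emp_risk)).sum()
    mutual_info = (joint * np.log(joint / product)).sum()
    lautum_info = (product * np.log(product / joint)).sum()
    return gen, (mutual_info + lautum_info) / gamma


# Hypothetical toy problem: three hypotheses, binary data, three samples.
toy_loss = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
gen, skl_over_gamma = gibbs_gen_and_skl(
    toy_loss, mu=[0.7, 0.3], prior=[1 / 3, 1 / 3, 1 / 3], n=3, gamma=2.0
)
```

On this toy problem the two returned numbers agree up to floating-point error, whereas a generic upper bound would only guarantee an inequality.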

artificial intelligence, evolutionary algorithm, generalization error

2107.13656

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.35)

arXiv.org Machine Learning, Dec-30-2020

A Maximal Correlation Approach to Imposing Fairness in Machine Learning

Lee, Joshua, Bu, Yuheng, Sattigeri, Prasanna, Panda, Rameswar, Wornell, Gregory, Karlinsky, Leonid, Feris, Rogerio

As machine learning algorithms grow in popularity and diversify to many industries, ethical and legal concerns regarding their fairness have become increasingly relevant. We explore the problem of algorithmic fairness, taking an information-theoretic view. The maximal correlation framework is introduced for expressing fairness constraints and is shown to yield regularizers that enforce independence and separation-based fairness criteria. These regularizers admit optimization algorithms for both discrete and continuous variables that are more computationally efficient than existing algorithms. We show that these algorithms provide smooth performance-fairness tradeoff curves and perform competitively with state-of-the-art methods on both discrete datasets (COMPAS, Adult) and continuous datasets (Communities and Crimes).
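For discrete variables, the (Hirschfeld-Gebelein-Renyi) maximal correlation has a standard closed form: it is the second-largest singular value of the matrix with entries P(x, y) / sqrt(P(x) P(y)). The sketch below computes it from a joint probability table. It only illustrates the quantity itself, not the paper's regularizers or training procedure, and the function name is illustrative.

```python
import numpy as np


def hgr_maximal_correlation(joint):
    """Maximal correlation of two discrete variables from their joint table."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    p_x = joint.sum(axis=1)
    p_y = joint.sum(axis=0)
    # Drop outcomes with zero probability to avoid division by zero.
    joint = joint[p_x > 0][:, p_y > 0]
    p_x = p_x[p_x > 0]
    p_y = p_y[p_y > 0]
    scaled = joint / np.sqrt(np.outer(p_x, p_y))
    singular_values = np.linalg.svd(scaled, compute_uv=False)
    # The largest singular value is always 1; the next one is the correlation.
    return float(singular_values[1]) if len(singular_values) > 1 else 0.0


# Hypothetical joint tables of a prediction (rows) and a protected attribute (columns).
independent = np.outer([0.3, 0.7], [0.4, 0.6])
fully_dependent = np.diag([0.5, 0.5])
rho_independent = hgr_maximal_correlation(independent)
rho_dependent = hgr_maximal_correlation(fully_dependent)
```

A value near 0 means the prediction carries no information about the protected attribute (the independence criterion), while a value of 1 means one variable determines a non-trivial function of the other.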

employment law, fairness, labor law

2012.15259

Country: North America > United States (0.68)

Genre: Research Report (0.84)

Industry:

Law (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)