AITopics

2412.02861

Country: Europe > Sweden (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence Oct-21-2024

Information-Theoretic Minimax Regret Bounds for Reinforcement Learning based on Duality

Bongole, Raghav, Gouverneur, Amaury, Rodríguez-Gálvez, Borja, Oechtering, Tobias J., Skoglund, Mikael

We study agents acting in an unknown environment where the agent's goal is to find a robust policy. We consider robust policies as policies that achieve high cumulative rewards for all possible environments. To this end, we consider agents minimizing the maximum regret over different environment parameters, leading to the study of minimax regret. This research focuses on deriving information-theoretic bounds for minimax regret in Markov Decision Processes (MDPs) with a finite time horizon. Building on concepts from supervised learning, such as minimum excess risk (MER) and minimax excess risk, we use recent bounds on the Bayesian regret to derive minimax regret bounds. Specifically, we establish minimax theorems and use bounds on the Bayesian regret to perform minimax regret analysis using these minimax theorems. Our contributions include defining a suitable minimax regret in the context of MDPs, finding information-theoretic bounds for it, and applying these bounds in various scenarios.
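The step from Bayesian regret bounds to minimax regret bounds can be sketched as follows. The notation is assumed for illustration and is not taken from the paper: $\pi$ is a policy, $\theta \in \Theta$ an environment parameter, $P$ a prior over $\Theta$, and $R_T(\pi, \theta)$ the regret over horizon $T$. The paper's exact definitions and the conditions under which its minimax theorems hold may differ.

```latex
\begin{align*}
\text{minimax regret:}\quad
  & \inf_{\pi} \sup_{\theta \in \Theta} R_T(\pi, \theta), \\
\text{Bayesian regret under prior } P\text{:}\quad
  & \inf_{\pi} \mathbb{E}_{\theta \sim P}\big[ R_T(\pi, \theta) \big], \\
\text{minimax theorem (when it holds):}\quad
  & \inf_{\pi} \sup_{\theta \in \Theta} R_T(\pi, \theta)
    = \sup_{P} \inf_{\pi} \mathbb{E}_{\theta \sim P}\big[ R_T(\pi, \theta) \big].
\end{align*}
```

Under such an identity, any upper bound on the Bayesian regret that holds uniformly over priors $P$ is also an upper bound on the minimax regret.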

artificial intelligence, machine learning, minimax regret, (14 more...)

2410.16013

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Machine Learning Jul-10-2024

A Coding-Theoretic Analysis of Hyperspherical Prototypical Learning Geometry

Lindström, Martin, Rodríguez-Gálvez, Borja, Thobaben, Ragnar, Skoglund, Mikael

Hyperspherical Prototypical Learning (HPL) is a supervised approach to representation learning that designs class prototypes on the unit hypersphere. The prototypes bias the representations towards class separation in a scale-invariant and known geometry. Previous approaches to HPL have either of the following shortcomings: (i) they follow an unprincipled optimisation procedure; or (ii) they are theoretically sound, but are constrained to only one possible latent dimension. In this paper, we address both shortcomings. To address (i), we present a principled optimisation procedure whose solution we show is optimal. To address (ii), we construct well-separated prototypes in a wide range of dimensions using linear block codes. Additionally, we give a full characterisation of the optimal prototype placement in terms of achievable and converse bounds, showing that our proposed methods are near-optimal.
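As a rough illustration of how a linear block code can yield separated points on the unit hypersphere, the sketch below maps each codeword of the [7,4] Hamming code to a unit vector by sending bit b to (1 - 2b)/sqrt(n). The choice of code, the antipodal mapping, and all function names are assumptions made for this example; the abstract does not say which codes or mapping the paper uses. Under this mapping, two codewords at Hamming distance d have cosine similarity 1 - 2d/n, so a larger minimum distance gives better-separated prototypes.

```python
import itertools
import math

# Systematic generator matrix of the [7,4] Hamming code (minimum distance 3).
# Chosen only for illustration; not taken from the paper.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def codewords(gen):
    """Enumerate all codewords of the binary linear code generated by gen."""
    k, n = len(gen), len(gen[0])
    words = []
    for msg in itertools.product([0, 1], repeat=k):
        word = [sum(msg[i] * gen[i][j] for i in range(k)) % 2 for j in range(n)]
        words.append(word)
    return words


def to_sphere(word):
    """Map bits to +-1/sqrt(n) so that the resulting vector has unit norm."""
    n = len(word)
    return [(1 - 2 * b) / math.sqrt(n) for b in word]


def max_pairwise_cosine(points):
    """Largest inner product between distinct unit vectors (smaller is better)."""
    return max(
        sum(x * y for x, y in zip(p, q))
        for p, q in itertools.combinations(points, 2)
    )


# 16 candidate class prototypes in dimension 7.
prototypes = [to_sphere(w) for w in codewords(G)]
worst_similarity = max_pairwise_cosine(prototypes)
```

Here the closest pair of prototypes corresponds to the closest pair of codewords, so `worst_similarity` is determined by the code's minimum distance.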

artificial intelligence, machine learning, natural language, (17 more...)

2407.07664

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Industry: Education > Curriculum > Subject-Specific Education (0.42)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Machine Learning Mar-25-2024

A note on generalization bounds for losses with finite moments

Rodríguez-Gálvez, Borja, Rivasplata, Omar, Thobaben, Ragnar, Skoglund, Mikael

This paper studies the truncation method from Alquier [1] to derive high-probability PAC-Bayes bounds for unbounded losses with heavy tails. Assuming that the $p$-th moment is bounded, the resulting bounds interpolate between a slow rate $1 / \sqrt{n}$ when $p=2$, and a fast rate $1 / n$ when $p \to \infty$ and the loss is essentially bounded. Moreover, the paper derives a high-probability PAC-Bayes bound for losses with a bounded variance. This bound has an exponentially better dependence on the confidence parameter and the dependency measure than previous bounds in the literature. Finally, the paper extends all results to guarantees in expectation and single-draw PAC-Bayes. In order to do so, it obtains analogues of the PAC-Bayes fast rate bound for bounded losses from [2] in these settings.

artificial intelligence, machine learning, pac-bayes, (18 more...)

2403.16681

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

arXiv.org Machine Learning Mar-5-2024

Chained Information-Theoretic bounds and Tight Regret Rate for Linear Bandit Problems

Gouverneur, Amaury, Rodríguez-Gálvez, Borja, Oechtering, Tobias J., Skoglund, Mikael

Bandit problems are a class of decision problems in which an agent interacts sequentially with an unknown environment by choosing actions and earning rewards in return. The goal of the agent is to maximize its expected cumulative reward, which is the expected sum of rewards that it will earn throughout its interaction with the environment. This necessitates a delicate balance between the exploration of different actions to gather information for potential future rewards, and the exploitation of known actions to receive immediate gains. The theoretical study of the performance of an algorithm in a bandit problem is done by analyzing the expected regret, which is defined as the difference between the cumulative reward of the algorithm and the hypothetical cumulative reward that an oracle would obtain by choosing the optimal action at each time step. An effective method for achieving small regret is the Thompson Sampling (TS) algorithm [3], which, despite its simplicity, has shown remarkable performance [4, 5, 6]. Studying the Thompson Sampling regret, [1] introduced the concept of information ratio, a statistic that captures the trade-off between the information gained by the algorithm about the environment and the immediate regret.
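To make the exploration and exploitation trade-off concrete, here is a minimal sketch of Thompson Sampling on a Bernoulli multi-armed bandit with independent Beta(1, 1) priors. This is the textbook Beta-Bernoulli setting rather than the linear bandit setting of the paper, and the function name, arm means, and horizon are hypothetical choices for illustration. The regret tracked here is the cumulative gap between the oracle's mean reward and the mean reward of the chosen arm.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson Sampling; return (regret, pulls per arm)."""
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1] * k  # Beta posterior parameter: 1 + observed successes
    beta = [1] * k   # Beta posterior parameter: 1 + observed failures
    pulls = [0] * k
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(horizon):
        # Sample one plausible mean per arm from its posterior (exploration),
        # then act greedily with respect to the samples (exploitation).
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        action = max(range(k), key=lambda a: samples[a])

        # Observe a Bernoulli reward and update the chosen arm's posterior.
        reward = 1 if rng.random() < true_means[action] else 0
        alpha[action] += reward
        beta[action] += 1 - reward
        pulls[action] += 1

        # Gap between the oracle's mean reward and the chosen arm's mean.
        regret += best_mean - true_means[action]

    return regret, pulls


# Hypothetical three-armed instance.
regret, pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
```

As the posteriors concentrate, the sampled means of suboptimal arms rarely exceed that of the best arm, so the per-step gap tends to shrink over time.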

artificial intelligence, bandit problem, machine learning, (17 more...)

2403.03361

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Data Science > Data Mining > Big Data (0.84)

arXiv.org Artificial Intelligence Dec-9-2023

The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning

Rodríguez-Gálvez, Borja, Blaas, Arno, Rodríguez, Pau, Goliński, Adam, Suau, Xavier, Ramapuram, Jason, Busbridge, Dan, Zappella, Luca

The mechanisms behind the success of multi-view self-supervised learning (MVSSL) are not yet fully understood. Contrastive MVSSL methods have been studied through the lens of InfoNCE, a lower bound of the Mutual Information (MI). However, the relation between other MVSSL methods and MI remains unclear. We consider a different lower bound on the MI consisting of an entropy and a reconstruction term (ER), and analyze the main MVSSL families through its lens. Through this ER bound, we show that clustering-based methods such as DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of distillation-based approaches such as BYOL and DINO, showing that they explicitly maximize the reconstruction term and implicitly encourage a stable entropy, and we confirm this empirically. We show that replacing the objectives of common MVSSL methods with this ER bound achieves competitive performance, while making them stable when training with smaller batch sizes or smaller exponential moving average (EMA) coefficients. Github repo: https://github.com/apple/ml-entropy-reconstruction.
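A lower bound of this entropy-plus-reconstruction form can be derived in one line. The notation below is assumed for illustration: $Z_1$ and $Z_2$ are the representations of two views, and $q(z_2 \mid z_1)$ is any variational reconstruction density. The paper's own statement of the ER bound may use different symbols or conditions.

```latex
\begin{align*}
I(Z_1; Z_2)
  &= H(Z_2) - H(Z_2 \mid Z_1) \\
  &= H(Z_2) + \mathbb{E}\big[\log p(Z_2 \mid Z_1)\big] \\
  &= H(Z_2) + \mathbb{E}\big[\log q(Z_2 \mid Z_1)\big]
     + \mathbb{E}_{Z_1}\Big[ D_{\mathrm{KL}}\big( p(\cdot \mid Z_1) \,\|\, q(\cdot \mid Z_1) \big) \Big] \\
  &\geq \underbrace{H(Z_2)}_{\text{entropy}}
     + \underbrace{\mathbb{E}\big[\log q(Z_2 \mid Z_1)\big]}_{\text{reconstruction}}.
\end{align*}
```

The inequality follows from the non-negativity of the Kullback-Leibler divergence, and it is tight when $q$ matches the true conditional $p(z_2 \mid z_1)$.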

artificial intelligence, machine learning, projection, (16 more...)

2307.10907

Country: North America > United States > Hawaii (0.14)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)

arXiv.org Machine Learning Nov-8-2023

More PAC-Bayes bounds: From bounded losses, to losses with general tail behaviors, to anytime-validity

Rodríguez-Gálvez, Borja, Thobaben, Ragnar, Skoglund, Mikael

In this paper, we present new high-probability PAC-Bayes bounds for different types of losses. Firstly, for losses with a bounded range, we recover a strengthened version of Catoni's bound that holds uniformly for all parameter values. This leads to new fast rate and mixed rate bounds that are interpretable and tighter than previous bounds in the literature. In particular, the fast rate bound is equivalent to the Seeger--Langford bound. Secondly, for losses with more general tail behaviors, we introduce two new parameter-free bounds: a PAC-Bayes Chernoff analogue when the loss' cumulant generating function is bounded, and a bound when the loss' second moment is bounded. These two bounds are obtained using a new technique based on a discretization of the space of possible events for the "in probability" parameter optimization problem. This technique is both simpler and more general than previous approaches optimizing over a grid on the parameters' space. Finally, we extend all previous results to anytime-valid bounds using a simple technique applicable to any existing bound.

artificial intelligence, machine learning, pac-bayes, (18 more...)

2306.12214

Country:

Europe > United Kingdom > England (0.14)
Oceania > Australia (0.14)
North America > United States (0.14)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

arXiv.org Artificial Intelligence Jul-13-2023

Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization

Haghifam, Mahdi, Rodríguez-Gálvez, Borja, Thobaben, Ragnar, Skoglund, Mikael, Roy, Daniel M., Dziugaite, Gintare Karolina

To date, no "information-theoretic" frameworks for reasoning about generalization error have been shown to establish minimax rates for gradient descent in the setting of stochastic convex optimization. In this work, we consider the prospect of establishing such rates via several existing information-theoretic frameworks: input-output mutual information bounds, conditional mutual information bounds and variants, PAC-Bayes bounds, and recent conditional variants thereof. We prove that none of these bounds are able to establish minimax rates. We then consider a common tactic employed in studying gradient methods, whereby the final iterate is corrupted by Gaussian noise, producing a noisy "surrogate" algorithm. We prove that minimax rates cannot be established via the analysis of such surrogates. Our results suggest that new ideas are required to analyze gradient descent using information-theoretic techniques.

algorithm, artificial intelligence, machine learning, (18 more...)

2212.13556

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England (0.14)
Europe > Spain (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.91)

arXiv.org Artificial Intelligence Apr-26-2023

Thompson Sampling Regret Bounds for Contextual Bandits with sub-Gaussian rewards

Gouverneur, Amaury, Rodríguez-Gálvez, Borja, Oechtering, Tobias J., Skoglund, Mikael

In this work, we study the performance of the Thompson Sampling algorithm for Contextual Bandit problems based on the framework introduced by Neu et al. and their concept of lifted information ratio. First, we prove a comprehensive bound on the Thompson Sampling expected cumulative regret that depends on the mutual information of the environment parameters and the history. Then, we introduce new bounds on the lifted information ratio that hold for sub-Gaussian rewards, thus generalizing the results from Neu et al., whose analysis requires binary rewards. Finally, we provide explicit regret bounds for the special cases of unstructured bounded contextual bandits, structured bounded contextual bandits with Laplace likelihood, structured Bernoulli bandits, and bounded linear contextual bandits.

bandit, data mining, machine learning, (20 more...)

2304.13593

Country: Europe > France (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.69)

arXiv.org Machine Learning Jan-22-2021

Tighter expected generalization error bounds via Wasserstein distance

Rodríguez-Gálvez, Borja, Bassi, Germán, Thobaben, Ragnar, Skoglund, Mikael

In this work, we introduce several expected generalization error bounds based on the Wasserstein distance. More precisely, we present full-dataset, single-letter, and random-subset bounds on both the standard setting and the randomized-subsample setting from Steinke and Zakynthinou [2020]. Moreover, we show that, when the loss function is bounded, these bounds recover from below (and thus are tighter than) current bounds based on the relative entropy and, for the standard setting, generate new, non-vacuous bounds also based on the relative entropy. Then, we show how similar bounds featuring the backward channel can be derived with the proposed proof techniques. Finally, we show how various new bounds based on different information measures (e.g., the lautum information or several $f$-divergences) can be derived from the presented bounds.

artificial intelligence, evolutionary algorithm, inequality, (12 more...)

2101.09315

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)