Bhattacharyya, Arnab
Approximating the Total Variation Distance between Gaussians
Bhattacharyya, Arnab, Feng, Weiming, Srivastava, Piyush
The total variation distance is a metric of central importance in statistics and probability theory. Somewhat surprisingly, however, the question of computing it algorithmically appears not to have been systematically studied until very recently. In this paper, we contribute to this line of work by studying the question in the important special case of multivariate Gaussians. More formally, we consider the problem of approximating the total variation distance between two multivariate Gaussians to within an $\epsilon$-relative error. Previous works achieved a fixed constant relative-error approximation via closed-form formulas. In this work, we give algorithms that, given any two $n$-dimensional Gaussians $D_1, D_2$ and any error bound $\epsilon > 0$, approximate the total variation distance $D := d_{TV}(D_1, D_2)$ to $\epsilon$-relative accuracy in $\text{poly}(n, \frac{1}{\epsilon}, \log \frac{1}{D})$ operations. The main technical tool in our work is a reduction that lets us extend the recent progress on computing the TV-distance between discrete random variables to our continuous setting.
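As a point of reference for the quantity being approximated, here is a brute-force numerical baseline for the univariate case. This is not the paper's algorithm (which handles $n$-dimensional Gaussians with relative-error guarantees); it simply evaluates $d_{TV} = \frac{1}{2}\int |p_1(x) - p_2(x)|\,dx$ on a grid.

```python
import numpy as np
from scipy.stats import norm

def tv_distance_1d(mu1, sigma1, mu2, sigma2, grid_size=200_000):
    """Brute-force d_TV between two univariate Gaussians:
    d_TV = (1/2) * integral |p1(x) - p2(x)| dx, approximated on a grid."""
    lo = min(mu1 - 10 * sigma1, mu2 - 10 * sigma2)
    hi = max(mu1 + 10 * sigma1, mu2 + 10 * sigma2)
    xs = np.linspace(lo, hi, grid_size)
    p1 = norm.pdf(xs, mu1, sigma1)
    p2 = norm.pdf(xs, mu2, sigma2)
    return 0.5 * np.trapz(np.abs(p1 - p2), xs)

# Sanity check: equal-variance Gaussians admit a closed form,
# d_TV = 2 * Phi(|mu1 - mu2| / (2 * sigma)) - 1.
print(tv_distance_1d(0.0, 1.0, 1.0, 1.0))   # approx 0.38292
print(2 * norm.cdf(0.5) - 1)                # same value
```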
Learning multivariate Gaussians with imperfect advice
Bhattacharyya, Arnab, Choo, Davin, John, Philips George, Gouleakis, Themis
The problem of approximating an underlying distribution from its observed samples is a fundamental scientific problem. The distribution learning problem has been studied for more than a century in statistics, and it is the underlying engine for much of applied machine learning. The emphasis in modern applications is on high-dimensional distributions, with the goal being to understand when one can escape the curse of dimensionality. The survey by [Dia16] gives an excellent overview of classical and modern techniques for distribution learning, especially when there is some underlying structure to be exploited. In this work, we investigate how to go beyond worst-case sample complexities for learning distributions. We consider the situation where the algorithm is also given the aid of possibly imperfect advice regarding the input distribution. We position our study in the context of algorithms with predictions, where the usual problem input is supplemented by "predictions" or "advice" (potentially drawn from modern machine learning models) and the algorithm's goal is to incorporate the advice in a way that improves performance if the advice is of high quality, while if the advice is inaccurate, performance should not degrade below that of the no-advice setting. Most previous work in this setting is in the context of online algorithms, e.g. for the ski-rental problem [GP19, WLW20, ADJ
Efficient, Low-Regret, Online Reinforcement Learning for Linear MDPs
John, Philips George, Bhattacharyya, Arnab, Maniu, Silviu, Myrisiotis, Dimitrios, Wu, Zhenan
Reinforcement learning algorithms are usually stated without theoretical guarantees regarding their performance. Recently, Jin, Yang, Wang, and Jordan (COLT 2020) showed a polynomial-time reinforcement learning algorithm (namely, LSVI-UCB) for the setting of linear Markov decision processes, and provided theoretical guarantees regarding its running time and regret. In real-world scenarios, however, the space usage of this algorithm can be prohibitive due to the linear regression step that it performs. We propose and analyze two modifications of LSVI-UCB, which alternate periods of learning and not-learning, to reduce space and time usage while maintaining sublinear regret. We show experimentally, on synthetic data and real-world benchmarks, that our algorithms achieve low space usage and running time while not significantly sacrificing regret.
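To make the bottleneck concrete, here is a minimal sketch of the ridge-regression step at the core of LSVI-UCB, which the vanilla algorithm re-solves every episode over its entire history. All names and dimensions are illustrative, and the `should_learn` schedule is a hypothetical stand-in for (not a statement of) the paper's alternating learn/not-learn periods.

```python
import numpy as np

d, lam = 8, 1.0                  # illustrative feature dimension and ridge parameter
features, targets = [], []       # all past phi(s, a) and regression targets;
                                 # this history grows linearly with the episode count,
                                 # which is the space cost being reduced

def refit_weights():
    """Solve w = (lam*I + X^T X)^{-1} X^T y over the entire history."""
    X, y = np.array(features), np.array(targets)
    return np.linalg.solve(lam * np.eye(d) + X.T @ X, X.T @ y)

def should_learn(episode):
    """Hypothetical sparse schedule: refit only at power-of-two episodes,
    standing in for the paper's alternating learning / not-learning periods."""
    return episode & (episode - 1) == 0
```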
Probably approximately correct high-dimensional causal effect estimation given a valid adjustment set
Choo, Davin, Squires, Chandler, Bhattacharyya, Arnab, Sontag, David
Accurate estimates of causal effects play a key role in decision-making across applications such as healthcare, economics, and operations. In the absence of randomized experiments, a common approach to estimating causal effects uses covariate adjustment. In this paper, we study covariate adjustment for discrete distributions from the PAC learning perspective, assuming knowledge of a valid adjustment set $\mathbf{Z}$, which might be high-dimensional. Our first main result PAC-bounds the estimation error of covariate adjustment by a term that is exponential in the size of the adjustment set; it is known that such a dependency is unavoidable even if one only aims to minimize the mean squared error. Motivated by this result, we introduce the notion of an $\varepsilon$-Markov blanket, give bounds on the misspecification error of using such a set for covariate adjustment, and provide an algorithm for $\varepsilon$-Markov blanket discovery; our second main result upper bounds the sample complexity of this algorithm. Furthermore, we provide a misspecification error bound and a constraint-based algorithm that allow us to go beyond $\varepsilon$-Markov blankets to even smaller adjustment sets. Our third main result upper bounds the sample complexity of this algorithm, and our final result combines the first three into an overall PAC bound. Altogether, our results highlight that one does not need to perfectly recover causal structure in order to ensure accurate estimates of causal effects.
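For intuition, here is a minimal plug-in sketch of the covariate-adjustment (backdoor) estimator for discrete data, $\widehat{P}(Y = y \mid do(X = x)) = \sum_z \widehat{P}(z)\,\widehat{P}(y \mid x, z)$. Function and variable names are illustrative; the exponential dependence on the size of $\mathbf{Z}$ shows up as the number of strata $z$ that must each receive enough samples.

```python
from collections import Counter

def adjustment_estimate(samples, x_val, y_val):
    """Plug-in covariate-adjustment estimate of P(Y=y | do(X=x)):
        sum_z Phat(Z=z) * Phat(Y=y | X=x, Z=z).
    `samples` is a list of (x, y, z) tuples, with z a hashable tuple
    over the adjustment set Z. Purely illustrative."""
    n = len(samples)
    pz = Counter(z for _, _, z in samples)
    joint_xz = Counter((x, z) for x, _, z in samples)
    joint_xyz = Counter((x, y, z) for x, y, z in samples)
    total = 0.0
    for z, cz in pz.items():
        if joint_xz[(x_val, z)] == 0:
            continue  # empty stratum: conditional undefined; a careful
                      # estimator must handle this case explicitly
        cond = joint_xyz[(x_val, y_val, z)] / joint_xz[(x_val, z)]
        total += (cz / n) * cond
    return total
```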
Online bipartite matching with imperfect advice
Choo, Davin, Gouleakis, Themis, Ling, Chun Kai, Bhattacharyya, Arnab
We study the problem of online unweighted bipartite matching with $n$ offline vertices and $n$ online vertices, where one wishes to be competitive against the optimal offline algorithm. While the classic RANKING algorithm of Karp et al. [1990] provably attains a competitive ratio of $1-1/e > 1/2$, we show that no learning-augmented method can be both 1-consistent and strictly better than $1/2$-robust under the adversarial arrival model. Meanwhile, under the random arrival model, we show how one can use methods from distribution testing to design an algorithm that takes in external advice about the online vertices and provably achieves a competitive ratio interpolating between any ratio attainable by advice-free methods and the optimal ratio of 1, depending on the advice quality.
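For context, here is a short sketch of the classic RANKING algorithm referenced above (the standard textbook version, not this paper's advice-augmented method): fix a uniformly random priority order over the offline vertices, and match each arriving online vertex to its highest-priority free neighbor.

```python
import random

def ranking(n, neighbors):
    """RANKING (Karp, Vazirani, Vazirani 1990). n offline vertices 0..n-1;
    `neighbors` is, per arriving online vertex, the set of adjacent offline
    vertices. Attains a 1 - 1/e competitive ratio in expectation."""
    order = list(range(n))
    random.shuffle(order)                        # random priority over offline side
    priority = {v: r for r, v in enumerate(order)}
    matched, matching = set(), []
    for online_v, nbrs in enumerate(neighbors):
        free = [u for u in nbrs if u not in matched]
        if free:
            u = min(free, key=priority.__getitem__)  # highest-priority free neighbor
            matched.add(u)
            matching.append((u, online_v))
    return matching

# Example: 3 offline vertices; online vertices arrive with these neighborhoods.
print(ranking(3, [{0, 1}, {1, 2}, {0}]))
```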
Distribution Learning Meets Graph Structure Sampling
Bhattacharyya, Arnab, Gayen, Sutanu, John, Philips George, Sen, Sayantan, Vinodchandran, N. V.
This work establishes a novel link between the problem of PAC-learning high-dimensional graphical models and the task of (efficient) counting and sampling of graph structures, using an online learning framework. We observe that if we apply the exponentially weighted average (EWA) or randomized weighted majority (RWM) forecaster on a sequence of samples from a distribution P using the log loss function, the average regret incurred by the forecaster's predictions can be used to bound the expected KL divergence between P and the predictions. Known regret bounds for EWA and RWM then yield new sample complexity bounds for learning Bayes nets. Moreover, these algorithms can be made computationally efficient for several interesting classes of Bayes nets. Specifically, we give a new sample-optimal, polynomial-time learning algorithm with respect to trees of unknown structure, and the first algorithm with polynomial sample and time complexity for learning with respect to Bayes nets over a given chordal skeleton.
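A minimal sketch of the EWA forecaster under log loss over a finite candidate class, illustrating the averaged predictor that the regret-to-KL argument applies to. This is illustrative only: it fixes a learning rate $\eta$ and assumes all candidate probabilities are strictly positive.

```python
import numpy as np

def ewa_log_loss(candidates, samples, eta=1.0):
    """Exponentially weighted average forecaster under log loss.
    `candidates`: (k, m) array; row j is a candidate distribution over m outcomes
                  (assumed strictly positive).
    `samples`: observed outcomes in {0, ..., m-1}, drawn from an unknown P.
    Returns the average of the per-round mixture predictions; when the
    average regret is small, this predictor is close to P in KL divergence."""
    k = len(candidates)
    log_w = np.zeros(k)                              # log-domain weights, for stability
    preds = []
    for x in samples:
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        preds.append(w @ candidates)                 # mixture prediction this round
        log_w -= eta * (-np.log(candidates[:, x]))   # penalize each candidate's log loss
    return np.mean(preds, axis=0)
```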
Outlier Robust Multivariate Polynomial Regression
Arora, Vipul, Bhattacharyya, Arnab, Boban, Mathews, Guruswami, Venkatesan, Kelman, Esty
We study the problem of robust multivariate polynomial regression: let $p\colon\mathbb{R}^n\to\mathbb{R}$ be an unknown $n$-variate polynomial of degree at most $d$ in each variable. We are given as input a set of random samples $(\mathbf{x}_i,y_i) \in [-1,1]^n \times \mathbb{R}$ that are noisy versions of $(\mathbf{x}_i,p(\mathbf{x}_i))$. More precisely, each $\mathbf{x}_i$ is sampled independently from some distribution $\chi$ on $[-1,1]^n$, and for each $i$ independently, $y_i$ is arbitrary (i.e., an outlier) with probability at most $\rho < 1/2$, and otherwise satisfies $|y_i-p(\mathbf{x}_i)|\leq\sigma$. The goal is to output a polynomial $\hat{p}$, of degree at most $d$ in each variable, within an $\ell_\infty$-distance of at most $O(\sigma)$ from $p$. Kane, Karmalkar, and Price [FOCS'17] solved this problem for $n=1$. We generalize their results to the $n$-variate setting, showing an algorithm that achieves a sample complexity of $O_n(d^n\log d)$, where the hidden constant depends on $n$, if $\chi$ is the $n$-dimensional Chebyshev distribution. The sample complexity is $O_n(d^{2n}\log d)$, if the samples are drawn from the uniform distribution instead. The approximation error is guaranteed to be at most $O(\sigma)$, and the run-time depends on $\log(1/\sigma)$. In the setting where each $\mathbf{x}_i$ and $y_i$ are known up to $N$ bits of precision, the run-time's dependence on $N$ is linear. We also show that our sample complexities are optimal in terms of $d^n$. Furthermore, we show that it is possible to have the run-time be independent of $1/\sigma$, at the cost of a higher sample complexity.
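As a simplified illustration of the median-based idea in the univariate case, here is a heuristic sketch: since the outlier fraction is $\rho < 1/2$, the median response within a small bin of $[-1, 1]$ resists corruption, and one can then fit a polynomial to the bin medians. This is not the paper's multivariate algorithm, and it simplifies Kane, Karmalkar, and Price [FOCS'17], who use Chebyshev partitions and an $\ell_\infty$ regression step.

```python
import numpy as np
from numpy.polynomial import Polynomial

def robust_poly_fit_1d(xs, ys, d, n_bins=None):
    """Heuristic robust fit for the n = 1 case. `xs`, `ys`: numpy arrays of
    samples in [-1, 1] x R with at most a rho < 1/2 fraction of outliers.
    Split [-1, 1] into bins, take the median response per bin (medians
    tolerate a minority of outliers), then least-squares fit a degree-d
    polynomial to the bin medians."""
    n_bins = n_bins or 4 * (d + 1)
    edges = np.linspace(-1.0, 1.0, n_bins + 1)
    centers, medians = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (xs >= lo) & (xs < hi)
        if mask.any():
            centers.append((lo + hi) / 2)
            medians.append(np.median(ys[mask]))
    return Polynomial.fit(centers, medians, d)
```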
Optimal estimation of Gaussian (poly)trees
Wang, Yuhao, Gao, Ming, Tai, Wai Ming, Aragam, Bryon, Bhattacharyya, Arnab
We develop optimal algorithms for learning undirected Gaussian trees and directed Gaussian polytrees from data. We consider both problems of distribution learning (i.e. in KL distance) and structure learning (i.e. exact recovery). The first approach is based on the Chow-Liu algorithm, and learns an optimal tree-structured distribution efficiently. The second approach is a modification of the PC algorithm for polytrees that uses partial correlation as a conditional independence tester for constraint-based structure learning. We derive explicit finite-sample guarantees for both approaches, and show that both approaches are optimal by deriving matching lower bounds. Additionally, we conduct numerical experiments to compare the performance of various algorithms, providing further insights and empirical evidence.
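For jointly Gaussian variables, the pairwise mutual information is $I(X_i; X_j) = -\tfrac{1}{2}\log(1 - \rho_{ij}^2)$, which is monotone in $|\rho_{ij}|$, so the Chow-Liu tree is a maximum-weight spanning tree under edge weights $|\rho_{ij}|$. Below is a minimal plug-in sketch of this first approach; the paper's contribution is the finite-sample optimality analysis, not this routine itself.

```python
import numpy as np

def gaussian_chow_liu(samples):
    """Chow-Liu tree for multivariate Gaussian data: maximum-weight spanning
    tree (via Prim's algorithm) on |empirical correlation| edge weights.
    `samples`: (num_samples, p) array. Returns a list of tree edges."""
    corr = np.corrcoef(samples, rowvar=False)
    p = corr.shape[0]
    in_tree = {0}
    edges = []
    while len(in_tree) < p:
        best = max(((i, j) for i in in_tree for j in range(p) if j not in in_tree),
                   key=lambda e: abs(corr[e]))
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Example: data from a chain X0 -> X1 -> X2 should be recovered as a path.
rng = np.random.default_rng(0)
x0 = rng.normal(size=5000)
x1 = 0.8 * x0 + rng.normal(size=5000)
x2 = 0.8 * x1 + rng.normal(size=5000)
print(gaussian_chow_liu(np.column_stack([x0, x1, x2])))
```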
Learning bounded-degree polytrees with known skeleton
Choo, Davin, Yang, Joy Qiping, Bhattacharyya, Arnab, Canonne, Clément L.
We establish finite-sample guarantees for efficient proper learning of bounded-degree polytrees, a rich class of high-dimensional probability distributions and a subclass of Bayesian networks, a widely-studied type of graphical model. Recently, Bhattacharyya et al. (2021) obtained finite-sample guarantees for recovering tree-structured Bayesian networks, i.e., 1-polytrees. We extend their results by providing an efficient algorithm which learns $d$-polytrees in polynomial time and sample complexity for any bounded $d$ when the underlying undirected graph (skeleton) is known. We complement our algorithm with an information-theoretic sample complexity lower bound, showing that the dependence on the dimension and target accuracy parameters is nearly tight.
Total Variation Distance Estimation Is as Easy as Probabilistic Inference
Bhattacharyya, Arnab, Gayen, Sutanu, Meel, Kuldeep S., Myrisiotis, Dimitrios, Pavan, A., Vinodchandran, N. V.
Machine learning and data science heavily rely on probability distributions that are widely used to capture dependencies among a large number of variables. Such high-dimensional distributions naturally appear in various domains including neuroscience [ROL02, CTY06], bioinformatics [BB01], text and image processing [Mur22], and causal inference [Pea09]. Substantial research has been devoted to developing models that represent high-dimensional probability distributions succinctly. One prevalent approach is through graphical models. In a graphical model, a graph describes the conditional dependencies among variables, and the probability distribution is factorized according to the adjacency relationships in the graph [KF09]. When the underlying graph is directed, the model is known as a Bayesian network or Bayes net. Two fundamental computational tasks on distributions are distance computation and probabilistic inference. In this work, we establish a novel connection between these two seemingly different computational tasks.
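To fix ideas, the total variation distance between distributions $P, Q$ over $\{0,1\}^n$ is $d_{TV}(P, Q) = \tfrac{1}{2}\sum_x |P(x) - Q(x)|$; computing it by enumeration takes time exponential in $n$, which is precisely why a reduction to probabilistic inference is valuable. Here is a toy brute-force sketch for tiny Bayes nets over binary variables, with illustrative data structures.

```python
from itertools import product

def bn_prob(assignment, cpts, parents):
    """Probability of a full 0/1 assignment under a Bayes net, given as
    conditional probability tables: cpts[v][parent_vals] = P(v = 1 | pa(v))."""
    p = 1.0
    for v, pa in parents.items():
        pv = cpts[v][tuple(assignment[u] for u in pa)]
        p *= pv if assignment[v] == 1 else 1.0 - pv
    return p

def tv_bruteforce(cpts1, cpts2, parents, n):
    """d_TV(P, Q) = (1/2) sum_x |P(x) - Q(x)| over all 2^n assignments.
    Feasible only for tiny n; efficient TV estimation needs more than this."""
    total = 0.0
    for bits in product([0, 1], repeat=n):
        x = dict(enumerate(bits))
        total += abs(bn_prob(x, cpts1, parents) - bn_prob(x, cpts2, parents))
    return total / 2

# Two-node net 0 -> 1 with slightly different CPTs; the answer is 0.05.
parents = {0: (), 1: (0,)}
cpts1 = {0: {(): 0.5}, 1: {(0,): 0.2, (1,): 0.9}}
cpts2 = {0: {(): 0.5}, 1: {(0,): 0.3, (1,): 0.9}}
print(tv_bruteforce(cpts1, cpts2, parents, n=2))
```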