AITopics | Computational Learning Theory

Collaborating Authors

Computational Learning Theory

In computer science, computational learning theory (or just learning theory) is a subfield of Artificial Intelligence devoted to studying the design and analysis of machine learning algorithms (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

D'ya like DAGs? A Survey on Structure Learning and Causal Discovery

Vowels, Matthew J., Camgoz, Necati Cihan, Bowden, Richard

arXiv.org Machine LearningMar-4-2021

It is important for a broad range of applications, including policy making [136], medical imaging [30], advertisement [22], the development of medical treatments [189], the evaluation of evidence within legal frameworks [183, 218], social science [82, 96, 246], biology [235], and many others. It is also a burgeoning topic in machine learning and artificial intelligence [17, 66, 76, 144, 210, 247, 255], where it has been argued that a consideration for causality is crucial for reasoning about the world. In order to discover causal relations, and thereby gain causal understanding, one may perform interventions and manipulations as part of a randomized experiment. These experiments may not only allow researchers or agents to identify causal relationships, but also to estimate the magnitude of these relationships. Unfortunately, in many cases, it may not be possible to undertake such experiments due to prohibitive cost, ethical concerns, or impracticality.

arxiv, constraint, graph, (14 more...)

arXiv.org Machine Learning

2103.02582

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
(7 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.86)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.45)
Health & Medicine > Therapeutic Area > Immunology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

Parsimonious Inference

Duersch, Jed A., Catanach, Thomas A.

arXiv.org Machine LearningMar-2-2021

Bayesian inference provides a uniquely rigorous approach to obtain principled justification for uncertainty in predictions, yet it is difficult to articulate suitably general prior belief in the machine learning context, where computational architectures are pure abstractions subject to frequent modifications by practitioners attempting to improve results. Parsimonious inference is an information-theoretic formulation of inference over arbitrary architectures that formalizes Occam's Razor; we prefer simple and sufficient explanations. Our universal hyperprior assigns plausibility to prior descriptions, encoded as sequences of symbols, by expanding on the core relationships between program length, Kolmogorov complexity, and Solomonoff's algorithmic probability. We then cast learning as information minimization over our composite change in belief when an architecture is specified, training data are observed, and model parameters are inferred. By distinguishing model complexity from prediction information, our framework also quantifies the phenomenon of memorization. Although our theory is general, it is most critical when datasets are limited, e.g. small or skewed. We develop novel algorithms for polynomial regression and random forests that are suitable for such data, as demonstrated by our experiments. Our approaches combine efficient encodings with prudent sampling strategies to construct predictive ensembles without cross-validation, thus addressing a fundamental challenge in how to efficiently obtain predictions from data.

complexity, information, prediction, (17 more...)

arXiv.org Machine Learning

2103.02165

Country: North America > United States > California > Alameda County > Livermore (0.04)

Genre: Research Report > New Finding (0.67)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Add feedback

The Mathematics of Machine Learning

#artificialintelligenceFeb-23-2021, 22:01:36 GMT

In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I have observed that some actually lack the necessary mathematical intuition and framework to get useful results. This is the main reason I decided to write this blog post. Recently, there has been an upsurge in the availability of many easy-to-use machine and deep learning packages such as scikit-learn, Weka, Tensorflow, R-caret etc. Machine Learning theory is a field that intersects statistical, probabilistic, computer science and algorithmic aspects arising from learning iteratively from data and finding hidden insights which can be used to build intelligent applications. Despite the immense possibilities of Machine and Deep Learning, a thorough mathematical understanding of many of these techniques is necessary for a good grasp of the inner workings of the algorithms and getting good results. What Level of Maths Do You Need?

blog post, machine learning, mathematics

#artificialintelligence

Industry: Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Online Learning via Offline Greedy Algorithms: Applications in Market Design and Optimization

Niazadeh, Rad, Golrezaei, Negin, Wang, Joshua, Susan, Fransisca, Badanidiyuru, Ashwinkumar

arXiv.org Machine LearningFeb-18-2021

Motivated by online decision-making in time-varying combinatorial environments, we study the problem of transforming offline algorithms to their online counterparts. We focus on offline combinatorial problems that are amenable to a constant factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms to online ones using Blackwell approachability. We show that the resulting online algorithms have $O(\sqrt{T})$ (approximate) regret under the full information setting. We further introduce a bandit extension of Blackwell approachability that we call Bandit Blackwell approachability. We leverage this notion to transform greedy robust offline algorithms into a $O(T^{2/3})$ (approximate) regret in the bandit setting. Demonstrating the flexibility of our framework, we apply our offline-to-online transformation to several problems at the intersection of revenue management, market design, and online optimization, including product ranking optimization in online platforms, reserve price optimization in auctions, and submodular maximization. We show that our transformation, when applied to these applications, leads to new regret bounds or improves the current known bounds.

algorithm, blackwell sequential game, online learning, (14 more...)

arXiv.org Machine Learning

2102.1105

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Industry:

Education > Educational Setting > Online (0.66)
Information Technology > Services (0.45)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.66)

Add feedback

Smoothed Analysis with Adaptive Adversaries

Haghtalab, Nika, Roughgarden, Tim, Shetty, Abhishek

arXiv.org Machine LearningFeb-16-2021

We prove novel algorithmic guarantees for several online problems in the smoothed analysis model. In this model, at each time an adversary chooses an input distribution with density function bounded above by $\tfrac{1}{\sigma}$ times that of the uniform distribution; nature then samples an input from this distribution. Crucially, our results hold for {\em adaptive} adversaries that can choose an input distribution based on the decisions of the algorithm and the realizations of the inputs in the previous time steps. This paper presents a general technique for proving smoothed algorithmic guarantees against adaptive adversaries, in effect reducing the setting of adaptive adversaries to the simpler case of oblivious adversaries. We apply this technique to prove strong smoothed guarantees for three problems: -Online learning: We consider the online prediction problem, where instances are generated from an adaptive sequence of $\sigma$-smooth distributions and the hypothesis class has VC dimension $d$. We bound the regret by $\tilde{O}\big(\sqrt{T d\ln(1/\sigma)} + d\sqrt{\ln(T/\sigma)}\big)$. This answers open questions of [RST11,Hag18]. -Online discrepancy minimization: We consider the online Koml\'os problem, where the input is generated from an adaptive sequence of $\sigma$-smooth and isotropic distributions on the $\ell_2$ unit ball. We bound the $\ell_\infty$ norm of the discrepancy vector by $\tilde{O}\big(\ln^2\!\big( \frac{nT}{\sigma}\big) \big)$. -Dispersion in online optimization: We consider online optimization of piecewise Lipschitz functions where functions with $\ell$ discontinuities are chosen by a smoothed adaptive adversary and show that the resulting sequence is $\big( {\sigma}/{\sqrt{T\ell}}, \tilde O\big(\sqrt{T\ell} \big)\big)$-dispersed. This matches the parameters of [BDV18] for oblivious adversaries, up to log factors.

adversary, algorithm, sequence, (17 more...)

arXiv.org Machine Learning

2102.08446

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

Add feedback

Efficient Learning with Arbitrary Covariate Shift

Kalai, Adam, Kanade, Varun

arXiv.org Machine LearningFeb-15-2021

We give an efficient algorithm for learning a binary function in a given class C of bounded VC dimension, with training data distributed according to P and test data according to Q, where P and Q may be arbitrary distributions over X. This is the generic form of what is called covariate shift, which is impossible in general as arbitrary P and Q may not even overlap. However, recently guarantees were given in a model called PQ-learning (Goldwasser et al., 2020) where the learner has: (a) access to unlabeled test examples from Q (in addition to labeled samples from P, i.e., semi-supervised learning); and (b) the option to reject any example and abstain from classifying it (i.e., selective classification). The algorithm of Goldwasser et al. (2020) requires an (agnostic) noise tolerant learner for C. The present work gives a polynomial-time PQ-learning algorithm that uses an oracle to a "reliable" learner for C, where reliable learning (Kalai et al., 2012) is a model of learning with one-sided noise. Furthermore, our reduction is optimal in the sense that we show the equivalence of reliable and PQ learning.

algorithm, learner, learning, (15 more...)

arXiv.org Machine Learning

2102.07802

Country:

Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.68)

Add feedback

Sample-Optimal PAC Learning of Halfspaces with Malicious Noise

Shen, Jie

arXiv.org Machine LearningFeb-11-2021

We study efficient PAC learning of homogeneous halfspaces in $\mathbb{R}^d$ in the presence of malicious noise of Valiant~(1985). This is a challenging noise model and only until recently has near-optimal noise tolerance bound been established under the mild condition that the unlabeled data distribution is isotropic log-concave. However, it remains unsettled how to obtain the optimal sample complexity simultaneously. In this work, we present a new analysis for the algorithm of Awasthi et al.~(2017) and show that it essentially achieves the near-optimal sample complexity bound of $\tilde{O}(d)$, improving the best known result of $\tilde{O}(d^2)$. Our main ingredient is a novel incorporation of a Matrix Chernoff-type inequality to bound the spectrum of an empirical covariance matrix for well-behaved distributions, in conjunction with a careful exploration of the localization schemes of Awasthi et al.~(2017). We further extend the algorithm and analysis to the more general and stronger nasty noise model of Bshouty~et~al. (2002), showing that it is still possible to achieve near-optimal noise tolerance and sample complexity in polynomial time.

noise, probability 1, sample complexity, (14 more...)

arXiv.org Machine Learning

2102.06247

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Fairness-Aware Learning from Corrupted Data

Konstantinov, Nikola, Lampert, Christoph H.

arXiv.org Machine LearningFeb-11-2021

Addressing fairness concerns about machine learning models is a crucial step towards their long-term adoption in real-world automated systems. While many approaches have been developed for training fair models from data, little is known about the effects of data corruption on these methods. In this work we consider fairness-aware learning under arbitrary data manipulations. We show that an adversary can force any learner to return a biased classifier, with or without degrading accuracy, and that the strength of this bias increases for learning problems with underrepresented protected groups in the data. We also provide upper bounds that match these hardness results up to constant factors, by proving that two natural learning algorithms achieve order-optimal guarantees in terms of both accuracy and fairness under adversarial data manipulations.

adversary, fairness-aware learning, probability, (13 more...)

arXiv.org Machine Learning

2102.06004

Country:

Europe > Austria (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Add feedback

Agnostic Proper Learning of Halfspaces under Gaussian Marginals

Diakonikolas, Ilias, Kane, Daniel M., Kontonis, Vasilis, Tzamos, Christos, Zarifis, Nikos

arXiv.org Machine LearningFeb-10-2021

We study the problem of agnostically learning halfspaces under the Gaussian distribution. Our main result is the {\em first proper} learning algorithm for this problem whose sample complexity and computational complexity qualitatively match those of the best known improper agnostic learner. Building on this result, we also obtain the first proper polynomial-time approximation scheme (PTAS) for agnostically learning homogeneous halfspaces. Our techniques naturally extend to agnostically learning linear models with respect to other non-linear activations, yielding in particular the first proper agnostic algorithm for ReLU regression.

algorithm, halfspace, polynomial, (16 more...)

arXiv.org Machine Learning

2102.05629

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.67)

Add feedback

Patterns, predictions, and actions: A story about machine learning

Hardt, Moritz, Recht, Benjamin

arXiv.org Machine LearningFeb-9-2021

This graduate textbook on machine learning tells a story of how patterns in data support predictions and consequential actions. Starting with the foundations of decision making, we cover representation, optimization, and generalization as the constituents of supervised learning. A chapter on datasets as benchmarks examines their histories and scientific bases. Self-contained introductions to causality, the practice of causal inference, sequential decision making, and reinforcement learning equip the reader with concepts and tools to reason about actions and their consequences. Throughout, the text discusses historical context and societal impact. We invite readers from all backgrounds; some experience with probability, calculus, and linear algebra suffices.

law enforcement, simultaneous perturbation gradient approximation, vascular disease, (35 more...)

arXiv.org Machine Learning

2102.05242

Country:

North America > United States > California (0.67)
Europe > United Kingdom > England (0.28)
North America > United States > Texas (0.13)
North America > United States > Pennsylvania (0.13)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > New Finding (1.00)
(4 more...)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
(10 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
(10 more...)

Add feedback