AITopics | Computational Learning Theory

Collaborating Authors

Computational Learning Theory

In computer science, computational learning theory (or just learning theory) is a subfield of Artificial Intelligence devoted to studying the design and analysis of machine learning algorithms (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

On the Power of Localized Perceptron for Label-Optimal Learning of Halfspaces with Adversarial Noise

Shen, Jie

arXiv.org Machine LearningDec-19-2020

We study {\em online} active learning of homogeneous $s$-sparse halfspaces in $\mathbb{R}^d$ with adversarial noise \cite{kearns1992toward}, where the overall probability of a noisy label is constrained to be at most $\nu$ and the marginal distribution over unlabeled data is unchanged. Our main contribution is a state-of-the-art online active learning algorithm that achieves near-optimal attribute efficiency, label and sample complexity under mild distributional assumptions. In particular, under the conditions that the marginal distribution is isotropic log-concave and $\nu = \Omega(\epsilon)$, where $\epsilon \in (0, 1)$ is the target error rate, we show that our algorithm PAC learns the underlying halfspace in polynomial time with near-optimal label complexity bound of $\tilde{O}\big(s \cdot polylog(d, \frac{1}{\epsilon})\big)$ and sample complexity bound of $\tilde{O}\big(\frac{s}{\epsilon} \cdot polylog(d)\big)$. Prior to this work, existing online algorithms designed for tolerating the adversarial noise are either subject to label complexity polynomial in $d$ or $\frac{1}{\epsilon}$, or work under the restrictive uniform marginal distribution. As an immediate corollary of our main result, we show that under the more challenging agnostic model \cite{kearns1992toward} where no assumption is made on the noise rate, our active learner achieves an error rate of $O(OPT) + \epsilon$ with the same running time and label and sample complexity, where $OPT$ is the best possible error rate achievable by any homogeneous $s$-sparse halfspace. Our algorithm builds upon the celebrated Perceptron while leveraging novel localized sampling and semi-random gradient update to tolerate the adversarial noise. We believe that our algorithmic design and analysis are of independent interest, and may shed light on learning halfspaces with broader noise models.

algorithm, complexity, learning, (14 more...)

arXiv.org Machine Learning

2012.10793

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.70)

Add feedback

Communication-Aware Collaborative Learning

Blum, Avrim, Heinecke, Shelby, Reyzin, Lev

arXiv.org Machine LearningDec-18-2020

Algorithms for noiseless collaborative PAC learning have been analyzed and optimized in recent years with respect to sample complexity. In this paper, we study collaborative PAC learning with the goal of reducing communication cost at essentially no penalty to the sample complexity. We develop communication efficient collaborative PAC learning algorithms using distributed boosting. We then consider the communication cost of collaborative learning in the presence of classification noise. As an intermediate step, we show how collaborative PAC learning algorithms can be adapted to handle classification noise. With this insight, we develop communication efficient algorithms for collaborative PAC learning robust to classification noise.

complexity, learning, sample complexity, (12 more...)

arXiv.org Machine Learning

2012.10569

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Plymouth County > Norwell (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

On the Complexity of Learning a Class Ratio from Unlabeled Data

Fish, Benjamin, Reyzin, Lev

Journal of Artificial Intelligence ResearchDec-17-2020

In the problem of learning a class ratio from unlabeled data, which we call CR learning, the training data is unlabeled, and only the ratios, or proportions, of examples receiving each label are given. The goal is to learn a hypothesis that predicts the proportions of labels on the distribution underlying the sample. This model of learning is applicable to a wide variety of settings, including predicting the number of votes for candidates in political elections from polls. In this paper, we formally define this class and resolve foundational questions regarding the computational complexity of CR learning and characterize its relationship to PAC learning. Among our results, we show, perhaps surprisingly, that for finite VC classes what can be efficiently CR learned is a strict subset of what can be learned efficiently in PAC, under standard complexity assumptions. We also show that there exist classes of functions whose CR learnability is independent of ZFC, the standard set theoretic axioms. This implies that CR learning cannot be easily characterized (like PAC by VC dimension).

proportion, subset, vc dimension, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12013

AI Access Foundation

12013

Journal of Artificial Intelligence Research

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(10 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

Hardness of Learning Halfspaces with Massart Noise

Diakonikolas, Ilias, Kane, Daniel M.

arXiv.org Machine LearningDec-17-2020

We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise. Specifically, given labeled examples $(x, y)$ from a distribution $D$ on $\mathbb{R}^{n} \times \{ \pm 1\}$ such that the marginal distribution on $x$ is arbitrary and the labels are generated by an unknown halfspace corrupted with Massart noise at rate $\eta<1/2$, we want to compute a hypothesis with small misclassification error. Characterizing the efficient learnability of halfspaces in the Massart model has remained a longstanding open problem in learning theory. Recent work gave a polynomial-time learning algorithm for this problem with error $\eta+\epsilon$. This error upper bound can be far from the information-theoretically optimal bound of $\mathrm{OPT}+\epsilon$. More recent work showed that {\em exact learning}, i.e., achieving error $\mathrm{OPT}+\epsilon$, is hard in the Statistical Query (SQ) model. In this work, we show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by a polynomial-time SQ algorithm. In particular, our lower bound implies that no efficient SQ algorithm can approximate the optimal error within any polynomial factor.

algorithm, halfspace, proceedings, (17 more...)

arXiv.org Machine Learning

2012.0972

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Notes on Deep Learning Theory

Golikov, Eugene A.

arXiv.org Artificial IntelligenceDec-10-2020

These are the notes for the lectures that I was giving during Fall 2020 at the Moscow Institute of Physics and Technology (MIPT) and at the Yandex School of Data Analysis (YSDA). The notes cover some aspects of initialization, loss landscape, generalization, and a neural tangent kernel theory. While many other topics (e.g. expressivity, a mean-field theory, a double descent phenomenon) are missing in the current version, we plan to add them in future revisions.

initialization, kernel, matrix, (16 more...)

arXiv.org Artificial Intelligence

2012.0576

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.24)
Asia > Middle East > Jordan (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.64)

Add feedback

On Irrelevant Literals in Pseudo-Boolean Constraint Learning

Berre, Danel Le, Marquis, Pierre, Mengel, Stefan, Wallon, Romain

arXiv.org Artificial IntelligenceDec-8-2020

Learning pseudo-Boolean (PB) constraints in PB solvers exploiting cutting planes based inference is not as well understood as clause learning in conflict-driven clause learning solvers. In this paper, we show that PB constraints derived using cutting planes may contain \emph{irrelevant literals}, i.e., literals whose assigned values (whatever they are) never change the truth value of the constraint. Such literals may lead to infer constraints that are weaker than they should be, impacting the size of the proof built by the solver, and thus also affecting its performance. This suggests that current implementations of PB solvers based on cutting planes should be reconsidered to prevent the generation of irrelevant literals. Indeed, detecting and removing irrelevant literals is too expensive in practice to be considered as an option (the associated problem is NP-hard.

constraint, pb solver, solver, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.24963/ijcai.2020/160

2012.04424

Country: Europe > France > Hauts-de-France (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Add feedback

Mapping Network States Using Connectivity Queries

Rodríguez, Alexander, Adhikari, Bijaya, González, Andrés D., Nicholson, Charles, Vullikanti, Anil, Prakash, B. Aditya

arXiv.org Artificial IntelligenceDec-7-2020

Can we infer all the failed components of an infrastructure network, given a sample of reachable nodes from supply nodes? One of the most critical post-disruption processes after a natural disaster is to quickly determine the damage or failure states of critical infrastructure components. However, this is non-trivial, considering that often only a fraction of components may be accessible or observable after a disruptive event. Past work has looked into inferring failed components given point probes, i.e. with a direct sample of failed components. In contrast, we study the harder problem of inferring failed components given partial information of some `serviceable' reachable nodes and a small sample of point probes, being the first often more practical to obtain. We formulate this novel problem using the Minimum Description Length (MDL) principle, and then present a greedy algorithm that minimizes MDL cost effectively. We evaluate our algorithm on domain-expert simulations of real networks in the aftermath of an earthquake. Our algorithm successfully identify failed components, especially the critical ones affecting the overall system performance.

connectivity query, demand node, node, (16 more...)

arXiv.org Artificial Intelligence

2012.03413

Country:

North America > United States > California (0.14)
North America > United States > Virginia (0.04)
North America > United States > Tennessee > Shelby County (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.48)

Add feedback

Data Science & Machine Learning(Theory+Projects)A-Z 90 HOURS

#artificialintelligenceDec-6-2020, 17:30:37 GMT

Electrification was, without a doubt, the greatest engineering marvel of the 20th century. The electric motor was invented way back in 1821, and the electrical circuit was mathematically analyzed in 1827. But factory electrification, household electrification, and railway electrification all started slowly several decades later. The field of AI was formally founded in 1956. But it's only now--more than six decades later--that AI is expected to revolutionize the way humanity will live and work in the coming decades.

data science, electrification, machine learning, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.85)

Add feedback

GpuShareSat: a SAT solver using the GPU for clause sharing

Prevot, Nicolas

arXiv.org Artificial IntelligenceDec-5-2020

We describe a SAT solver using both the GPU (CUDA) and the CPU with a new clause exchange strategy. The CPU runs a classic multithreaded CDCL SAT solver. EachCPU thread exports all the clauses it learns to the GPU. The GPU makes a heavy usage of bitwise operations. It notices when a clause would have been used by a CPU thread and notifies that thread, in which case it imports that clause. This relies on the GPU repeatedly testing millions of clauses against hundreds of assignments. All the clauses are tested independantly from each other (which allows the GPU massively parallel approach), but against all the assignments at once, using bitwise operations. This allows CPU threads to only import clauses which would have been useful for them. Our solver is based upon glucose-syrup. Experiments show that this leads to a strong performance improvement, with 22 more instances solved on the SAT 2020 competition than glucose-syrup.

aggregate assignment, assignment, solver, (15 more...)

arXiv.org Artificial Intelligence

2012.03119

Genre: Research Report (0.40)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.84)

Add feedback

Mint: MDL-based approach for Mining INTeresting Numerical Pattern Sets

Makhalova, Tatiana, Kuznetsov, Sergei O., Napoli, Amedeo

arXiv.org Artificial IntelligenceNov-30-2020

Pattern mining is well established in data mining research, especially for mining binary datasets. Surprisingly, there is much less work about numerical pattern mining and this research area remains under-explored. In this paper, we propose Mint, an efficient MDL-based algorithm for mining numerical datasets. The MDL principle is a robust and reliable framework widely used in pattern mining, and as well in subgroup discovery. In Mint we reuse MDL for discovering useful patterns and returning a set of non-redundant overlapping patterns with well-defined boundaries and covering meaningful groups of objects. Mint is not alone in the category of numerical pattern miners based on MDL. In the experiments presented in the paper we show that Mint outperforms competitors among which Slim and RealKrimp.

dataset, discretization, mint, (15 more...)

arXiv.org Artificial Intelligence

2011.14843

Country:

North America > United States > Wisconsin (0.04)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.49)

Add feedback