 Computational Learning Theory


Approximation of the Two-Part MDL Code

arXiv.org Artificial Intelligence

Approximation of the optimal two-part MDL code for given data, through successive monotonically length-decreasing two-part MDL codes, has the following properties: (i) computation of each step may take arbitrarily long; (ii) we may not know when we reach the optimum, or whether we will reach the optimum at all; (iii) the sequence of models generated may not monotonically improve the goodness of fit; but (iv) the model associated with the optimum has (almost) the best goodness of fit. To express the practically interesting goodness of fit of individual models for individual data sets we have to rely on Kolmogorov complexity.
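
As a much-simplified illustration of the search described above, the following Python sketch (the Bernoulli model class, the grid size, and the function names are our own illustrative choices, not the paper's general setting) enumerates a finite family of models for a binary string and emits each new best two-part code, producing exactly the kind of monotonically length-decreasing sequence the abstract describes. In this finite setting every step is trivially computable; properties (i)-(iii) only bite in the general setting, where models range over all programs.

import math

def two_part_length(data, index, grid_size):
    # Two-part code: first the model (an index into a parameter grid),
    # then the data encoded with an ideal code under that model.
    model_bits = math.log2(grid_size)
    p = (index + 0.5) / grid_size            # Bernoulli parameter, strictly in (0, 1)
    ones = sum(data)
    zeros = len(data) - ones
    data_bits = -(ones * math.log2(p) + zeros * math.log2(1 - p))
    return model_bits + data_bits

def mdl_search(data, grid_size=64):
    # Emit each model that shortens the best two-part code found so far:
    # a monotonically length-decreasing sequence of two-part codes.
    best = float("inf")
    for index in range(grid_size):
        total = two_part_length(data, index, grid_size)
        if total < best:
            best = total
            yield index, total

data = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
for index, bits in mdl_search(data):
    print(f"model {index}: {bits:.2f} bits")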


Learnability and the doubling dimension

Neural Information Processing Systems

Given a set of classifiers and a distribution over their domain, one can define a metric on the classifiers in which the distance between two of them is the probability that they disagree. We prove bounds on the sample complexity of PAC learning in terms of the doubling dimension of this metric. These bounds imply known bounds on the sample complexity of learning halfspaces with respect to the uniform distribution that are optimal up to a constant factor.
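
For intuition, the doubling dimension of a metric space is the smallest d such that every ball of radius r can be covered by at most 2^d balls of radius r/2. The sketch below is not from the paper; it is a rough empirical estimator of our own, built from greedy r/2-nets on a finite point set.

import math
import random

def cover_count(points, center, r, dist):
    # Greedily pick r/2-separated points inside the ball B(center, r).
    # The size of this greedy net sits between the covering and packing
    # numbers of the ball, so it gives a workable estimate of the cover size.
    ball = [p for p in points if dist(center, p) <= r]
    net = []
    for p in ball:
        if all(dist(p, q) > r / 2 for q in net):
            net.append(p)
    return max(len(net), 1)

def doubling_dimension_estimate(points, radii, dist):
    # Worst case over centers and radii of log2(cover count).
    worst = max(cover_count(points, c, r, dist)
                for c in points for r in radii)
    return math.log2(worst)

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(300)]
print(doubling_dimension_estimate(pts, [0.1, 0.2, 0.4], math.dist))
# Points in the plane have constant doubling dimension, so the
# estimate stays a small constant as the sample grows.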


Universal Intelligence: A Definition of Machine Intelligence

arXiv.org Artificial Intelligence

A fundamental problem in artificial intelligence is that nobody really knows what intelligence is. The problem is especially acute when we need to consider artificial systems which are significantly different from humans. In this paper we approach this problem in the following way: we take a number of well-known informal definitions of human intelligence that have been given by experts, and extract their essential features. These are then mathematically formalised to produce a general measure of intelligence for arbitrary machines. We believe that this equation formally captures the concept of machine intelligence in the broadest reasonable sense. We then show how this formal definition is related to the theory of universal optimal learning agents. Finally, we survey the many other tests and definitions of intelligence that have been proposed for machines.
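
For reference, the measure the paper arrives at, the universal intelligence of an agent \pi, can be written as

    \Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi

where E is a class of computable environments (with suitably bounded total reward), K(\mu) is the Kolmogorov complexity of environment \mu, and V_\mu^\pi is the expected cumulative reward agent \pi obtains in \mu. Weighting by 2^{-K(\mu)} means performance in simpler environments counts for exponentially more.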


From Batch to Transductive Online Learning

Neural Information Processing Systems

It is well-known that everything that is learnable in the difficult online setting, where an arbitrary sequence of examples must be labeled one at a time, is also learnable in the batch setting, where examples are drawn independently from a distribution. We show a result in the opposite direction. We give an efficient conversion algorithm from batch to online that is transductive: it uses future unlabeled data. This demonstrates the equivalence between what is properly and efficiently learnable in the batch model and in the transductive online model.
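
A toy rendering of the conversion's key trick, in Python: to predict the next label, run the batch learner on the labeled past together with the not-yet-seen inputs given uniformly random labels, and follow the hypothesis it returns. Here batch_learn, the data layout, and binary labels are our illustrative assumptions; the paper's actual algorithm and its guarantees involve more care.

import random

def transductive_online(batch_learn, examples, seed=0):
    # `examples` is the full sequence of (x, y); transduction means the
    # inputs are known up front, while labels arrive one at a time.
    rng = random.Random(seed)
    xs = [x for x, _ in examples]
    mistakes, past = 0, []
    for t, (x, y) in enumerate(examples):
        # Hallucinate uniformly random labels on the future inputs.
        hallucinated = [(z, rng.choice([0, 1])) for z in xs[t:]]
        h = batch_learn(past + hallucinated)   # batch_learn: [(x, y)] -> h(x)
        if h(x) != y:
            mistakes += 1
        past.append((x, y))                    # true label revealed afterwards
    return mistakes

def majority_learner(pairs):
    # A trivial stand-in batch learner: predict the majority label.
    ones = sum(y for _, y in pairs)
    label = int(2 * ones >= len(pairs))
    return lambda x: label

data = [(i, 1) for i in range(20)]
print(transductive_online(majority_learner, data))  # few mistakes once real labels accumulate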


Learning from Data of Variable Quality

Neural Information Processing Systems

We initiate the study of learning from multiple sources of limited data, each of which may be corrupted at a different rate. We develop a complete theory of which data sources should be used for two fundamental problems: estimating the bias of a coin, and learning a classifier in the presence of label noise. In both cases, efficient algorithms are provided for computing the optimal subset of data.
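
In the coin-bias setting this reduces to a small computation. The sketch below is in the spirit of the paper's result but uses a generic Hoeffding-style criterion of our own choosing (the function name and the exact bound are illustrative, not the paper's): sort sources by corruption rate, then for each prefix trade off the worst-case bias the corruption can introduce against the sampling deviation from using fewer coins.

import math

def best_prefix(sources, delta=0.05):
    # sources: list of (corruption_rate, num_samples); each corrupted
    # sample is flipped with its source's rate, which shifts the
    # observed mean by at most that rate.
    sources = sorted(sources)                        # cleanest sources first
    best = (float("inf"), 0)
    for k in range(1, len(sources) + 1):
        n = sum(m for _, m in sources[:k])
        bias = sum(eta * m for eta, m in sources[:k]) / n
        dev = math.sqrt(math.log(2 / delta) / (2 * n))   # Hoeffding deviation
        best = min(best, (bias + dev, k))
    bound, k = best
    return sources[:k], bound

# Adding the huge-but-noisy third source is not worth it here:
print(best_prefix([(0.0, 20), (0.05, 200), (0.4, 10**6)]))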



Analysis of a greedy active learning strategy

Neural Information Processing Systems

We abstract out the core search problem of active learning schemes, to better understand the extent to which adaptive labeling can improve sample complexity. We give various upper and lower bounds on the number of labels which need to be queried, and we prove that a popular greedy active learning rule is approximately as good as any other strategy for minimizing this number of labels.
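
The greedy rule in question is, roughly, version-space bisection: query the unlabeled point whose answer, whichever way it comes back, eliminates as close to half of the remaining consistent hypotheses as possible. A minimal sketch for a finite hypothesis class (the function names and the threshold example are ours):

def greedy_active_learning(hypotheses, pool, oracle):
    # Keep only hypotheses consistent with the answers so far;
    # always query the point that splits them most evenly.
    version_space = list(hypotheses)
    pool = list(pool)
    queries = 0
    while len(version_space) > 1 and pool:
        def split(x):  # size of the smaller side of the induced split
            pos = sum(h(x) for h in version_space)
            return min(pos, len(version_space) - pos)
        x = max(pool, key=split)
        if split(x) == 0:          # no remaining point distinguishes anything
            break
        y = oracle(x)              # ask the labeler
        version_space = [h for h in version_space if h(x) == y]
        pool.remove(x)
        queries += 1
    return version_space, queries

points = list(range(10))
hypotheses = [lambda x, t=t: int(x >= t) for t in range(11)]   # threshold classifiers
vs, q = greedy_active_learning(hypotheses, points, hypotheses[7])
print(len(vs), q)   # one hypothesis left after about log2(11) ~ 4 queries, not 10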



The 2005 AAAI Classic Paper Awards

AI Magazine

Twenty years later that link is firmly established, and the two research communities have largely merged into one. One of the two papers honored, "Default Reasoning, Nonmonotonic Logics, and the Frame Problem" by Steve Hanks and Drew McDermott, concerns inferring whether a fact does or does not hold after a sequence of actions. The idea is this: normally, an object is unaffected by an action. If a window is open, it remains open after an action. There are clear exceptions, however, such as the act of closing the window. A variety of formal systems have been proposed that would allow us to infer, in the absence of conflicting information, that the window remains open (or that a polar bear is white, or that a violin has four strings, and so on).

The other, David Haussler's paper, published in 1986, helped initiate a very fruitful integration of a branch of machine learning with the theory of PAC learning, which was then being developed. Central to that work was the "inductive bias" of a learner: the more constraining the inductive bias, the less training data is needed to learn the target concept. Starting in the 1950s, with work like Samuel's program that learned strategies for playing checkers, AI researchers had designed and experimented with a variety of learning algorithms and had also developed a number of theoretical results, such as convergence proofs for perceptrons and "learning in the limit." PAC learning allowed deriving quantitative bounds on the probability of successful learning as a function of the number of training examples and the complexity of the learner's hypothesis space (as measured by its Vapnik-Chervonenkis dimension). What Haussler's paper did was help introduce this framework to the AI community.


Towards Understanding and Harnessing the Potential of Clause Learning

Journal of Artificial Intelligence Research

Efficient implementations of DPLL with the addition of clause learning are the fastest complete Boolean satisfiability solvers and can handle many significant real-world problems, such as verification, planning and design. Despite its importance, little is known of the ultimate strengths and limitations of the technique. This paper presents the first precise characterization of clause learning as a proof system (CL), and begins the task of understanding its power by relating it to the well-studied resolution proof system. In particular, we show that with a new learning scheme, CL can provide exponentially shorter proofs than many proper refinements of general resolution (RES) satisfying a natural property. These include regular and Davis-Putnam resolution, which are already known to be much stronger than ordinary DPLL. We also show that a slight variant of CL with unlimited restarts is as powerful as RES itself. Translating these analytical results to practice, however, presents a challenge because of the nondeterministic nature of clause learning algorithms. We propose a novel way of exploiting the underlying problem structure, in the form of a high level problem description such as a graph or PDDL specification, to guide clause learning algorithms toward faster solutions. We show that this leads to exponential speed-ups on grid and randomized pebbling problems, as well as substantial improvements on certain ordering formulas.
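
To make the object of study concrete: below is a deliberately naive Python sketch of DPLL with clause learning, using the simplest "decision scheme" (on a conflict, learn the clause forbidding the current combination of decisions). Real solvers, and the CL proof system analyzed in the paper, use far more refined learning schemes (such as 1UIP) plus restarts; all names here are ours.

def unit_propagate(clauses, assignment):
    # Repeatedly assign the last free literal of any clause whose other
    # literals are all false; return None if some clause is falsified.
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue                      # clause already satisfied
            free = [lit for lit in clause if -lit not in assignment]
            if not free:
                return None                   # conflict
            if len(free) == 1:
                assignment.add(free[0])
                changed = True
    return assignment

def dpll_cl(clauses, learned, assignment=frozenset(), decisions=()):
    # Literals are nonzero ints (DIMACS style); `learned` grows in place,
    # so conflict clauses found in one branch prune the others.
    assignment = set(assignment)
    if unit_propagate(clauses + learned, assignment) is None:
        if decisions:
            # Decision scheme: this combination of decisions is contradictory.
            learned.append([-d for d in decisions])
        return None
    variables = {abs(l) for c in clauses for l in c}
    free_vars = variables - {abs(l) for l in assignment}
    if not free_vars:
        return assignment                     # satisfying assignment found
    v = min(free_vars)
    for lit in (v, -v):
        model = dpll_cl(clauses, learned, assignment | {lit}, decisions + (lit,))
        if model is not None:
            return model
    return None

cnf = [[1, 2], [-1, 2], [1, -2], [-1, -2]]    # classically unsatisfiable
learned = []
print(dpll_cl(cnf, learned))                  # None: UNSAT
print(learned)                                # conflict clauses picked up en route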