Computational Learning Theory
Analysis of a greedy active learning strategy
We abstract out the core search problem of active learning schemes, to better understand the extent to which adaptive labeling can improve sample complexity. We give various upper and lower bounds on the number of labels which need to be queried, and we prove that a popular greedy active learning rule is approximately as good as any other strategy for minimizing this number of labels.
Analysis of a greedy active learning strategy
We abstract out the core search problem of active learning schemes, to better understand the extent to which adaptive labeling can improve sample complexity. We give various upper and lower bounds on the number of labels which need to be queried, and we prove that a popular greedy active learning rule is approximately as good as any other strategy for minimizing this number of labels.
Analysis of a greedy active learning strategy
We abstract out the core search problem of active learning schemes, to better understand the extent to which adaptive labeling can improve sample complexity.We give various upper and lower bounds on the number of labels which need to be queried, and we prove that a popular greedy active learning rule is approximately as good as any other strategy for minimizing this number of labels.
The 2005 AAAI Classic Paper Awards
Mitchell, Tom, Levesque, Hector
Twenty years later that link is firmly established, and the two research communities have largely merged into one. Problem," by Steve Hanks and Drew Mc-or does not hold after a sequence of learning was the "inductive bias" of a The idea is this: Normally, an object learn the target concept--the more in 1986, helped initiate a very constraining the inductive bias, the is unaffected by an action. If a window fruitful integration of a branch of machine less training data needed. Starting in the of PAC learning was being developed, an action. There are clear exceptions, 1950s, with work like Samuels's program which allowed deriving quantitative however, such as the act of closing the that learned strategies for playing bounds on the probability of window. A variety of formal systems checkers, AI researchers had designed successful learning as a function of the have been proposed that would allow and experimented with a number of training examples and the us to infer in the absence of conflicting variety of learning algorithms and complexity of the learner's hypothesis information that the window remains had also developed a number of theoretical space (as measured by its Vapnik-open (or that a polar bear is results, such as convergence Chervonenkis dimension). What white or that a violin has four strings, proofs for perceptrons and "learning Haussler's paper did was help introduce and so on).
Towards Understanding and Harnessing the Potential of Clause Learning
Beame, P., Kautz, H., Sabharwal, A.
Efficient implementations of DPLL with the addition of clause learning are the fastest complete Boolean satisfiability solvers and can handle many significant real-world problems, such as verification, planning and design. Despite its importance, little is known of the ultimate strengths and limitations of the technique. This paper presents the first precise characterization of clause learning as a proof system (CL), and begins the task of understanding its power by relating it to the well-studied resolution proof system. In particular, we show that with a new learning scheme, CL can provide exponentially shorter proofs than many proper refinements of general resolution (RES) satisfying a natural property. These include regular and Davis-Putnam resolution, which are already known to be much stronger than ordinary DPLL. We also show that a slight variant of CL with unlimited restarts is as powerful as RES itself. Translating these analytical results to practice, however, presents a challenge because of the nondeterministic nature of clause learning algorithms. We propose a novel way of exploiting the underlying problem structure, in the form of a high level problem description such as a graph or PDDL specification, to guide clause learning algorithms toward faster solutions. We show that this leads to exponential speed-ups on grid and randomized pebbling problems, as well as substantial improvements on certain ordering formulas.
On the Convergence Speed of MDL Predictions for Bernoulli Sequences
We consider the Minimum Description Length principle for online sequence prediction. If the underlying model class is discrete, then the total expected square loss is a particularly interesting performance measure: (a) this quantity is bounded, implying convergence with probability one, and (b) it additionally specifies a `rate of convergence'. Generally, for MDL only exponential loss bounds hold, as opposed to the linear bounds for a Bayes mixture. We show that this is even the case if the model class contains only Bernoulli distributions. We derive a new upper bound on the prediction error for countable Bernoulli classes. This implies a small bound (comparable to the one for Bayes mixtures) for certain important model classes. The results apply to many Machine Learning tasks including classification and hypothesis testing. We provide arguments that our theorems generalize to countable classes of i.i.d. models.
Categorization Under Complexity: A Unified MDL Account of Human Learning of Regular and Irregular Categories
We present an account of human concept learning-that is, learning of categories from examples-based on the principle of minimum description length (MDL). In support of this theory, we tested a wide range of two-dimensional concept types, including both regular (simple) and highly irregular (complex) structures, and found the MDL theory to give a good account of subjects' performance. This suggests that the intrinsic complexity ofa concept (that is, its description -length) systematically influences its leamability.