Education
Optimal computational and statistical rates of convergence for sparse nonconvex learning problems
Wang, Zhaoran, Liu, Han, Zhang, Tong
We provide a theoretical analysis of the statistical and computational properties of penalized $M$-estimators that can be formulated as the solution to a possibly nonconvex optimization problem. Many important estimators fall into this category, including least squares regression with nonconvex regularization, generalized linear models with nonconvex regularization, and sparse elliptical random design regression. For these problems, computing the global solution is intractable because of the nonconvex formulation. In this paper, we propose an approximate regularization path-following method for solving a variety of learning problems with nonconvex objective functions. Under a unified analytic framework, we simultaneously provide explicit statistical and computational rates of convergence for any local solution attained by the algorithm. Computationally, our algorithm attains a global geometric rate of convergence for calculating the full regularization path, which is optimal among all first-order algorithms. Unlike most existing methods, which attain geometric rates of convergence only for a single regularization parameter, our algorithm calculates the full regularization path with the same iteration complexity. In particular, we provide a refined iteration complexity bound that sharply characterizes the performance of each stage along the regularization path. Statistically, we provide a sharp sample complexity analysis for all the approximate local solutions along the regularization path. Our analysis improves upon existing results by providing a more refined sample complexity bound as well as an exact support recovery result for the final estimator. These results show that the final estimator attains an oracle statistical property owing to the use of a nonconvex penalty.
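As a rough illustration of the path-following idea described in this abstract, the sketch below solves a sequence of penalized least squares problems over a geometrically decreasing grid of regularization parameters, warm-starting each stage from the previous solution. It is a minimal sketch under stated assumptions, not the authors' algorithm: the MCP penalty, the split of the penalty into an L1 part plus a smooth concave part, the fixed step size, the single tolerance shared by all stages, and every name here (`path_following`, `phi`, `b`, `lam_target`) are choices made for illustration only.

```python
def soft_threshold(u, t):
    """Proximal operator of t * |.| (soft-thresholding)."""
    if u > t:
        return u - t
    if u < -t:
        return u + t
    return 0.0


def mcp_concave_grad(beta_j, lam, b):
    """Derivative of q(t) = MCP(t) - lam * |t|, the smooth concave part of MCP."""
    if abs(beta_j) <= b * lam:
        return -beta_j / b
    return -lam if beta_j > 0 else lam


def grad_least_squares(X, y, beta):
    """Gradient of (1 / (2n)) * ||y - X beta||^2."""
    n, d = len(X), len(beta)
    resid = [sum(X[i][j] * beta[j] for j in range(d)) - y[i] for i in range(n)]
    return [sum(X[i][j] * resid[i] for i in range(n)) / n for j in range(d)]


def prox_grad_stage(X, y, beta, lam, b, step, tol, max_iter):
    """Proximal gradient iterations for one value of lam, starting from beta."""
    for _ in range(max_iter):
        g = grad_least_squares(X, y, beta)
        new = [
            soft_threshold(
                beta[j] - step * (g[j] + mcp_concave_grad(beta[j], lam, b)),
                step * lam,
            )
            for j in range(len(beta))
        ]
        change = max(abs(u - v) for u, v in zip(new, beta))
        beta = new
        if change < tol:
            break
    return beta


def path_following(X, y, lam_target, b=3.0, phi=0.7, tol=1e-8, max_iter=1000):
    """Return [(lam, beta), ...] along a decreasing grid ending at lam_target."""
    n, d = len(X), len(X[0])
    # Smallest lam at which beta = 0 is a stationary point.
    lam0 = max(abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(d))
    # trace(X^T X) / n upper-bounds the largest eigenvalue of X^T X / n;
    # adding 1 / b accounts for the curvature of the concave part.
    lipschitz = sum(X[i][j] ** 2 for i in range(n) for j in range(d)) / n
    step = 1.0 / (lipschitz + 1.0 / b)

    beta = [0.0] * d
    lam = max(lam0, lam_target)
    path = []
    while True:
        beta = prox_grad_stage(X, y, beta, lam, b, step, tol, max_iter)  # warm start
        path.append((lam, list(beta)))
        if lam <= lam_target:
            break
        lam = max(lam * phi, lam_target)
    return path


if __name__ == "__main__":
    # Toy orthogonal design (X^T X / n is the identity); true coefficients (3, 0, 0, -2).
    X = [[2.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0],
         [0.0, 0.0, 2.0, 0.0], [0.0, 0.0, 0.0, 2.0]]
    y = [6.0, 0.0, 0.0, -4.0]
    for lam, beta in path_following(X, y, lam_target=0.5):
        print(round(lam, 4), [round(v, 4) for v in beta])
```

On this noiseless toy design the final stage recovers the nonzero coefficients without shrinkage and keeps the other two at exactly zero, which is the kind of behaviour the abstract attributes to nonconvex penalties. The toy run is only a sanity check of the sketch and says nothing about the rates established in the paper.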
Online Nonparametric Regression with General Loss Functions
Rakhlin, Alexander, Sridharan, Karthik
This paper establishes minimax rates for online regression with arbitrary classes of functions and general losses. We show that below a certain threshold for the complexity of the function class, the minimax rates depend on both the curvature of the loss function and the sequential complexities of the class. Above this threshold, the curvature of the loss does not affect the rates. Furthermore, for the case of square loss, our results point to an interesting phenomenon: whenever sequential and i.i.d. empirical entropies match, the rates for statistical and online learning are the same. In addition to the study of minimax regret, we derive a generic forecaster that enjoys the established optimal rates. We also provide a recipe for designing online prediction algorithms that can be computationally efficient for certain problems. We illustrate the techniques by deriving existing and new forecasters for the case of finite experts and for online linear regression.
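For readers unfamiliar with the finite-experts setting mentioned at the end of this abstract, the sketch below runs the classical exponentially weighted average forecaster under square loss. It is offered as background only and is not claimed to be the forecaster derived in the paper. The function name, the learning rate `eta = 0.5`, and the assumption that predictions and outcomes lie in [0, 1] are all illustrative choices.

```python
import math


def exp_weights_forecast(expert_preds, outcomes, eta=0.5):
    """Exponentially weighted average forecaster under square loss.

    expert_preds[t][i] is expert i's prediction at round t (assumed in [0, 1]);
    outcomes[t] is the outcome revealed after the forecast (assumed in [0, 1]).
    Returns (forecasts, learner_loss, best_expert_loss).
    """
    n_experts = len(expert_preds[0])
    cum_loss = [0.0] * n_experts  # cumulative square loss of each expert
    forecasts = []
    learner_loss = 0.0
    for preds, y in zip(expert_preds, outcomes):
        # Weights proportional to exp(-eta * cumulative loss);
        # subtracting the minimum keeps the exponentials well scaled.
        base = min(cum_loss)
        weights = [math.exp(-eta * (c - base)) for c in cum_loss]
        total = sum(weights)
        y_hat = sum(w * p for w, p in zip(weights, preds)) / total
        forecasts.append(y_hat)
        # The outcome is revealed only after the forecast has been made.
        learner_loss += (y_hat - y) ** 2
        for i, p in enumerate(preds):
            cum_loss[i] += (p - y) ** 2
    return forecasts, learner_loss, min(cum_loss)


if __name__ == "__main__":
    rounds = 50
    expert_preds = [[0.0, 0.5, 1.0] for _ in range(rounds)]  # three constant experts
    outcomes = [1.0 if t % 3 else 0.0 for t in range(rounds)]
    _, learner, best = exp_weights_forecast(expert_preds, outcomes)
    print("regret:", learner - best, "bound:", 2 * math.log(3))
```

With predictions and outcomes in [0, 1], the square loss is exp-concave for `eta` up to 1/2, so the regret of this forecaster against the best of N experts is at most ln(N) / eta, independent of the number of rounds. This illustrates how the curvature of the loss can yield fast rates for small classes, the regime the abstract describes as lying below the complexity threshold.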
Mechanisation of Thought Processes
Biology seems to be a science in its own right, or set of sciences having common aims, and so it should have its own language and explanatory concepts; yet when any specifically biological concept is suggested and used as an explanatory concept it seems to be unsatisfactory and even mystical. There are many biological concepts of this kind: Purpose, Drive, elan vital, Entelechy, Gestalten. Physicists and engineers seem, on the other hand, to have clearly defined concepts having great power within biology.
CABARET: rule interpretation in a hybrid architecture
We focus on realistic, complex domains where the concepts, terms and predicates used by domain rules or by rule-based models are not well-defined. Often, in such inherently ill-defined domains the rules do not encompass all the situations they are asked or assumed to cover, admit tacit exceptions, or can be contradicted and annulled by other rules. Interpretation is therefore required of the terms and predicates used. The law is a prototypical example of such an area, where terms used in legal statutes are not completely defined by legal regulations. The use of case-based reasoning (CBR) to complement and supplement other types of reasoning involves many computational questions of system architecture and control. The key focus of this work is how and when to interleave CBR with other modes of reasoning in the context of applying a rule or model to a new set of facts in light of a corpus of cases of past application. The goal is to generate an explanation or argument as to how the new fact situation might be interpreted. In particular, we report on a system called CABARET (CAse-BAsed REasoning Tool), a hybrid architecture we have built to study and experiment with these issues.
Cognitive Science 2, 361-383 (1978)
He knows about examples and heuristics and how they are related. He has a sense of what to use and when to use it, and what is worth remembering. He has an intuitive feeling for the subject, how it hangs together, and how it relates to other theories. He knows how not to be swamped by details, but also to reference them when he needs them. This paper is concerned with this important extra-logical knowledge that is often outside of traditional discussions in mathematics.