You can probably use deep learning even if your data isn't that big


Deep learning models are complex and tricky to train, and I had a hunch that lack of model convergence/difficulties training probably explained the poor performance, not overfitting. We recreated python versions of the Leekasso and MLP used in the original post to the best of our ability, and the code is available here. The MLP used in the original analysis still looks pretty bad for small sample sizes, but our neural nets get essentially perfect accuracy for all sample sizes. A lot of parameters are problem specific (especially the parameters related to SGD) and poor choices will result in misleadingly bad performance.

Adaptive Processes and Control

Classics (Collection 2)

"... software technology has promulgated a series of failed promises.... In bugs are pervasive, and there are no robust platforms underneath." Here we present a number of Elementary Adaptive Modules (EAMs), treating them as the basic building blocks of adaptive agent systems, with a discussion of their use, their control, and their behaviors under different conditions; we will also discuss the host of problems that we expect to run into. We want to explore how to write software that can improve itself, and keep improving itself, far beyond mere simple programmed adaptation of pre-specified parameters. We want to test heuristically how to build and use hierarchical adaptive control structures in software.

Report 77-39.pdf

Classics (Collection 2)

Department of Electrical Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, 24601. 1 /Introduction Giving a machine the ability to learn, adapt, organize or repair itself are among the oldest and most ambitious goals of computer science. In the early days of computing, these goals were central to the new discipline called cybernetics [126], [2]. Over the past two decades, progress toward these goals has come from a variety of fields - notably computer science, psychology, adaptive control theory, pattern recognition, and philosophy. Substantial progress has been made in developing techniques for machine learning in highly restricted environments. Each of these programs, however, is tailored to its particular task, taking advantage of particular assumptions and characteristics associated with its domain.

Report 77 14 Stanford KSL

Classics (Collection 2)

A Model For Learning Systems STAN-CS-77-605 Heuristic Programming Project Memo 77-14 Reid G. Smith, Tom M. Mitchell Richard A. Chestek and Bruce G. Buchanan ABSTRACT A model for learnina systems is presented, and representative Al, pattern recognition, and control systems are discussed in terms of its framework. The model details the functional components felt to be essential for any learning system, independent of the techniques used for its construction, and the specific environment In which it operates. These components are performance element, instance selector, critic, learning element, blackboard, and world model. Consideration of learning system design leads naturally to the concept of a layered system, each layer operating at a different level of abstraction. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies, either express or implied, of the Defense Advanced Research ...


Classics (Collection 2)

As we began developing the first few rules for MYCIN, it became clear that the rules we were obtaining from our collaborating experts differed from DENDRAL's situation-action rules in an important way--the inferences described were often uncertain. Cohen and Axline used words such as "suggests" or "lends credence to" in describing the effect of a set of observations on the corresponding conclusion. It seemed clear that we needed to handle probabilistic statements in our rules and to develop a mechanism for gathering evidence for and against a hypothesis when two or more relevant rules were successfully executed. It is interesting to speculate on why this problem did not arise in the DENDRAL domain. In retrospect, we suspect it is related to the inherent complexity of biological as opposed to artificial systems.