Rote Learning


A Closer Look at Memorization in Deep Networks

arXiv.org Machine Learning

We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize learning simple patterns first. In our experiments, we expose qualitative differences in gradient-based optimization of deep neural networks (DNNs) on noise vs. real data. We also demonstrate that with appropriately tuned explicit regularization (e.g., dropout), we can degrade DNN training performance on noise datasets without compromising generalization on real data. Our analysis suggests that dataset-independent notions of effective capacity are unlikely to explain the generalization performance of deep networks trained with gradient-based methods, because the training data itself plays an important role in determining the degree of memorization.
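Noise baselines of the kind used in such memorization experiments are commonly built in one of two ways: shuffling a fraction of the labels, or replacing the inputs themselves with Gaussian noise. A minimal sketch of both corruptions (hypothetical helper names, NumPy only; not the paper's actual experimental code):

```python
import numpy as np

def randomize_labels(y, frac, num_classes, rng):
    """Resample a fraction of the labels uniformly at random (hypothetical helper)."""
    y = y.copy()
    idx = rng.choice(len(y), size=int(frac * len(y)), replace=False)
    y[idx] = rng.integers(0, num_classes, size=len(idx))
    return y

def gaussian_inputs_like(X, rng):
    """Replace real inputs with Gaussian noise matching the data's mean and scale."""
    return rng.normal(X.mean(), X.std(), size=X.shape)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))            # stand-in for real inputs
y = rng.integers(0, 10, size=1000)         # stand-in for real labels
y_noisy = randomize_labels(y, frac=0.5, num_classes=10, rng=rng)
X_noise = gaussian_inputs_like(X, rng)
```

Training the same architecture on `(X, y)` versus `(X, y_noisy)` or `(X_noise, y)` is what exposes the qualitative optimization differences the abstract describes: real labels are fit quickly via shared patterns, while noise can only be fit by per-example memorization.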


Latent Geometry and Memorization in Generative Models

arXiv.org Machine Learning

It can be difficult to tell whether a trained generative model has learned to generate novel examples or has simply memorized a specific set of outputs. In published work, it is common to attempt to address this visually, for example by displaying a generated example and its nearest neighbor(s) in the training set (in, for example, the L2 metric). As any generative model induces a probability density on its output domain, we propose studying this density directly. We first study the geometry of the latent representation and generator, relate this to the output density, and then develop techniques to compute and inspect the output density. As an application, we demonstrate that "memorization" tends to a density made of delta functions concentrated on the memorized examples. We note that without first understanding the geometry, the measurement would be essentially impossible to make.
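The link between latent geometry and output density that the abstract alludes to is, in one dimension, just the change-of-variables formula p_x(g(z)) = p_z(z) / |g'(z)|. A minimal numerical sketch (illustrative functions of my choosing, not the paper's method):

```python
import numpy as np

def induced_density(z, p_z, g_prime):
    """Density a monotone 1-D generator g induces at the output x = g(z):
    change of variables gives p_x(g(z)) = p_z(z) / |g'(z)|."""
    return p_z(z) / np.abs(g_prime(z))

# Illustrative generator g(z) = 2z with a standard-normal latent z.
p_z = lambda z: np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)
g_prime = lambda z: 2.0

# Where |g'| > 1 the generator stretches latent space and the output density
# thins out; a near-zero |g'| concentrates mass, which is the delta-like
# "memorization" regime the paper identifies.
density_at_0 = induced_density(0.0, p_z, g_prime)
```

The toy case makes the qualitative point: memorization shows up geometrically as regions where the generator's derivative (Jacobian, in higher dimensions) collapses, piling probability mass onto a few outputs.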


Ablow: Got kids? Apologize

FOX News

Nearly 50 million students are now returning to classrooms--from kindergarten through 12th grade. They will spend approximately eight hours a day at school and additional hours doing homework. They will be educated, in public schools alone, by the equivalent of over 3 million full-time teachers. And they will, with rare exception, learn a dismal fraction of what they ought to be learning to be creative, confident and critical thinkers about themselves and the world around them. As a parent myself, I literally apologized to each of my children--and not just once--for the fact that so much of their time as grade school and junior high school and high school students (even at private school) was being spent on memorization, regurgitation and rote learning that amounted to busy work and the warehousing of them, physically and mentally.


Global education experts urge Japan to look beyond rote learning

The Japan Times

DUBAI – The teaching methods of Kazuya Takahashi, 35, using Lego blocks and speaking entirely in English, may not be the norm in the Japanese education system. But on a global level, the educator, who teaches at the Kogakuin junior high and high schools in Hachioji, western Tokyo, is considered ahead of the game and has won recognition for his efforts to promote global citizenship. His methods may provide clues as to where education should be heading in Japan, a nation often criticized for focusing on cramming knowledge rather than encouraging critical thinking. At the Global Education and Skills Forum in Dubai, which ran for two days from March 12, Takahashi gave a presentation as one of the 10 finalists for the Global Teacher Prize, known in the industry as the Nobel Prize of education. The event was attended by around 1,600 people from 110 nations.


Machine Intelligence 3

Classics

Note: PDF of full volume downloadable by clicking on title above (26 MB). Selected individual chapters available from the links below.

CONTENTS

INTRODUCTION

MATHEMATICAL FOUNDATIONS
1 The morphology of prex—an essay in meta-algorithmics. J. LASKI 3
2 Program schemata. M. S. PATERSON 19
3 Language definition and compiler validation. J. J. FLORENTIN 33
4 Placing trees in lexicographic order. H. I. SCOINS 43

THEOREM PROVING
5 A new look at mathematics and its mechanization. B. MELTZER 63
6 Some notes on resolution strategies. B. MELTZER 71
7 The generalized resolution principle. J. A. ROBINSON 77
8 Some tree-paring strategies for theorem proving. D. LUCKHAM 95
9 Automatic theorem proving with equality substitutions and mathematical induction. J. L. DARLINGTON 113

MACHINE LEARNING AND HEURISTIC PROGRAMMING
10 On representations of problems of reasoning about actions. S. AMAREL 131
11 Descriptions. E. W. ELCOCK 173
12 Kalah on Atlas. A. G. BELL 181
13 Experiments with a pleasure-seeking automaton. J. E. DORAN 195
14 Collective behaviour and control problems. V. I. VARSHAVSKY 217

MAN—MACHINE INTERACTION
15 A comparison of heuristic, interactive, and unaided methods of solving a shortest-route problem. D. MICHIE, J. G. FLEMING and J. V. OLDFIELD 245
16 Interactive programming at Carnegie Tech. A. H. BOND 257
17 Maintenance of large computer systems—the engineer's assistant. M. H. J. BAYLIS 269

COGNITIVE PROCESSES: METHODS AND MODELS
18 The syntactic analysis of English by machine. J. P. THORNE, P. BRATLEY and H. DEWAR 281
19 The adaptive memorization of sequences. H. C. LONGUET-HIGGINS and A. ORTONY 311

PATTERN RECOGNITION
20 An application of graph theory in pattern recognition. C. J. HILDITCH 325

PROBLEM-ORIENTED LANGUAGES
21 Some semantics for data structures. D. PARK 351
22 Writing search algorithms in functional form. R. M. BURSTALL 373
23 Assertions: programs written without specifying unnecessary order. J. M. FOSTER 387
24 The design philosophy of Pop-2. R. J. POPPLESTONE 393

INDEX 403

Machine Intelligence Workshop


The simulation of verbal learning behavior

Classics

The purpose of this report is to describe in detail an information-processing model of elementary human symbolic learning processes. This model is realized by a computer program called the Elementary Perceiver and Memorizer (EPAM). The EPAM program is the precise statement of an information-processing theory of verbal learning that provides an alternative to other verbal-learning theories which have been proposed. It is the result of an attempt to state quite precisely a parsimonious and plausible mechanism sufficient to account for the rote learning of nonsense syllables. The critical evaluation of EPAM must ultimately depend not upon the interest which it may have as a learning machine, but upon its ability to explain and predict the phenomena of verbal learning. Proceedings of the Western Joint Computer Conference, 1961, 19:121-132. Reprinted in Feigenbaum & Feldman, Computers and Thought (1963).
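EPAM's central mechanism is a discrimination net: a tree whose internal nodes test features of a stimulus and whose leaves hold memorized items, grown incrementally as new syllables are learned. A toy sketch of that idea (loosely inspired by EPAM; the node structure, growth rule, and names here are my simplifications, not Feigenbaum's actual program):

```python
class Node:
    """Toy discrimination-net node: an internal node tests one letter
    position; a leaf stores a memorized stimulus-response pair."""
    def __init__(self):
        self.pos = None        # letter position tested at this node
        self.children = {}     # letter at self.pos -> child Node
        self.item = None       # (stimulus, response) if this is a leaf

def insert(node, stimulus, response):
    if node.item is None and not node.children:
        node.item = (stimulus, response)          # empty leaf: store here
        return
    if node.item is not None:                     # occupied leaf: grow a test
        old_s, old_r = node.item
        # discriminate on the first position where the two stimuli differ
        node.pos = next(i for i in range(len(stimulus))
                        if stimulus[i] != old_s[i])
        node.item = None
        _descend(node, old_s, old_r)
    _descend(node, stimulus, response)

def _descend(node, stimulus, response):
    child = node.children.setdefault(stimulus[node.pos], Node())
    insert(child, stimulus, response)

def recall(node, stimulus):
    """Walk the net by the tested letters; None if recognition fails."""
    while node.item is None:
        node = node.children.get(stimulus[node.pos])
        if node is None:
            return None
    return node.item[1]

net = Node()
for syllable, response in [("DAX", "one"), ("GEX", "two"), ("DAR", "three")]:
    insert(net, syllable, response)
```

The design choice that echoes EPAM is that the net stores only enough tests to tell the learned items apart, so similar nonsense syllables (here "DAX" and "DAR") force finer discriminations, while an unfamiliar stimulus simply fails to be recognized.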