Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems
Jaakkola, Tommi, Singh, Satinder P., Jordan, Michael I.
Increasing attention has been paid to reinforcement learning algorithms in recent years, partly due to successes in the theoretical analysis of their behavior in Markov environments. If the Markov assumption is removed, however, neither the algorithms nor the analyses generally continue to be usable. We propose and analyze a new learning algorithm to solve a certain class of non-Markov decision problems. Our algorithm applies to problems in which the environment is Markov, but the learner has restricted access to state information. The algorithm involves a Monte-Carlo policy evaluation combined with a policy improvement method that is similar to that of Markov decision problems and is guaranteed to converge to a local maximum. The algorithm operates in the space of stochastic policies, a space which can yield a policy that performs considerably better than any deterministic policy. Although the space of stochastic policies is continuous, even for a discrete action space, our algorithm is computationally tractable.
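As a hedged illustration of why a stochastic policy can outperform every deterministic one when the learner has restricted access to state information, the sketch below uses a hypothetical two-state environment in which both states look identical to the learner. The environment, the reward values, and the names `step`, `monte_carlo_average_reward`, and `p_action0` are assumptions made for illustration and do not come from the abstract. The code only performs a Monte-Carlo evaluation of a fixed memoryless policy; it is not the authors' algorithm and contains no policy improvement step.

```python
import random

def step(state, action):
    """Hypothetical dynamics: the 'right' action equals the state index.

    Taking it earns +1 and switches the state; the other action
    earns -1 and leaves the state unchanged.
    """
    if action == state:
        return 1 - state, 1.0
    return state, -1.0

def monte_carlo_average_reward(p_action0, n_steps=20000, seed=0):
    """Estimate the average reward of a memoryless stochastic policy.

    The learner cannot tell the two states apart, so the policy is a
    single number: the probability of choosing action 0.
    """
    rng = random.Random(seed)
    state, total = 0, 0.0
    for _ in range(n_steps):
        action = 0 if rng.random() < p_action0 else 1
        state, reward = step(state, action)
        total += reward
    return total / n_steps

deterministic = monte_carlo_average_reward(1.0)  # always action 0
stochastic = monte_carlo_average_reward(0.5)     # coin flip each step
print(deterministic, stochastic)
```

In this toy environment the long-run average reward for a policy that picks action 0 with probability p works out to -(2p - 1)^2, so both deterministic policies approach -1 while the coin-flip policy averages about 0.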
Hyperparameters Evidence and Generalisation for an Unrealisable Rule
Using a statistical mechanical formalism we calculate the evidence, generalisation error and consistency measure for a linear perceptron trained and tested on a set of examples generated by a nonlinear teacher. The teacher is said to be unrealisable because the student can never model it without error. Our model allows us to interpolate between the known case of a linear teacher and an unrealisable, nonlinear teacher. A comparison of the hyperparameters which maximise the evidence with those that optimise the performance measures reveals that, in the nonlinear case, the evidence procedure is a misleading guide to optimising performance. Finally, we explore the extent to which the evidence procedure is unreliable and find that, despite being sub-optimal, in some circumstances it might be a useful method for fixing the hyperparameters. 1 INTRODUCTION The analysis of supervised learning or learning from examples is a major field of research within neural networks.
Higher Order Statistical Decorrelation without Information Loss
Deco, Gustavo, Brauer, Wilfried
A neural network learning paradigm based on information theory is proposed as a way to perform, in an unsupervised fashion, redundancy reduction among the elements of the output layer without loss of information from the sensory input. The model developed performs nonlinear decorrelation up to higher orders of the cumulant tensors and results in probabilistically independent components of the output layer. This means that we do not need to assume a Gaussian distribution either at the input or at the output. The theory presented is related to the unsupervised-learning theory of Barlow, which proposes redundancy reduction as the goal of cognition. When nonlinear units are used, nonlinear principal component analysis is obtained.
Neural Network Ensembles, Cross Validation, and Active Learning
Krogh, Anders, Vedelsby, Jesper
It is well known that a combination of many different predictors can improve predictions. In the neural networks community, "ensembles" of neural networks have been investigated by several authors; see, for instance, [1, 2, 3]. Most often the networks in the ensemble are trained individually and then their predictions are combined. This combination is usually done by majority (in classification) or by simple averaging (in regression), but one can also use a weighted combination of the networks.
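The three combination rules mentioned above (majority, simple averaging, weighted combination) can be sketched as follows. This is a minimal illustration, assuming the individual networks have already produced their predictions. The function names, the example predictions, and the weights are hypothetical and are not taken from the paper.

```python
from collections import Counter

def majority_vote(labels):
    """Combine class labels from several classifiers by majority.

    Ties are broken in favour of the label that appears first.
    """
    return Counter(labels).most_common(1)[0][0]

def simple_average(predictions):
    """Combine regression outputs by an unweighted mean."""
    return sum(predictions) / len(predictions)

def weighted_average(predictions, weights):
    """Weighted combination; weights are normalised to sum to one."""
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total

# Hypothetical outputs of three ensemble members on one input.
votes = ["cat", "dog", "cat"]
outputs = [1.0, 2.0, 4.0]

print(majority_vote(votes))
print(simple_average(outputs))
print(weighted_average(outputs, [0.5, 0.25, 0.25]))
```

How the weights should be chosen is not addressed by this sketch; here they are simply supplied by the caller.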
A solvable connectionist model of immediate recall of ordered lists
A model of short-term memory for serially ordered lists of verbal stimuli is proposed as an implementation of the 'articulatory loop' thought to mediate this type of memory (Baddeley, 1986). The model predicts the presence of a repeatable time-varying 'context' signal coding the timing of items' presentation in addition to a store of phonological information and a process of serial rehearsal. Items are associated with context nodes and phonemes by Hebbian connections showing both short and long term plasticity. Items are activated by phonemic input during presentation and reactivated by context and phonemic feedback during output. Serial selection of items occurs via a winner-take-all interaction amongst items, with the winner subsequently receiving decaying inhibition. An approximate analysis of error probabilities due to Gaussian noise during output is presented. The model provides an explanatory account of the probability of error as a function of serial position, list length, word length, phonemic similarity, temporal grouping, item and list familiarity, and is proposed as the starting point for a model of rehearsal and vocabulary acquisition.
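The serial selection step described above (winner-take-all, with the winner then receiving decaying inhibition) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: activations are fixed numbers rather than being driven by a time-varying context signal, there is no noise, and the function name `recall_sequence` and the parameters `inhibition` and `decay` are hypothetical, not taken from the paper.

```python
def recall_sequence(activations, n_outputs, inhibition=1.0, decay=0.5):
    """Serially select items by winner-take-all with decaying inhibition.

    activations: dict mapping item -> activation (hypothetical values).
    After an item wins, it is inhibited by `inhibition`; on every later
    step each item's inhibition is multiplied by `decay`.
    """
    inhibited = {item: 0.0 for item in activations}
    output = []
    for _ in range(n_outputs):
        # Winner-take-all: pick the item with the highest net activation.
        winner = max(activations, key=lambda i: activations[i] - inhibited[i])
        output.append(winner)
        # Existing inhibition decays, then the winner is suppressed.
        for item in inhibited:
            inhibited[item] *= decay
        inhibited[winner] += inhibition
    return output

# Hypothetical activations for a three-item list.
items = {"A": 0.9, "B": 0.8, "C": 0.7}
print(recall_sequence(items, 3))
```

With these assumed values the items come out in the order A, B, C. Because the inhibition decays, an item that was output earlier can win again on a later step.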
Direction Selectivity In Primary Visual Cortex Using Massive Intracortical Connections
Suarez, Humbert, Koch, Christof, Douglas, Rodney
Almost all models of orientation and direction selectivity in visual cortex are based on feedforward connection schemes, where geniculate input provides all excitation to both pyramidal and inhibitory neurons. The latter neurons then suppress the response of the former for non-optimal stimuli. However, anatomical studies show that up to 90% of the excitatory synaptic input onto any cortical cell is provided by other cortical cells. The massive excitatory feedback nature of cortical circuits is embedded in the canonical microcircuit of Douglas & Martin (1991). Here we investigate, analytically and through biologically realistic simulations, the functioning of a detailed model of this circuitry, operating in a hysteretic mode. In the model, weak geniculate input is dramatically amplified by intracortical excitation, while inhibition has a dual role: (i) to prevent the early geniculate-induced excitation in the null direction and (ii) to restrain excitation and ensure that the neurons fire only when the stimulus is in their receptive field.
Montreal Wrap-Up
Randy Davis announced the appointment of six new program managers at ARPA. At IJCAI-95, Randall Davis assumed the office of president of the American Association for Artificial Intelligence (AAAI). Davis is a professor of electrical engineering and computer science and associate director of the AI Lab at the Massachusetts Institute [...] Davis succeeds Barbara Grosz, Gordon McKay professor of computer science in the Division of Applied Sciences at Harvard University. For many attending the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI-95), the most difficult problem was choosing which session to attend in the rich, varied program. [...] data-mining application from the U.S. Department of the Treasury that identifies potential money laundering to a small mobile LEGO robot [...] for consideration this year," noted Ray Perrault of SRI International, chair of the conference. "This is more than at IJCAI-93 and at recent National Conferences on AI in the [...] rate was under 25 percent, showing that there is a great deal of work going on, and the scientific standard of IJCAI matches or exceeds that of [...]
The Workshop on Computational Dialectics
Surely, scientific arguments have their own special logic. Cavalli-Sforza has for a while been interested in Toulmin's own attempts to apply his work on argument to [...] Still, a full literature search of citations of Rescher's 1977 monograph, Dialectics, reveals no useful formal extension or clarification of the logical system prior to Brewka. [...] They are trivial, that is, when compared to the defeasibility of open-textured concepts, the logic of which remains unanalyzed (says McCarty, [...]
The AI's Half-Century
[...] with AI, for it's difficult to say just [...] good a date as any, however, is 1943--almost exactly half a century ago. In that year, Warren McCulloch (a psychiatrist, cybernetician, philosopher, and poet) and Walter Pitts (a research student in mathematics) [...] result was a heady brew, which explicitly promised to revolutionize psychology and philosophy--and which, in the event, revolutionized technology too. McCulloch and Pitts' paper ("A Logical Calculus of the Ideas Immanent in Nervous Activity") concentrated on how propositions expressible in logic could be computed by [...] Those nets consisted of cells passing inhibitory and excitatory messages between them and acting as what computer scientists (soon afterwards) called "and-[...] "How We Know Universals: The Perception [...] Their first paper made many intellectual waves--which are still spreading, 50 years later. They had claimed that the truth or falsity of any (computable) proposition could, in principle, be computed by a simple type of [...] The future of psychology, they said, consisted of the design of various sorts [...] This novel methodology, and the nascent technology associated with it, promised to show just how mind is grounded in mechanism. [...] Much of this was "logical" in nature and developed into what's known as classical, or symbolic, AI. But some was what is nowadays called connectionist, studying networks [...] In the late 1980s, however, it blossomed--hitting the newsstands with rash promises of "brainlike" computers just around the corner. But both these forms of AI share the same historical roots. So much for pedigree. But does a mere half-century of work count as a pedigree? Might it rather be a mere blip, an unfortunate academic mutation with no real intellectual fitness?
Development of Self-Maintenance Photocopiers
Shimomura, Yoshiki, Tanigawa, Sadao, Umeda, Yasushi, Tomiyama, Tetsuo
Traditional reliability design methods are imperfect because the designed systems aim at fewer faults, but once a fault happens, the systems might fail hard. To solve this problem, we present a self-maintenance machine (SMM), one that can maintain its functions flexibly even though faults occur. To achieve the capabilities of diagnosing and repair planning, a model-based approach that uses qualitative physics was proposed. Regarding the repair-executing capability, a control-type repair strategy was followed. A prototype of the SMM was developed, and it succeeded in maintaining its functions if the structure did not change. However, the prototype revealed the following problems when its reasoning system was used with a commercial product as embedded software: (1) poor performance of the reasoning system, (2) system size that was too large, (3) low adaptability to environmental changes, and (4) roughness of qualitative repair operations. To solve these problems, we proposed a new reasoning method based on virtual cases and fuzzy qualitative values. This methodology is one of knowledge compilation, which gives better reasoning performance and can deal with real-world applications such as the SMM. By using this method, we finally developed a commercial photocopier that has self-maintainability and is more robust against faults. The commercial version has been supplied worldwide as a product of Mita Industrial Co., Ltd., since April 1994.