Bayesian Learning
Learning Planar Ising Models
Johnson, Jason K., Oyen, Diane, Chertkov, Michael, Netrapalli, Praneeth
Inference and learning of graphical models are both well-studied problems in statistics and machine learning that have found many applications in science and engineering. However, exact inference is intractable in general graphical models, which suggests the problem of seeking the best approximation to a collection of random variables within some tractable family of graphical models. In this paper, we focus on the class of planar Ising models, for which exact inference is tractable using techniques of statistical physics. Based on these techniques and recent methods for planarity testing and planar embedding, we propose a simple greedy algorithm for learning the best planar Ising model to approximate an arbitrary collection of binary random variables (possibly from sample data). Given the set of all pairwise correlations among variables, we select a planar graph and optimal planar Ising model defined on this graph to best approximate that set of correlations. We demonstrate our method in simulations and for the application of modeling senate voting records.
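For orientation, the model family can be written out as follows. The abstract does not state the parameterization, so the zero-field form below (pairwise couplings only, spins in {-1, +1}) is an assumption, as are the symbols G, E, θ, and Z. The second line is the standard moment-matching condition for maximum likelihood in an exponential family, which is what "best approximate that set of correlations" amounts to on a fixed graph.

```latex
% Assumed zero-field Ising model on a planar graph G = (V, E)
P_\theta(x) = \frac{1}{Z(\theta)} \exp\Big( \sum_{(i,j) \in E} \theta_{ij}\, x_i x_j \Big),
\qquad x_i \in \{-1, +1\}

% Maximum-likelihood parameters match the given pairwise correlations on the edges of G
\mathbb{E}_{\theta}[x_i x_j] = \hat{\mu}_{ij} \quad \text{for all } (i,j) \in E
```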
Cheaper and Better: Selecting Good Workers for Crowdsourcing
Crowdsourcing provides a popular paradigm for data collection at scale. We study the problem of selecting subsets of workers from a given worker pool to maximize accuracy under a budget constraint. One natural question is whether we should hire as many workers as the budget allows, or restrict ourselves to a small number of top-quality workers. By theoretically analyzing the error rate of a typical crowdsourcing setting, we frame worker selection as a combinatorial optimization problem and propose an algorithm to solve it efficiently. Empirical results on both simulated and real-world datasets show that our algorithm is able to select a small number of high-quality workers, and performs as well as, and sometimes even better than, the much larger crowds that the budget allows.
A scaled gradient projection method for Bayesian learning in dynamical systems
Bonettini, Silvia, Chiuso, Alessandro, Prato, Marco
A crucial task in system identification problems is the selection of the most appropriate model class, which is classically addressed by resorting to cross-validation or to order selection criteria based on asymptotic arguments. As recently suggested in the literature, this can be addressed in a Bayesian framework, where model complexity is regulated by a few hyperparameters, which can be estimated via marginal likelihood maximization. It is thus of primary importance to design effective optimization methods to solve the corresponding optimization problem. If the unknown impulse response is modeled as a Gaussian process with a suitable kernel, the maximization of the marginal likelihood leads to a challenging nonconvex optimization problem, which requires a stable and effective solution strategy. In this paper we address this problem by means of a scaled gradient projection algorithm, in which the scaling matrix and the steplength parameter play a crucial role in providing a meaningful solution in a computational time comparable with second order methods. In particular, we propose both a generalization of the split gradient approach to design the scaling matrix in the presence of box constraints, and an effective implementation of the gradient and objective function. The extensive numerical experiments carried out on several test problems show that our method is very effective, providing solutions in a few tenths of a second with accuracy comparable with state-of-the-art approaches. Moreover, the flexibility of the proposed strategy makes it easily adaptable to a wider range of problems arising in different areas of machine learning, signal processing and system identification.
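As a rough illustration of the kind of iteration the abstract names, the sketch below applies a scaled gradient projection step to a toy box-constrained quadratic. The objective, the fixed diagonal scaling, the fixed steplength, and all function names are assumptions made for illustration; the paper's split-gradient scaling and steplength rules are not reproduced, and no line search is performed.

```python
def project_box(x, lower, upper):
    """Clip each coordinate into [lower_i, upper_i]."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]


def scaled_gradient_projection(grad, x0, scaling, lower, upper, alpha=1.0, iters=50):
    """Iterate x <- P_box(x - alpha * D * grad(x)) with a fixed diagonal D.

    Simplified sketch: fixed steplength and scaling, no line search.
    For a diagonal D and box constraints, the projection in the scaled
    norm is separable and reduces to coordinate-wise clipping.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        step = [xi - alpha * di * gi for xi, di, gi in zip(x, scaling, g)]
        x = project_box(step, lower, upper)
    return x


# Toy objective (hypothetical): f(x) = (x0 - 2)^2 + 10 * (x1 + 1)^2
def toy_grad(x):
    return [2.0 * (x[0] - 2.0), 20.0 * (x[1] + 1.0)]


x_star = scaled_gradient_projection(
    toy_grad,
    x0=[0.5, 0.5],
    scaling=[0.5, 0.05],  # inverse of the diagonal Hessian entries of the toy objective
    lower=[0.0, 0.0],
    upper=[1.0, 1.0],
)
```

On this toy problem the unconstrained minimizer (2, -1) lies outside the box, so the iterate settles on the box corner closest to it. The scaling here is chosen by hand from the known Hessian; in practice the choice of scaling matrix is the part the paper addresses.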
Falling Rule Lists
Falling rule lists are classification models consisting of an ordered list of if-then rules, where (i) the order of rules determines which example should be classified by each rule, and (ii) the estimated probability of success decreases monotonically down the list. These kinds of rule lists are inspired by healthcare applications where patients would be stratified into risk sets and the highest at-risk patients should be considered first. We provide a Bayesian framework for learning falling rule lists that does not rely on traditional greedy decision tree learning methods.
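The two defining properties can be sketched as a small data structure. This is only an illustration of the model form, not of the Bayesian learning procedure; the `Rule` class, the helper names, and the example rules and probabilities are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict[str, float]], bool]
    probability: float  # estimated probability of the positive outcome


def is_falling(rules: List[Rule]) -> bool:
    """Property (ii): probabilities never increase going down the list."""
    return all(a.probability >= b.probability for a, b in zip(rules, rules[1:]))


def predict(rules: List[Rule], example: Dict[str, float]) -> Rule:
    """Property (i): the first rule whose condition holds classifies the example."""
    for rule in rules:
        if rule.condition(example):
            return rule
    raise ValueError("rule list needs a final catch-all rule")


# Hypothetical rules and probabilities, for illustration only.
rules = [
    Rule("age >= 75 and prior_event",
         lambda p: p["age"] >= 75 and p["prior_event"] == 1, 0.60),
    Rule("prior_event", lambda p: p["prior_event"] == 1, 0.35),
    Rule("otherwise", lambda p: True, 0.05),
]
```

For example, `predict(rules, {"age": 80, "prior_event": 1})` returns the first rule, so the highest-risk stratum is matched before any lower-risk one.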
Microscopic Advances with Large-Scale Learning: Stochastic Optimization for Cryo-EM
Punjani, Ali, Brubaker, Marcus A.
Determining the 3D structures of biological molecules is a key problem for both biology and medicine. Electron Cryomicroscopy (Cryo-EM) is a promising technique for structure estimation which relies heavily on computational methods to reconstruct 3D structures from 2D images. This paper introduces the challenging Cryo-EM density estimation problem as a novel application for stochastic optimization techniques. Structure discovery is formulated as MAP estimation in a probabilistic latent-variable model, resulting in an optimization problem to which an array of seven stochastic optimization methods is applied. The methods are tested on both real and synthetic data, with some methods recovering reasonable structures in less than one epoch from a random initialization. Complex quasi-Newton methods are found to converge more slowly than simple gradient-based methods, but all stochastic methods are found to converge to similar optima. This approach represents a major improvement over existing methods, as it is significantly faster and is able to converge from a random initialization.
EXPERT SYSTEMS AND AI APPLICATIONS
ABSTRACT Prospector is a computer consultant system intended to aid geologists in evaluating the favorability of an exploration site or region for occurrences of ore deposits of particular types. Knowledge about a particular type of ore deposit is encoded in a computational model representing observable geological features and the relative significance thereof. We describe the form of models in Prospector, focussing on inference networks of geological assertions and the Bayesian propagation formalism used to represent the judgmental reasoning process of the economic geologist who serves as model designer. Following the initial design of a model, simple performance evaluation techniques are used to assess the extent to which the performance of the model faithfully reflects the intent of the model designer. These results identify specific portions of the model that might benefit from "fine tuning", and establish priorities for such revisions. This description of the Prospector system and the model design process serves to illustrate the process of transferring human expertise about a subjective domain into a mechanical realization. I. INTRODUCTION In an increasingly complex and specialized world, human expertise about diverse subjects spanning scientific, economic, social, and political issues plays an increasingly important role in the functioning of all kinds of organizations. Although computers have become indispensable tools in many endeavors, we continue to rely heavily on the human expert's ability to identify and synthesize diverse factors, to form judgments, evaluate alternatives, and make decisions -- in sum, to apply his or her years of experience to the problem at hand. This is especially valid with regard to domains that are not easily amenable to precise scientific formulations, i.e., to domains in which experience and subjective judgment play a major role.
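The abstract does not spell out the propagation rule. As a hedged illustration of how Bayesian updating along a single evidence-to-hypothesis link can work, the sketch below uses the odds form of Bayes' rule, in which observing the evidence multiplies the prior odds by a likelihood ratio. The function names and the numbers are hypothetical and are not taken from any Prospector model.

```python
def prob_to_odds(p: float) -> float:
    """Convert a probability in [0, 1) to odds."""
    return p / (1.0 - p)


def odds_to_prob(odds: float) -> float:
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)


def update(prior: float, likelihood_ratio: float) -> float:
    """Posterior P(H | E) from prior P(H) and the ratio P(E | H) / P(E | not H)."""
    return odds_to_prob(likelihood_ratio * prob_to_odds(prior))


# Hypothetical link: the evidence is three times as likely if the hypothesis holds.
posterior = update(prior=0.5, likelihood_ratio=3.0)
```

A ratio above 1 raises belief in the hypothesis, a ratio below 1 lowers it, and a ratio of exactly 1 leaves the prior unchanged, which gives a model designer one number per link to express the significance of a feature.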
SESSION 1 PAPER CONDITIONAL PROBABILITY COMPUTING IN A NERVOUS SYSTEM
Dr. Uttley took an Honours degree in Mathematics at King's College, London, where he also took a degree in Psychology and did post-graduate research in Visual Perception. At the Royal Radar Establishment he designed and built analogue and digital computers. For the last five years Dr. Uttley has been working on theories of computing in the nervous system. ABSTRACT In two previous papers it has been suggested that two particular mathematical principles may underlie the organization of nervous systems; the first is that of classification (Uttley, 1954, ref. 13) and the second is that of conditional probability. The suggestion is based on the similarity of behaviour of these formal systems and of animals. The design of classification computers is discussed in the first paper; the design of conditional probability computers is discussed in a third paper (Uttley, 1958, ref. 15); in both papers working models are described. Further reference to these papers will be by date only. It is the aim of the present paper to consider whether the two principles might operate in nervous systems. There are four requirements for the principle of classification to operate in an area of a nervous system. Firstly, in that area, signalling must be binary; this would be the case if, for example, the impulse frequency were at either a very low rate or at a maximal rate, or if signalling were in terms of standard volleys; in general, if the fibre activity were in one of only two states. The second requirement is that the fibres which form the input to the area be connected to neurons in as many different ways as possible; there are many areas in which this condition is met. The third requirement is that more than one synapse of a neuron must become active for it to fire; this appears to be met. The fourth requirement is that there shall be some way of delaying signals for periods of the order of seconds.
A block of isolated cortex does remain active for such periods when stimulated briefly, so in this way the requirement might be met. If these conditions are all met, each neuron will indicate, by firing, the occurrence of a particular spatio-temporal pattern of activity in the input to the system.
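The first three requirements can be read loosely as the following sketch: inputs are binary, there is one unit for every combination of two or more inputs, and a unit fires only when all of its inputs are active. This is an interpretation for illustration, not Uttley's design; the function names are hypothetical, and the fourth requirement (signal delay, which gives the temporal part of the pattern) is left out.

```python
from itertools import combinations


def build_units(n_inputs, min_synapses=2):
    """One unit per subset of inputs with at least `min_synapses` members."""
    units = []
    for size in range(min_synapses, n_inputs + 1):
        units.extend(combinations(range(n_inputs), size))
    return units


def firing_units(units, signal):
    """Units whose inputs are all active (1) in the binary signal."""
    return [u for u in units if all(signal[i] == 1 for i in u)]


units = build_units(3)                    # (0, 1), (0, 2), (1, 2), (0, 1, 2)
active = firing_units(units, [1, 1, 0])   # only the unit wired to inputs 0 and 1
```

Each firing unit then stands for the occurrence of one particular spatial pattern in the input; the number of units grows exponentially with the number of inputs, which is the cost of connecting fibres "in as many different ways as possible".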
MACHINE INTELLIGENCE 13
The two outstanding figures in the history of computer science are Alan Turing and John von Neumann, and they shared the view that logic was the key to understanding and automating computation. In particular, it was Turing who gave us in the mid-1930s the fundamental analysis, and the logical definition, of the concept of 'computability by machine' and who discovered the surprising and beautiful basic fact that there exist universal machines which by suitable programming can be made to † This essay is an expanded and revised version of one entitled The Role of Logic in Computer Science and Artificial Intelligence, which was completed in January 1992 (and was later published in the Proceedings of the Fifth Generation Computer Systems 1992 Conference). Since completing that essay I have had the benefit of extremely helpful discussions on many of the details with Professor Donald Michie and Professor I. J. Good, both of whom knew Turing well during the war years at Bletchley Park. Professor J. A. N. Lee, whose knowledge of the literature and archives of the history of computing is encyclopedic, also provided additional information, some of which is still unpublished. Further light has very recently been shed on the von Neumann side of the story by Norman Macrae's excellent biography John von Neumann (Macrae 1992). Accordingly, it seemed appropriate to undertake a more complete and thorough version of the FGCS'92 essay, focussing somewhat more on the interesting historical and biographical issues. I am grateful to Donald Michie and Stephen Muggleton for inviting me to contribute such a 'second edition' to the present volume, and I would also like to thank the Institute for New Computer Technology (ICOT) for kind permission to make use of the FGCS'92 essay in this way. 1 LOGIC, COMPUTERS, TURING, AND VON NEUMANN