Information Technology
A Framework for Multiple-Instance Learning
Maron, Oded, Lozano-Pérez, Tomás
Multiple-instance learning is a variation on supervised learning, where the task is to learn a concept given positive and negative bags of instances. Each bag may contain many instances, but a bag is labeled positive even if only one of the instances in it falls within the concept. A bag is labeled negative only if all the instances in it are negative. We describe a new general framework, called Diverse Density, for solving multiple-instance learning problems. We apply this framework to learn a simple description of a person from a series of images (bags) containing that person, to a stock selection problem, and to the drug activity prediction problem.
On Efficient Heuristic Ranking of Hypotheses
Chien, Steve A., Stechert, Andre, Mutz, Darren
Voice: (818) 306-6144 FAX: (818) 306-6912 Content Areas: Applications (Stochastic Optimization),Model Selection Algorithms Abstract This paper considers the problem of learning the ranking of a set of alternatives based upon incomplete information (e.g., a limited number of observations). We describe two algorithms for hypothesis ranking and their application for probably approximately correct (PAC) and expected loss (EL) learning criteria. Empirical results are provided to demonstrate the effectiveness of these ranking procedures on both synthetic datasets and real-world data from a spacecraft design optimization problem. 1 INTRODUCTION In many learning applications, the cost of information can be quite high, imposing a requirement that the learning algorithms glean as much usable information as possible with a minimum of data. For example: - In speedup learning, the expense of processing each training example can be significant [Tadepalli921. This paper provides a statistical decision-theoretic framework for the ranking of parametric distributions.
Phase Transitions and the Perceptual Organization of Video Sequences
Estimating motion in scenes containing multiple moving objects remains a difficult problem in computer vision. A promising approach tothis problem involves using mixture models, where the motion of each object is a component in the mixture. However, existing methodstypically require specifying in advance the number of components in the mixture, i.e. the number of objects in the scene.
New Approximations of Differential Entropy for Independent Component Analysis and Projection Pursuit
We derive a first-order approximation of the density of maximum entropy for a continuous 1-D random variable, given a number of simple constraints. This results in a density expansion which is somewhat similar to the classical polynomial density expansions by Gram-Charlier and Edgeworth. Using this approximation of density, an approximation of 1-D differential entropy is derived. The approximation of entropy is both more exact and more robust againstoutliers than the classical approximation based on the polynomial density expansions, without being computationally more expensive. The approximation has applications, for example, in independent component analysis and projection pursuit. 1 Introduction The basic information-theoretic quantity for continuous one-dimensional random variables is differential entropy. The differential entropy H of a scalar random variable X with density f(x) is defined as H(X) - / f(x) log f(x)dx.
Reinforcement Learning for Call Admission Control and Routing in Integrated Service Networks
Marbach, Peter, Mihatsch, Oliver, Schulte, Miriam, Tsitsiklis, John N.
We provide a model of the standard watermaze task, and of a more challenging task involving novel platform locations, in which rats exhibit one-trial learning after a few days of training. The model uses hippocampal place cells to support reinforcement learning, and also, in an integrated manner, to build and use allocentric coordinates. 1 INTRODUCTION
On the Separation of Signals from Neighboring Cells in Tetrode Recordings
Sahani, Maneesh, Pezaris, John S., Andersen, Richard A.
We discuss a solution to the problem of separating waveforms produced by multiple cells in an extracellular neural recording. We take an explicitly probabilistic approach, using latent-variable models of varying sophistication to describe the distribution of waveforms produced by a single cell. The models range from a single Gaussian distribution of waveforms for each cell to a mixture of hidden Markov models. We stress the overall statistical structure of the approach, allowing the details of the generative model chosen to depend on the specific neural preparation.
Comparison of Human and Machine Word Recognition
Schenkel, Markus, Latimer, Cyril, Jabri, Marwan A.
We present a study which is concerned with word recognition rates for heavily degraded documents. We compare human with machine reading capabilities in a series of experiments, which explores the interaction of word/non-word recognition, word frequency and legality of non-words with degradation level. We also study the influence of character segmentation, and compare human performance with that of our artificial neural network model for reading. We found that the proposed computer model uses word context as efficiently as humans, but performs slightly worse on the pure character recognition task. 1 Introduction Optical Character Recognition (OCR) of machine-print document images ·has matured considerably during the last decade. Recognition rates as high as 99.5% have been reported on good quality documents. However, for lower image resolutions (200 Dpl and below), noisy images, images with blur or skew, the recognition rate declines considerably. In bad quality documents, character segmentation is as big a problem as the actual character recognition.
Data-Dependent Structural Risk Minimization for Perceptron Decision Trees
Shawe-Taylor, John, Cristianini, Nello
This paper presents a neural-model of pre-attentive visual processing. The model explains why certain displays can be processed very fast, "in parallel", while others require slower, "serial" processing, in subsequent attentional systems. Our approach stems from the observation that the visual environment is overflowing with diverse information, but the biological information-processing systems analyzing it have a limited capacity [1]. This apparent mismatch suggests that data compression should be performed at an early stage of perception, and that via an accompanying process of dimension reduction, only a few essential features of the visual display should be retained. We propose that only parallel displays incorporate global features that enable fast target detection, and hence they can be processed pre-attentively, with all items (target and dis tractors) examined at once.