AITopics | Technology

We consider the problem of parameter estimation using weakly supervised datasets, where a training sample consists of the input and a partially specified annotation, which we refer to as the output. The missing information in the annotation is modeled using latent variables. Previous methods overburden a single distribution with two separate tasks: (i) modeling the uncertainty in the latent variables during training; and (ii) making accurate predictions for the output and the latent variables during testing. We propose a novel framework that separates the demands of the two tasks using two distributions: (i) a conditional distribution to model the uncertainty of the latent variables for a given input-output pair; and (ii) a delta distribution to predict the output and the latent variables for a given input. During learning, we encourage agreement between the two distributions by minimizing a loss-based dissimilarity coefficient. Our approach generalizes latent SVM in two important ways: (i) it models the uncertainty over latent variables instead of relying on a pointwise estimate; and (ii) it allows the use of loss functions that depend on latent variables, which greatly increases its applicability. We demonstrate the efficacy of our approach on two challenging problems---object detection and action detection---using publicly available datasets.

artificial intelligence, latent variable, machine learning, (15 more...)

1206.4636

Country:

Europe > France (0.05)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)

Eban, Elad, Birnbaum, Aharon, Shalev-Shwartz, Shai, Globerson, Amir

Learning the Experts for Online Sequence Prediction

Online sequence prediction is the problem of predicting the next element of a sequence given previous elements. This problem has been extensively studied in the context of individual sequence prediction, where no prior assumptions are made on the origin of the sequence. Individual sequence prediction algorithms work quite well for long sequences, where the algorithm has enough time to learn the temporal structure of the sequence. However, they might give poor predictions for short sequences. A possible remedy is to rely on the general model of prediction with expert advice, where the learner has access to a set of $r$ experts, each of which makes its own predictions on the sequence. It is well known that it is possible to predict almost as well as the best expert if the sequence length is order of $\log(r)$. But, without firm prior knowledge on the problem, it is not clear how to choose a small set of {\em good} experts. In this paper we describe and analyze a new algorithm that learns a good set of experts using a training set of previously observed sequences. We demonstrate the merits of our approach by applying it on the task of click prediction on the web.

artificial intelligence, machine learning, sequence, (18 more...)

1206.4604

Country:

Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Ravanbakhsh, Siamak, Yu, Chun-Nam, Greiner, Russell

A Generalized Loop Correction Method for Approximate Inference in Graphical Models

Belief Propagation (BP) is one of the most popular methods for inference in probabilistic graphical models. BP is guaranteed to return the correct answer for tree structures, but can be incorrect or non-convergent for loopy graphical models. Recently, several new approximate inference algorithms based on cavity distribution have been proposed. These methods can account for the effect of loops by incorporating the dependency between BP messages. Alternatively, region-based approximations (that lead to methods such as Generalized Belief Propagation) improve upon BP by considering interactions within small clusters of variables, thus taking small loops within these clusters into account. This paper introduces an approach, Generalized Loop Correction (GLC), that benefits from both of these types of loop correction. We show how GLC relates to these two families of inference methods, then provide empirical evidence that GLC works effectively in general, and can be significantly more accurate than both correction schemes.

artificial intelligence, cavity distribution, graph, (15 more...)

1206.4654

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

Latent Collaborative Retrieval

Weston, Jason, Wang, Chong, Weiss, Ron, Berenzweig, Adam

Retrieval tasks typically require a ranking of items given a query. Collaborative filtering tasks, on the other hand, learn to model user's preferences over items. In this paper we study the joint problem of recommending items to a user with respect to a given query, which is a surprisingly common task. This setup differs from the standard collaborative filtering one in that we are given a query x user x item tensor for training instead of the more traditional user x item matrix. Compared to document retrieval we do have a query, but we may or may not have content features (we will consider both cases) and we can also take account of the user's profile. We introduce a factorized model for this new task that optimizes the top-ranked items returned for the given query and user. We report empirical results where it outperforms several baselines.

artificial intelligence, machine learning, natural language, (16 more...)

1206.4603

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry:

Leisure & Entertainment (0.68)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.72)

Machine Learning that Matters

Wagstaff, Kiri

Much of current machine learning (ML) research has lost its connection to problems of import to the larger world of science and society. From this perspective, there exist glaring limitations in the data sets we investigate, the metrics we employ for evaluation, and the degree to which results are communicated back to their originating domains. What changes are needed to how we conduct research to increase the impact that ML has? We present six Impact Challenges to explicitly focus the field's energy and attention, and we discuss existing obstacles that must be addressed. We aim to inspire ongoing discussion and focus on ML that matters.

application, artificial intelligence, machine learning, (15 more...)

1206.4656

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (1.00)
Government (0.69)
Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

Takabatake, Kazuya, Akaho, Shotaro

Constraint-free Graphical Model with Fast Learning Algorithm

arXiv.org Machine LearningJun-17-2012

In this paper, we propose a simple, versatile model for learning the structure and parameters of multivariate distributions from a data set. Learning a Markov network from a given data set is not a simple problem, because Markov networks rigorously represent Markov properties, and this rigor imposes complex constraints on the design of the networks. Our proposed model removes these constraints, acquiring important aspects from the information geometry. The proposed parameter- and structure-learning algorithms are simple to execute as they are based solely on local computation at each node. Experiments demonstrate that our algorithms work appropriately.

artificial intelligence, firing process, machine learning, (18 more...)

1206.3721

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report (0.40)

Industry:

Automobiles & Trucks > Manufacturer (0.47)
Banking & Finance > Trading (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)

arXiv.org Machine LearningJun-15-2012

On the Cover-Hart Inequality: What's a Sample of Size One Worth?

Gneiting, Tilmann

Bob predicts a future observation based on a sample of size one. Alice can draw a sample of any size before issuing her prediction. How much better can she do than Bob? Perhaps surprisingly, under a large class of loss functions, which we refer to as the Cover-Hart family, the best Alice can do is to halve Bob's risk. In this sense, half the information in an infinite sample is contained in a sample of size one. The Cover-Hart family is a convex cone that includes metrics and negative definite functions, subject to slight regularity conditions. These results may help explain the small relative differences in empirical performance measures in applied classification and forecasting problems, as well as the success of reasoning and learning by analogy in general, and nearest neighbor techniques in particular.

artificial intelligence, loss function, machine learning, (17 more...)

1206.3381

Country: North America > United States > New York (0.05)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Calimeri, Francesco, Ianni, Giovambattista, Ricca, Francesco

The third open Answer Set Programming competition

arXiv.org Artificial IntelligenceJun-14-2012

Answer Set Programming (ASP) is a well-established paradigm of declarative programming in close relationship with other declarative formalisms such as SAT Modulo Theories, Constraint Handling Rules, FO(.), PDDL and many others. Since its first informal editions, ASP systems have been compared in the now well-established ASP Competition. The Third (Open) ASP Competition, as the sequel to the ASP Competitions Series held at the University of Potsdam in Germany (2006-2007) and at the University of Leuven in Belgium in 2009, took place at the University of Calabria (Italy) in the first half of 2011. Participants competed on a pre-selected collection of benchmark problems, taken from a variety of domains as well as real world applications. The Competition ran on two tracks: the Model and Solve (M&S) Track, based on an open problem encoding, and open language, and open to any kind of system based on a declarative specification paradigm; and the System Track, run on the basis of fixed, public problem encodings, written in a standard ASP language. This paper discusses the format of the Competition and the rationale behind it, then reports the results for both tracks. Comparison with the second ASP competition and state-of-the-art solutions for some of the benchmark domains is eventually discussed. To appear in Theory and Practice of Logic Programming (TPLP).

artificial intelligence, logic & formal reasoning, planning & scheduling, (18 more...)

doi: 10.1017/S1471068412000105

1206.3111

Country:

Europe > Italy > Calabria (0.24)
Europe > Germany > Brandenburg > Potsdam (0.24)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.24)
(3 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

arXiv.org Machine LearningJun-14-2012

Statistical Consistency of Finite-dimensional Unregularized Linear Classification

Telgarsky, Matus

Binary linear classification operates as follows: obtain a new instance, determine a set of real-valued features, form their weighted combination, and output a label which is positive iff this combination is nonnegative. The interpretability, empirical performance, and theoretical depth of this scheme have all contributed to its continued popularity (Freund and Schapire, 1997, Friedman et al., 2000, Caruana and Niculescu-Mizil, 2006). In order to obtain the coefficients in the above weighting, convex optimization is typically employed. Specifically, rather than just trying to pick the weighting which makes the fewest mistakes over a finite sample -- which is computationally intractable -- consider instead paying attention to the amount by which these combinations clear the zero threshold, a quantity called the margin. Applying a convex penalty to these margins yields a convex optimization procedure, specifically one which can be specialized into both logistic regression and AdaBoost.

artificial intelligence, machine learning, span, (18 more...)

1206.3072

Country:

Asia > Middle East > Lebanon (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.90)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Kadri, Hachem, Rakotomamonjy, Alain, Bach, Francis, Preux, Philippe

Multiple Operator-valued Kernel Learning

arXiv.org Machine LearningJun-14-2012

Positive definite operator-valued kernels generalize the well-known notion of reproducing kernels, and are naturally adapted to multi-output learning situations. This paper addresses the problem of learning a finite linear combination of infinite-dimensional operator-valued kernels which are suitable for extending functional data analysis methods to nonlinear contexts. We study this problem in the case of kernel ridge regression for functional responses with an lr-norm constraint on the combination coefficients. The resulting optimization problem is more involved than those of multiple scalar-valued kernel learning since operator-valued kernels pose more technical and theoretical issues. We propose a multiple operator-valued kernel learning algorithm based on solving a system of linear operator equations by using a block coordinatedescent procedure. We experimentally validate our approach on a functional regression task in the context of finger movement prediction in brain-computer interfaces.

artificial intelligence, kernel, machine learning, (17 more...)

1203.1596

Country:

Europe > France > Normandy > Seine-Maritime > Rouen (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)