
Collaborating Authors

 Zemel, Richard S.


Collaborative Ranking With 17 Parameters

Neural Information Processing Systems

The primary application of collaborative filtering (CF) is to recommend a small set of items to a user, which entails ranking. Most approaches, however, formulate the CF problem as rating prediction, overlooking the ranking perspective. In this work we present a method for collaborative ranking that leverages the strengths of the two main CF approaches, neighborhood- and model-based. Our novel method is highly efficient, with only seventeen parameters to optimize and a single hyperparameter to tune, and beats the state-of-the-art collaborative ranking methods. We also show that parameters learned on datasets from one item domain yield excellent results on a dataset from a very different item domain, without any retraining.
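As a rough illustration of how a ranking model can get by with so few parameters, the sketch below scores a (user, item) pair with a tiny linear model over hand-crafted neighborhood statistics, so the same handful of weights is shared across all users and items. Everything here (the features, the 0-means-missing convention, the function names) is hypothetical rather than the paper's actual construction; the weights would be fit with a pairwise ranking loss.

import numpy as np

def neighborhood_features(R, u, i, k=20):
    """Hypothetical hand-crafted features: statistics of the ratings that the
    k users most similar to u (cosine similarity on the rating rows) gave
    item i. R is a dense user-by-item matrix with 0 meaning "missing"."""
    sims = R @ R[u] / (np.linalg.norm(R, axis=1) * np.linalg.norm(R[u]) + 1e-9)
    sims[u] = -np.inf                      # exclude the user themself
    neigh = np.argsort(sims)[-k:]          # indices of the k most similar users
    r = R[neigh, i]
    rated = r > 0
    if not rated.any():
        return np.zeros(3)
    return np.array([r[rated].mean(), r[rated].std(), rated.mean()])

def score(R, u, i, w, b):
    """Tiny linear scorer over neighborhood features -- a handful of shared
    parameters reused for every (user, item) pair, in the spirit of a model
    that needs only ~17 numbers rather than per-user/item factors."""
    return neighborhood_features(R, u, i) @ w + b

Because w and b are shared globally, the parameter count is independent of the number of users and items, which is what makes such models cheap to train and plausible to transfer across item domains without retraining.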


Efficient Parametric Projection Pursuit Density Estimation

arXiv.org Machine Learning

Product models of low-dimensional experts are a powerful way to avoid the curse of dimensionality. We present the "under-complete product of experts" (UPoE), where each expert models a one-dimensional projection of the data. The UPoE is fully tractable and may be interpreted as a parametric probabilistic model for projection pursuit. Its maximum likelihood learning rules are identical to the approximate learning rules proposed before for under-complete ICA. We also derive an efficient sequential learning algorithm and discuss its relationship to projection pursuit density estimation and to feature induction algorithms for additive random field models.
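A minimal sketch of the density being described, assuming only that each expert is a one-dimensional model applied to a linear projection of the data; the normalizer over x is omitted, and the Gaussian experts in the usage example are just one convenient choice:

import numpy as np

def upoe_unnorm_logpdf(X, W, expert_logpdf):
    """Unnormalized log-density of an under-complete product of experts:
    each of K < D experts models one linear projection w_k^T x of the data.
    `expert_logpdf(k, z)` returns the k-th expert's 1-D log-density at z
    (a hypothetical callback; the normalizing constant over x is omitted)."""
    Z = X @ W.T                      # (N, K) projections, with K < D
    return sum(expert_logpdf(k, Z[:, k]) for k in range(W.shape[0]))

# Example: zero-mean Gaussian experts with per-expert precisions lam[k]
lam = np.array([1.0, 4.0])
W = np.random.randn(2, 10)           # K=2 experts on D=10 dimensional data
gauss = lambda k, z: 0.5 * (np.log(lam[k]) - lam[k] * z**2 - np.log(2 * np.pi))
X = np.random.randn(5, 10)
print(upoe_unnorm_logpdf(X, W, gauss))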


Active Collaborative Filtering

arXiv.org Machine Learning

Collaborative filtering (CF) allows the preferences of multiple users to be pooled to make recommendations regarding unseen products. We consider in this paper the problem of online and interactive CF: given the current ratings associated with a user, which queries (new ratings) would most improve the quality of the recommendations made? We cast this in terms of expected value of information (EVOI), but the online computational cost of computing optimal queries is prohibitive. We show how offline prototyping and computation of bounds on EVOI can be used to dramatically reduce the required online computation. The framework we develop is general, but we focus on derivations and empirical study in the specific case of the multiple-cause vector quantization model.
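To make the EVOI computation concrete, here is a toy myopic version with a discrete latent user-type model; all the data structures (type posterior, per-type rating distributions, per-type item utilities) are hypothetical stand-ins for whatever model is actually used:

import numpy as np

def evoi(prior, rate_dist, item_values, q):
    """Myopic expected value of information for querying item q.

    prior:       (T,)     posterior over latent user types given current ratings
    rate_dist:   (T,I,V)  P(rating v | type t, item i)
    item_values: (T,I)    expected utility of recommending item i to type t
    """
    # current value: recommend the item with the highest expected utility
    cur = (prior @ item_values).max()
    # marginal distribution over the answer to query q
    p_ans = prior @ rate_dist[:, q, :]                  # shape (V,)
    val = 0.0
    for v, pv in enumerate(p_ans):
        if pv == 0:
            continue
        post = prior * rate_dist[:, q, v] / pv          # Bayes update on the answer
        val += pv * (post @ item_values).max()
    return val - cur

# Pick the single best query by exhaustive search over items:
# best_q = max(range(I), key=lambda q: evoi(prior, rate_dist, item_values, q))

The expensive part is that this must be evaluated for every candidate query at interaction time, which is exactly the online cost that the paper's offline prototyping and EVOI bounds are designed to avoid.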


Fast Exact Inference for Recursive Cardinality Models

arXiv.org Machine Learning

Cardinality potentials are a generally useful class of high-order potentials that affect probabilities based on how many of D binary variables are active. Maximum a posteriori (MAP) inference for cardinality potential models is well understood, with efficient computations taking O(D log D) time. Yet efficient marginalization and sampling have not been addressed as thoroughly in the machine learning community. We show that there exists a simple algorithm for computing marginal probabilities and drawing exact joint samples that runs in O(D log^2 D) time, and we show how to frame the algorithm as efficient belief propagation in a low-order tree-structured model that includes additional auxiliary variables. We then develop a new, more general class of models, termed Recursive Cardinality models, which take advantage of this efficiency. Finally, we show how to do efficient exact inference in models composed of a tree structure and a cardinality potential. We explore the expressive power of Recursive Cardinality models and empirically demonstrate their utility.
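The count distribution at the heart of such models can be computed by convolving per-variable Bernoulli distributions in a balanced tree; with FFT-based convolutions that tree costs O(D log^2 D). The sketch below uses that routine plus a deliberately naive O(D^2) leave-one-out loop for the marginals, just to make the quantities concrete; the paper's tree-structured belief propagation gets the marginals at the faster rate:

import numpy as np

def bernoulli_count_dist(probs):
    """Exact distribution of the sum of independent Bernoulli(p_i) variables,
    computed by pairwise convolution in a balanced tree. np.convolve is
    quadratic per merge; FFT-based convolutions make the tree O(D log^2 D)."""
    layer = [np.array([1.0 - p, p]) for p in probs]
    while len(layer) > 1:
        nxt = [np.convolve(layer[i], layer[i + 1])
               for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2:
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]

def cardinality_marginals(probs, phi):
    """Marginals p(x_i = 1) under p(x) proportional to
    prod_i p_i^{x_i} (1-p_i)^{1-x_i} * phi(sum_i x_i), with phi of length D+1.
    Naive O(D^2) leave-one-out version for clarity only."""
    D = len(probs)
    marg = np.empty(D)
    for i in range(D):
        others = bernoulli_count_dist(np.delete(probs, i))  # counts of the other D-1 vars
        on = probs[i] * np.dot(others, phi[1:D + 1])        # x_i = 1 shifts the count by 1
        off = (1 - probs[i]) * np.dot(others, phi[:D])
        marg[i] = on / (on + off)
    return marg

probs = np.array([0.2, 0.7, 0.5, 0.9])
phi = np.exp(-0.5 * (np.arange(5) - 2.0) ** 2)  # soft preference for ~2 active variables
print(cardinality_marginals(probs, phi))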


Collaborative Filtering and the Missing at Random Assumption

arXiv.org Machine Learning

Rating prediction is an important application and a popular research topic in collaborative filtering. However, both the validity of learning algorithms and the validity of standard testing procedures rest on the assumption that missing ratings are missing at random (MAR). In this paper we present the results of a user study in which we collect a random sample of ratings from current users of an online radio service. An analysis of the rating data collected in the study shows that the sample of random ratings has markedly different properties than ratings of user-selected songs. When asked to report on their own rating behaviour, a large number of users indicate they believe their opinion of a song does affect whether they choose to rate that song, a violation of the MAR condition. Finally, we present experimental results showing that incorporating an explicit model of the missing data mechanism can lead to significant improvements in prediction performance on the random sample of ratings.
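The modeling idea is to stop treating the observation pattern as ignorable and write the joint as p(r, o) = p(r) p(o | r). A toy multinomial version of the resulting observed-data log-likelihood for one user is sketched below; theta, mu, and the whole setup are illustrative, not the study's actual model:

import numpy as np

def user_loglik(theta, mu, observed_ratings, n_missing):
    """Observed-data log-likelihood for one user under a toy non-random
    missingness model: theta[v] = P(true rating = v) and mu[v] = P(a rating
    with value v gets observed). `observed_ratings` holds the value indices
    of the ratings the user chose to provide."""
    ll = sum(np.log(theta[v] * mu[v]) for v in observed_ratings)
    # a missing entry could hide any rating value; marginalize it out
    ll += n_missing * np.log(np.dot(theta, 1.0 - mu))
    return ll

Under MAR, mu is a constant and drops out of inference about theta; when the probability of observing a rating depends on its value, ignoring mu biases the learned rating model.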


Flexible Priors for Exemplar-based Clustering

arXiv.org Machine Learning

Exemplar-based clustering methods have been shown to produce state-of-the-art results on a number of synthetic and real-world clustering problems. They are appealing because they offer computational benefits over latent-mean models and can handle arbitrary pairwise similarity measures between data points. However, when trying to recover underlying structure in clustering problems, tailored similarity measures are often not enough; we also desire control over the distribution of cluster sizes. Priors such as Dirichlet process priors allow the number of clusters to be unspecified while expressing priors over data partitions. To our knowledge, they have not been applied to exemplar-based models. We show how to incorporate priors, including Dirichlet process priors, into the recently introduced affinity propagation algorithm. We develop an efficient max-product belief propagation algorithm for our new model and demonstrate experimentally how the expanded range of clustering priors allows us to better recover true clusterings in situations where we have some information about the generating process.
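For context, here is the standard max-product affinity propagation loop (Frey & Dueck, 2007) that the paper builds on. The diagonal of the similarity matrix holds the exemplar preferences, which is the only knob this vanilla version exposes over cluster sizes; the paper's contribution is replacing that knob with explicit priors, such as Dirichlet process priors, inside the model:

import numpy as np

def affinity_propagation(S, damping=0.9, n_iters=200):
    """Vanilla affinity propagation on a similarity matrix S whose diagonal
    holds the exemplar preferences. Returns each point's chosen exemplar."""
    n = S.shape[0]
    R = np.zeros((n, n))   # responsibilities r(i, k)
    A = np.zeros((n, n))   # availabilities a(i, k)
    for _ in range(n_iters):
        # responsibilities: evidence for k being i's exemplar vs. the best rival
        M = A + S
        idx = np.argmax(M, axis=1)
        first = M[np.arange(n), idx]
        M[np.arange(n), idx] = -np.inf
        second = M.max(axis=1)
        Rnew = S - first[:, None]
        Rnew[np.arange(n), idx] = S[np.arange(n), idx] - second
        R = damping * R + (1 - damping) * Rnew
        # availabilities: support from other points that also choose k
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        Anew = Rp.sum(axis=0, keepdims=True) - Rp
        dA = Anew.diagonal().copy()
        Anew = np.minimum(Anew, 0)
        np.fill_diagonal(Anew, dA)
        A = damping * A + (1 - damping) * Anew
    return np.argmax(A + R, axis=1)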


A Framework for Optimizing Paper Matching

arXiv.org Artificial Intelligence

At the heart of many scientific conferences is the problem of matching submitted papers to suitable reviewers. Arriving at a good assignment is a major and important challenge for any conference organizer. In this paper we propose a framework to optimize paper-to-reviewer assignments. Our framework uses suitability scores to measure pairwise affinity between papers and reviewers. We show how learning can be used to infer suitability scores from a small set of provided scores, thereby reducing the burden on reviewers and organizers. We frame the assignment problem as an integer program and propose several variations for the paper-to-reviewer matching domain. We also explore how learning and matching interact. Experiments on two conference data sets examine the performance of several learning methods as well as the effectiveness of the matching formulations.
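The plain version of the matching program is small enough to state directly: maximize total suitability subject to each paper receiving a fixed number of reviewers and each reviewer staying under a load cap. The sketch below solves the LP relaxation with scipy; for these bipartite b-matching constraints the relaxation has an integral optimum, so no rounding heuristic is needed. The paper's variations add further constraints on top of this skeleton:

import numpy as np
from scipy.optimize import linprog

def match_papers(S, reviewers_per_paper=3, max_load=6):
    """Basic paper matching: S is a (papers x reviewers) suitability matrix,
    decision variable x[p, r] indicates assigning reviewer r to paper p.
    Feasible only if P * reviewers_per_paper <= R * max_load."""
    P, R = S.shape
    # each paper gets exactly `reviewers_per_paper` reviewers
    A_eq = np.kron(np.eye(P), np.ones(R))
    b_eq = np.full(P, reviewers_per_paper)
    # each reviewer handles at most `max_load` papers
    A_ub = np.kron(np.ones(P), np.eye(R))
    b_ub = np.full(R, max_load)
    res = linprog(-S.ravel(), A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, 1), method="highs")
    return res.x.reshape(P, R).round().astype(int)

assignment = match_papers(np.random.rand(10, 8), reviewers_per_paper=2, max_load=4)
print(assignment.sum(axis=1), assignment.sum(axis=0))  # per-paper and per-reviewer counts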


Loss-sensitive Training of Probabilistic Conditional Random Fields

arXiv.org Machine Learning

We consider the problem of training probabilistic conditional random fields (CRFs) in the context of a task where performance is measured using a specific loss function. While maximum likelihood is the most common approach to training CRFs, it ignores the inherent structure of the task's loss function. We describe alternatives to maximum likelihood which take that loss into account. These include a novel adaptation of a loss upper bound from the structured SVMs literature to the CRF context, as well as a new loss-inspired KL divergence objective which relies on the probabilistic nature of CRFs. These loss-sensitive objectives are compared to maximum likelihood using ranking as a benchmark task. This comparison confirms the importance of incorporating loss information in the probabilistic training of CRFs, with the loss-inspired KL outperforming all other objectives.
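Abstractly, one natural way to write such a loss-inspired KL objective (an illustrative form, not necessarily the paper's exact formulation) is to define a loss-tempered target distribution over labelings and fit the CRF to it:

q(y \mid x) \;\propto\; \exp\!\big(-\ell(y, y^{*})/T\big), \qquad
\min_{\theta}\; \mathrm{KL}\big(q(\cdot \mid x) \,\|\, p_{\theta}(\cdot \mid x)\big)
\;=\; \min_{\theta}\; -\sum_{y} q(y \mid x)\,\log p_{\theta}(y \mid x) + \text{const},

so that low-loss labelings carry most of the training signal while the objective remains a proper probabilistic fit; the temperature T controls how sharply the loss focuses the target.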


Ranking via Sinkhorn Propagation

arXiv.org Machine Learning

It is of increasing importance to develop learning methods for ranking. In contrast to many learning objectives, however, the ranking problem presents difficulties because the space of permutations is not smooth. In this paper, we examine the class of rank-linear objective functions, which includes popular metrics such as precision and discounted cumulative gain. In particular, we observe that the expectations of these gains are completely characterized by the marginals of the corresponding distribution over permutation matrices. Thus, the expectations of rank-linear objectives can always be described through locations in the Birkhoff polytope, i.e., doubly-stochastic matrices (DSMs). We propose a technique for learning DSM-based ranking functions using an iterative projection operator known as Sinkhorn normalization. Gradients of this operator can be computed via backpropagation, resulting in an algorithm we call Sinkhorn propagation, or SinkProp. This approach can be combined with a wide range of gradient-based approaches to rank learning. We demonstrate the utility of SinkProp on several information retrieval data sets.
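The forward computation is simple enough to state in a few lines: Sinkhorn normalization alternately rescales the rows and columns of a positive matrix, converging to a doubly-stochastic matrix. A NumPy sketch of the forward pass follows; SinkProp unrolls a fixed number of these iterations and backpropagates through them, so in practice the same loop would be written in an autodiff framework:

import numpy as np

def sinkhorn(A, n_iters=50):
    """Sinkhorn-Knopp normalization of a positive matrix: the iterates
    converge to a doubly-stochastic matrix, i.e. a point in the Birkhoff
    polytope (Sinkhorn & Knopp, 1967)."""
    for _ in range(n_iters):
        A = A / A.sum(axis=1, keepdims=True)   # make rows sum to 1
        A = A / A.sum(axis=0, keepdims=True)   # make columns sum to 1
    return A

# A positive score matrix (e.g. exp of learned query-document scores):
M = sinkhorn(np.exp(np.random.randn(4, 4)))
print(M.sum(axis=0), M.sum(axis=1))            # both approximately all-ones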


Interpreting Graph Cuts as a Max-Product Algorithm

arXiv.org Machine Learning

The maximum a posteriori (MAP) configuration of binary variable models with submodular graph-structured energy functions can be found efficiently and exactly by graph cuts. Max-product belief propagation (MP) has been shown to be suboptimal on this class of energy functions by a canonical counterexample where MP converges to a suboptimal fixed point (Kulesza & Pereira, 2008). In this work, we show that under a particular scheduling and damping scheme, MP is equivalent to graph cuts, and thus optimal. We explain the apparent contradiction by showing that with proper scheduling and damping, MP always converges to an optimal fixed point. Thus, the canonical counterexample only shows the suboptimality of MP with a particular suboptimal choice of schedule and damping. With proper choices, MP is optimal.
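For reference, damping is usually applied to max-product in the log domain as a convex combination of the old message and the freshly computed one. This is the standard form of the update; the paper's contribution is identifying the particular schedule and damping factor lambda under which these updates reproduce graph cuts:

m^{(t+1)}_{i \to j}(x_j) \;=\; \lambda\, m^{(t)}_{i \to j}(x_j)
\;+\; (1-\lambda) \max_{x_i}\Big[\theta_i(x_i) + \theta_{ij}(x_i, x_j)
+ \sum_{k \in N(i) \setminus j} m^{(t)}_{k \to i}(x_i)\Big],

where theta_i and theta_ij are the unary and pairwise terms of the energy and lambda in [0, 1) is the damping factor.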