Supervised Learning
A Mouse-Trajectory Based Model for Predicting Query-URL Relevance
Song, Hengjie (Nanyang Technological University / Baidu Inc.) | Liao, Ruoxue (Baidu Inc.) | Zhang, Xiangliang (King Abdullah University of Science and Technology) | Miao, Chunyan (Nanyang Technological University) | Yang, Qiang (Hong Kong University of Science and Technology)
For the learning-to-rank algorithms used in commercial search engines, a conventional way to generate training examples is to employ professional annotators to label the relevance of query-URL pairs. Since label quality depends largely on the expertise of the annotators, this process is time-consuming and labor-intensive. Automatically generating labels from click-through data has been well studied and shown to perform comparably to, or better than, human judges. Click-through data capture users' actions and imply their satisfaction with search results, but exclude interactions between users and search results beyond the page-view level (e.g., eye and mouse movements). This paper proposes a novel approach that comprehensively considers the information underlying mouse trajectories together with click-through data, so as to describe user behavior more objectively and achieve a better understanding of the user experience. By integrating multi-source data, the proposed approach reveals that the relevance labels of query-URL pairs are related to the positions of URLs and to users' behavioral features. Based on these correlations, query-URL pairs can be labeled more accurately and search results become more satisfactory to users. Experiments conducted on the most popular Chinese commercial search engine (Baidu) validate our research motivation and show that the proposed approach outperforms state-of-the-art methods.
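A minimal sketch of the kind of pipeline the abstract describes: behavioral features derived from click-through and mouse-trajectory data, plus result position, feed a classifier that produces relevance labels. The feature names, the synthetic data, and the gradient-boosting model are illustrative assumptions, not the paper's actual system.

```python
# Hypothetical sketch: labeling query-URL pairs from click-through and
# mouse-trajectory features. Feature choices and the model are assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 1000

# Assumed per-(query, URL) features: result position, click-through rate,
# dwell time, and simple mouse-trajectory statistics (hover time, path length).
X = np.column_stack([
    rng.integers(1, 11, n),        # URL position on the result page
    rng.random(n),                 # click-through rate
    rng.exponential(5.0, n),       # dwell time (seconds)
    rng.exponential(2.0, n),       # mouse hover time over the result (seconds)
    rng.exponential(300.0, n),     # mouse path length (pixels)
])
y = rng.integers(0, 2, n)          # placeholder relevance labels (0/1)

# A position-aware classifier trained on behavioral features stands in for the
# labeling model; its outputs would then feed a learning-to-rank stage.
model = GradientBoostingClassifier().fit(X, y)
print(model.predict_proba(X[:5]))
```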
On the Difficulty of Nearest Neighbor Search
He, Junfeng, Kumar, Sanjiv, Chang, Shih-Fu
Fast approximate nearest neighbor (NN) search in large databases is becoming popular. Several powerful learning-based formulations have been proposed recently. However, not much attention has been paid to a more fundamental question: how difficult is (approximate) nearest neighbor search in a given data set? And which data properties affect the difficulty of nearest neighbor search, and how? This paper introduces the first concrete measure, called Relative Contrast, that can be used to evaluate the influence of several crucial data characteristics, such as dimensionality, sparsity, and database size, simultaneously in arbitrary normed metric spaces. Moreover, we present a theoretical analysis showing how this difficulty measure (relative contrast) determines the complexity of Locality-Sensitive Hashing, a popular approximate NN search method. Relative contrast also explains the good practical performance of a family of PCA-based heuristic hashing algorithms. Finally, we show that most previous measures of NN search meaningfulness/difficulty can be derived as special asymptotic cases of the proposed measure for dense vectors.
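A short empirical sketch of a relative-contrast-style estimate, assuming the measure is the ratio of a query's mean distance to database points over its nearest-neighbor distance, averaged across queries (an assumption based on the abstract, not a verbatim reproduction of the paper's definition):

```python
# Empirical relative-contrast estimate (assumed ratio definition).
import numpy as np

def relative_contrast(database, queries, p=2):
    ratios = []
    for q in queries:
        d = np.linalg.norm(database - q, ord=p, axis=1)  # L_p distances to all points
        d_min = d.min()
        if d_min > 0:
            ratios.append(d.mean() / d_min)
    return float(np.mean(ratios))

rng = np.random.default_rng(0)
# Higher dimensionality typically drives the ratio toward 1, i.e.,
# nearest-neighbor search becomes harder.
for dim in (2, 32, 512):
    db = rng.standard_normal((2000, dim))
    qs = rng.standard_normal((20, dim))
    print(dim, round(relative_contrast(db, qs), 3))
```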
Reasoning about Uncertainty in Metric Spaces
We set up a model for reasoning about metric spaces with belief-theoretic measures. The uncertainty in these spaces stems from both probability and metric. To represent both aspects of uncertainty, we choose an expected distance function as a measure of uncertainty. A formal logical system is constructed for reasoning about expected distance, and soundness and completeness are shown for this logic. For reasoning on product metric spaces with uncertainty, a new metric is defined and shown to have good properties.
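One plausible form of the expected-distance measure the abstract alludes to, written out for concreteness; both formulas below are assumptions consistent with the abstract, not necessarily the paper's exact definitions.

```latex
% For probability (or belief) measures \mu, \nu on a metric space (X, d),
% a natural expected distance is
\[
  d_E(\mu, \nu) \;=\; \int_X \int_X d(x, y)\, \mathrm{d}\mu(x)\, \mathrm{d}\nu(y).
\]
% On a product space X_1 \times X_2, a combined metric such as
\[
  d\big((x_1, x_2), (y_1, y_2)\big) \;=\; \max\big(d_1(x_1, y_1),\, d_2(x_2, y_2)\big)
\]
% preserves the metric axioms, which is the kind of "good property"
% a product-space construction would need.
```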
Online Structured Prediction via Coactive Learning
Shivaswamy, Pannaga, Joachims, Thorsten
We propose Coactive Learning as a model of interaction between a learning system and a human user, where both have the common goal of providing results of maximum utility to the user. At each step, the system (e.g. search engine) receives a context (e.g. query) and predicts an object (e.g. ranking). The user responds by correcting the system if necessary, providing a slightly improved -- but not necessarily optimal -- object as feedback. We argue that such feedback can often be inferred from observable user behavior, for example, from clicks in web-search. Evaluating predictions by their cardinal utility to the user, we propose efficient learning algorithms that have ${\cal O}(\frac{1}{\sqrt{T}})$ average regret, even though the learning algorithm never observes cardinal utility values as in conventional online learning. We demonstrate the applicability of our model and learning algorithms on a movie recommendation task, as well as ranking for web-search.
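A toy sketch of a preference-perceptron-style coactive learner on a ranking task. Only the update rule (add the feature difference between the user's improved object and the predicted object) reflects the kind of algorithm the abstract describes; the joint feature map and the simulated user feedback are illustrative assumptions.

```python
# Preference-perceptron-style coactive learning on a toy ranking task.
import numpy as np

rng = np.random.default_rng(0)
d = 5                               # feature dimension per item
w_true = rng.standard_normal(d)     # hidden user utility (never shown to learner)
w = np.zeros(d)

def phi(items, ranking):
    # Joint feature map: position-discounted sum of item features (assumption).
    discounts = 1.0 / np.log2(np.arange(2, len(ranking) + 2))
    return (discounts[:, None] * items[ranking]).sum(axis=0)

for t in range(200):
    items = rng.standard_normal((10, d))          # context: candidate items
    y_pred = np.argsort(items @ w)[::-1]          # learner's ranking
    y_star = np.argsort(items @ w_true)[::-1]     # user's ideal ranking
    # Simulated coactive feedback: user moves their favorite item to the top,
    # yielding a slightly improved (not necessarily optimal) ranking.
    y_fb = y_pred.copy()
    best = np.where(y_pred == y_star[0])[0][0]
    y_fb[0], y_fb[best] = y_fb[best], y_fb[0]
    w += phi(items, y_fb) - phi(items, y_pred)    # perceptron-style update

print("alignment with hidden utility:",
      float(w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true) + 1e-12)))
```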
Output Space Search for Structured Prediction
Doppa, Janardhan Rao, Fern, Alan, Tadepalli, Prasad
We consider a framework for structured prediction based on search in the space of complete structured outputs. Given a structured input, an output is produced by running a time-bounded search procedure guided by a learned cost function, and then returning the least cost output uncovered during the search. This framework can be instantiated for a wide range of search spaces and search procedures, and easily incorporates arbitrary structured-prediction loss functions. In this paper, we make two main technical contributions. First, we define the limited-discrepancy search space over structured outputs, which is able to leverage powerful classification learning algorithms to improve the search space quality. Second, we give a generic cost function learning approach, where the key idea is to learn a cost function that attempts to mimic the behavior of conducting searches guided by the true loss function. Our experiments on six benchmark domains demonstrate that using our framework with only a small amount of search is sufficient for significantly improving on state-of-the-art structured-prediction performance.
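An illustrative sketch of the general idea of searching the space of complete structured outputs under a learned cost function, for a toy binary sequence-labeling problem. The single-flip successor function and the stand-in cost are assumptions; the paper's limited-discrepancy search space and cost-learning procedure are richer than this.

```python
# Greedy search over complete structured outputs guided by a cost function.
import numpy as np

def successors(y):
    # Complete outputs that differ from y in exactly one position (one "flip").
    for i in range(len(y)):
        for label in (0, 1):
            if label != y[i]:
                z = y.copy()
                z[i] = label
                yield z

def greedy_output_search(x, cost, y0, max_steps=50):
    # Start from a complete output, repeatedly move to the cheapest neighbor,
    # and return the least-cost output uncovered during the search.
    best = y0
    for _ in range(max_steps):
        candidates = list(successors(best)) + [best]
        nxt = min(candidates, key=lambda y: cost(x, y))
        if cost(x, nxt) >= cost(x, best):
            break
        best = nxt
    return best

# Toy instance: the stand-in "learned" cost prefers labels matching sign(x).
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
cost = lambda x, y: float(np.sum((y - (x > 0)) ** 2))
y0 = np.zeros(8, dtype=int)
print(greedy_output_search(x, cost, y0))
```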
Object Detection with Grammar Models
Girshick, Ross B., Felzenszwalb, Pedro F., McAllester, David A.
Compositional models provide an elegant formalism for representing the visual appearance of highly variable objects. While such models are appealing from a theoretical point of view, it has been difficult to demonstrate that they lead to performance advantages on challenging datasets. Here we develop a grammar model for person detection and show that it outperforms previous high-performance systems on the PASCAL benchmark. Our model represents people using a hierarchy of deformable parts, variable structure and an explicit model of occlusion for partially visible objects. To train the model, we introduce a new discriminative framework for learning structured prediction models from weakly-labeled data.
Transfer Learning by Borrowing Examples for Multiclass Object Detection
Lim, Joseph J., Salakhutdinov, Ruslan R., Torralba, Antonio
Despite the recent trend of increasingly large datasets for object detection, there still exist many classes with few training examples. To overcome this lack of training data for certain classes, we propose a novel way of augmenting the training data for each class by borrowing and transforming examples from other classes. Our model learns which training instances from other classes to borrow and how to transform the borrowed examples so that they become more similar to instances of the target class. Our experimental results demonstrate that our new object detector, with borrowed and transformed examples, improves upon the current state-of-the-art detector on the challenging SUN09 object detection dataset.
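A toy sketch of the "borrowing" idea: a target class with few positives is augmented with down-weighted examples from a related class. The fixed borrowing weight and the logistic-regression detector are illustrative assumptions; in the paper, the borrowing indicators and transformations are learned.

```python
# Borrowing down-weighted examples from a related class (assumed weight 0.3).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
target_pos  = rng.normal(loc=2.0, size=(20, 5))    # few examples of the target class
related_pos = rng.normal(loc=1.7, size=(200, 5))   # many examples of a similar class
negatives   = rng.normal(loc=0.0, size=(200, 5))

X = np.vstack([target_pos, related_pos, negatives])
y = np.concatenate([np.ones(20), np.ones(200), np.zeros(200)])
# Borrowed examples count as positives but receive reduced weight.
sample_weight = np.concatenate([np.ones(20), 0.3 * np.ones(200), np.ones(200)])

clf = LogisticRegression().fit(X, y, sample_weight=sample_weight)
print("accuracy on target positives:", clf.score(target_pos, np.ones(20)))
```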
Maximum Margin Multi-Label Structured Prediction
We study multi-label prediction for structured output spaces, a problem that occurs, for example, in object detection in images, secondary structure prediction in computational biology, and graph matching with symmetries. Conventional multi-label classification techniques are typically not applicable in this situation, because they require explicit enumeration of the label space, which is infeasible in the case of structured outputs. Relying on techniques originally designed for single-label structured prediction, in particular structured support vector machines, results in reduced prediction accuracy, or leads to infeasible optimization problems. In this work we derive a maximum-margin training formulation for multi-label structured prediction that remains computationally tractable while achieving high prediction accuracy. It also shares most beneficial properties with single-label maximum-margin approaches, in particular a formulation as a convex optimization problem, efficient working set training, and PAC-Bayesian generalization bounds.
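For concreteness, one way such a max-margin multi-label structured objective can be written is sketched below; this is an assumption for illustration, not necessarily the paper's exact formulation.

```latex
% Every correct output y \in Y_i must outscore every incorrect output
% \bar{y} \notin Y_i by a structured margin \Delta(y, \bar{y}).
\[
\begin{aligned}
  \min_{w,\;\xi \ge 0} \quad & \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i \\
  \text{s.t.} \quad & \langle w, \phi(x_i, y) \rangle - \langle w, \phi(x_i, \bar{y}) \rangle
      \;\ge\; \Delta(y, \bar{y}) - \xi_i
      \qquad \forall i,\; \forall y \in Y_i,\; \forall \bar{y} \notin Y_i,
\end{aligned}
\]
% with the exponentially many constraints handled by working-set training
% rather than explicit enumeration of the label space.
```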
Multi-armed bandits on implicit metric spaces
The multi-armed bandit (MAB) setting is a useful abstraction of many online learning tasks which focuses on the trade-off between exploration and exploitation. In this setting, an online algorithm has a fixed set of alternatives ("arms"), and in each round it selects one arm and then observes the corresponding reward. While the case of a small number of arms is by now well understood, a lot of recent work has focused on multi-armed bandits with (infinitely) many arms, where one needs to assume extra structure in order to make the problem tractable. In particular, in the Lipschitz MAB problem there is an underlying similarity metric space, known to the algorithm, such that any two arms that are close in this metric space have similar payoffs. In this paper we consider the more realistic scenario in which the metric space is *implicit* -- it is defined by the available structure but not revealed to the algorithm directly. Specifically, we assume that an algorithm is given a tree-based classification of arms. For any given problem instance such a classification implicitly defines a similarity metric space, but the numerical similarity information is not available to the algorithm. We provide an algorithm for this setting, whose performance guarantees (almost) match the best known guarantees for the corresponding instance of the Lipschitz MAB problem.
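A simplified sketch of bandit learning over a tree-structured set of arms: active tree nodes are chosen by a UCB-style index, a random leaf in the chosen subtree is played, and frequently played nodes are refined into their children. The index and the refinement rule here are illustrative assumptions, not the paper's algorithm or its guarantees.

```python
# Tree-based bandit sketch: select subtrees by a UCB-style index, refine
# well-explored internal nodes into their children.
import math
import random

random.seed(0)

DEPTH = 4                              # complete binary tree of depth 4; leaves are arms
def leaves(node):                      # leaves of the subtree rooted at `node`
    d, idx = node
    span = 2 ** (DEPTH - d)
    return list(range(idx * span, (idx + 1) * span))

# Hidden mean rewards vary smoothly across leaves, so similar arms share subtrees.
true_means = [0.2 + 0.6 * (1 - abs(leaf - 5) / 16) for leaf in range(2 ** DEPTH)]

active = [(0, 0)]                      # start with only the root active
stats = {(0, 0): [0, 0.0]}             # node -> [plays, total reward]

T = 5000
for t in range(1, T + 1):
    def index(node):                   # UCB-style index; unplayed nodes get priority
        n, s = stats[node]
        return float("inf") if n == 0 else s / n + math.sqrt(2 * math.log(t) / n)
    node = max(active, key=index)
    leaf = random.choice(leaves(node))                 # play a random arm in the subtree
    reward = 1.0 if random.random() < true_means[leaf] else 0.0
    stats[node][0] += 1
    stats[node][1] += reward
    # Refine a well-explored internal node into its two children.
    d, idx = node
    if d < DEPTH and stats[node][0] >= 50:
        active.remove(node)
        for child in ((d + 1, 2 * idx), (d + 1, 2 * idx + 1)):
            active.append(child)
            stats[child] = [0, 0.0]

best = max(active, key=lambda v: stats[v][1] / max(stats[v][0], 1))
print("best active subtree:", best, "covering arms", leaves(best))
```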