Collaborating Authors

Li, Hang

Ranking Measures and Loss Functions in Learning to Rank

Neural Information Processing Systems

Learning to rank has become an important research topic in machine learning. While most learning-to-rank methods learn the ranking function by minimizing the loss functions, it is the ranking measures (such as NDCG and MAP) that are used to evaluate the performance of the learned ranking function. In this work, we reveal the relationship between ranking measures and loss functions in learning-to-rank methods, such as Ranking SVM, RankBoost, RankNet, and ListMLE. We show that these loss functions are upper bounds of the measure-based ranking errors. As a result, the minimization of these loss functions will lead to the maximization of the ranking measures.

Statistical Consistency of Top-k Ranking

Neural Information Processing Systems

This paper is concerned with the consistency analysis on listwise ranking methods. Among various ranking methods, the listwise methods have competitive performances on benchmark datasets and are regarded as one of the state-of-the-art approaches. Most listwise ranking methods manage to optimize ranking on the whole list (permutation) of objects, however, in practical applications such as information retrieval, correct ranking at the top k positions is much more important. This paper aims to analyze whether existing listwise ranking methods are statistically consistent in the top-k setting. For this purpose, we define a top-k ranking framework, where the true loss (and thus the risks) are defined on the basis of top-k subgroup of permutations.

Global Ranking Using Continuous Conditional Random Fields

Neural Information Processing Systems

This paper studies global ranking problem by learning to rank methods. Conventional learning to rank methods are usually designed for local ranking', in the sense that the ranking model is defined on a single object, for example, a document in information retrieval. For many applications, this is a very loose approximation. Relations always exist between objects and it is better to define the ranking model as a function on all the objects to be ranked (i.e., the relations are also included). This paper refers to the problem as global ranking and proposes employing a Continuous Conditional Random Fields (CRF) for conducting the learning task.

A Deep Architecture for Matching Short Texts

Neural Information Processing Systems

Many machine learning problems can be interpreted as learning for matching two types of objects (e.g., images and captions, users and products, queries and documents). The matching level of two objects is usually measured as the inner product in a certain feature space, while the modeling effort focuses on mapping of objects from the original space to the feature space. This schema, although proven successful on a range of matching tasks, is insufficient for capturing the rich structure in the matching process of more complicated objects. In this paper, we propose a new deep architecture to more effectively model the complicated matching relations between two objects from heterogeneous domains. More specifically, we apply this model to matching tasks in natural language, e.g., finding sensible responses for a tweet, or relevant answers to a given question.

Convolutional Neural Network Architectures for Matching Natural Language Sentences

Neural Information Processing Systems

Semantic matching is of central importance to many natural language tasks \cite{bordes2014semantic,RetrievalQA}. A successful matching algorithm needs to adequately model the internal structures of language objects and the interaction between them. As a step toward this goal, we propose convolutional neural network models for matching two sentences, by adapting the convolutional strategy in vision and speech. The proposed models not only nicely represent the hierarchical structures of sentences with their layer-by-layer composition and pooling, but also capture the rich matching patterns at different levels. Our models are rather generic, requiring no prior knowledge on language, and can hence be applied to matching tasks of different nature and in different languages.

A Multimodal Alerting System for Online Class Quality Assurance Artificial Intelligence

Online 1 on 1 class is created for more personalized learning experience. It demands a large number of teaching resources, which are scarce in China. To alleviate this problem, we build a platform (marketplace), i.e., \emph{Dahai} to allow college students from top Chinese universities to register as part-time instructors for the online 1 on 1 classes. To warn the unqualified instructors and ensure the overall education quality, we build a monitoring and alerting system by utilizing multimodal information from the online environment. Our system mainly consists of two key components: banned word detector and class quality predictor. The system performance is demonstrated both offline and online. By conducting experimental evaluation of real-world online courses, we are able to achieve 74.3\% alerting accuracy in our production environment.

Toward Building Conversational Recommender Systems: A Contextual Bandit Approach Machine Learning

Contextual bandit algorithms have gained increasing popularity in recommender systems, because they can learn to adapt recommendations by making exploration-exploitation trade-off. Recommender systems equipped with traditional contextual bandit algorithms are usually trained with behavioral feedback (e.g., clicks) from users on items. The learning speed can be slow because behavioral feedback by nature does not carry sufficient information. As a result, extensive exploration has to be performed. To address the problem, we propose conversational recommendation in which the system occasionally asks questions to the user about her interest. We first generalize contextual bandit to leverage not only behavioral feedback (arm-level feedback), but also verbal feedback (users' interest on categories, topics, etc.). We then propose a new UCB- based algorithm, and theoretically prove that the new algorithm can indeed reduce the amount of exploration in learning. We also design several strategies for asking questions to further optimize the speed of learning. Experiments on synthetic data, Yelp data, and news recommendation data from Toutiao demonstrate the efficacy of the proposed algorithm.

Signal Demodulation with Machine Learning Methods for Physical Layer Visible Light Communications: Prototype Platform, Open Dataset and Algorithms Machine Learning

In this paper, we investigate the design and implementation of machine learning (ML) based demodulation methods in the physical layer of visible light communication (VLC) systems. We build a flexible hardware prototype of an end-to-end VLC system, from which the received signals are collected as the real data. The dataset is available online, which contains eight types of modulated signals. Then, we propose three ML demodulators based on convolutional neural network (CNN), deep belief network (DBN), and adaptive boosting (AdaBoost), respectively. Specifically, the CNN based demodulator converts the modulated signals to images and recognizes the signals by the image classification. The proposed DBN based demodulator contains three restricted Boltzmann machines (RBMs) to extract the modulation features. The AdaBoost method includes a strong classifier that is constructed by the weak classifiers with the k-nearest neighbor (KNN) algorithm. These three demodulators are trained and tested by our online open dataset. Experimental results show that the demodulation accuracy of the three data-driven demodulators drops as the transmission distance increases. A higher modulation order negatively influences the accuracy for a given transmission distance. Among the three ML methods, the AdaBoost modulator achieves the best performance.

Coupling Distributed and Symbolic Execution for Natural Language Queries Artificial Intelligence

Building neural networks to query a knowledge base (a table) with natural language is an emerging research topic in deep learning. An executor for table querying typically requires multiple steps of execution because queries may have complicated structures. In previous studies, researchers have developed either fully distributed executors or symbolic executors for table querying. A distributed executor can be trained in an end-to-end fashion, but is weak in terms of execution efficiency and explicit interpretability. A symbolic executor is efficient in execution, but is very difficult to train especially at initial stages. In this paper, we propose to couple distributed and symbolic execution for natural language queries, where the symbolic executor is pretrained with the distributed executor's intermediate execution results in a step-by-step fashion. Experiments show that our approach significantly outperforms both distributed and symbolic executors, exhibiting high accuracy, high learning efficiency, high execution efficiency, and high interpretability.

Deep Active Learning for Dialogue Generation Artificial Intelligence

We propose an online, end-to-end, neural generative conversational model for open-domain dialogue. It is trained using a unique combination of offline two-phase supervised learning and online human-in-the-loop active learning. While most existing research proposes offline supervision or hand-crafted reward functions for online reinforcement, we devise a novel interactive learning mechanism based on hamming-diverse beam search for response generation and one-character user-feedback at each step. Experiments show that our model inherently promotes the generation of semantically relevant and interesting responses, and can be used to train agents with customized personas, moods and conversational styles.