Media
Face Video Retrieval via Deep Learning of Binary Hash Representations
Dong, Zhen (Beijing Institute of Technology) | Jia, Su (Stony Brook University) | Wu, Tianfu (Beijing University of Posts and Telecommunications and University of California, Los Angeles) | Pei, Mingtao (Beijing Institute of Technology)
Retrieving faces from large mess of videos is an attractive research topic with wide range of applications. Its challenging problems are large intra-class variations, and tremendous time and space complexity. In this paper, we develop a new deep convolutional neural network (deep CNN) to learn discriminative and compact binary representations of faces for face video retrieval. The network integrates feature extraction and hash learning into a unified optimization framework for the optimal compatibility of feature extractor and hash functions. In order to better initialize the network, the low-rank discriminative binary hashing is proposed to pre-learn hash functions during the training procedure. Our method achieves excellent performances on two challenging TV-Series datasets.
A Joint Model for Question Answering over Multiple Knowledge Bases
Zhang, Yuanzhe (Institute of Automation, Chinese Academy of Sciences) | He, Shizhu (Institute of Automation, Chinese Academy of Sciences) | Liu, Kang (Institute of Automation, Chinese Academy of Sciences) | Zhao, Jun (Institute of Automation, Chinese Academy of Sciences)
As the amount of knowledge bases (KBs) grows rapidly, the problem of question answering (QA) over multiple KBs has drawn more attention. The most significant distinction between multiple KB-QA and single KB-QA is that the former must consider the alignments between KBs. The pipeline strategy first constructs the alignments independently, and then uses the obtained alignments to construct queries. However, alignment construction is not a trivial task, and the introduced noises would be passed on to query construction. By contrast, we notice that alignment construction and query construction are interactive steps, and jointly considering them would be beneficial. To this end, we present a novel joint model based on integer linear programming (ILP), uniting these two procedures into a uniform framework. The experimental results demonstrate that the proposed approach outperforms state-of-the-art systems, and is able to improve the performance of both alignment construction and query construction.
News Verification by Exploiting Conflicting Social Viewpoints in Microblogs
Jin, Zhiwei (Institute of Computing Technology, Chinese Academy of Sciences) | Cao, Juan (Institute of Computing Technology, Chinese Academy of Sciences) | Zhang, Yongdong (Institute of Computing Technology, Chinese Academy of Sciences) | Luo, Jiebo (University of Rochester)
Fake news spreading in social media severely jeopardizes the veracity of online content. Fortunately, with the interactive and open features of microblogs, skeptical and opposing voices against fake news always arise along with it. The conflicting information, ignored by existing studies, is crucial for news verification. In this paper, we take advantage of this "wisdom of crowds" information to improve news verification by mining conflicting viewpoints in microblogs. First, we discover conflicting viewpoints in news tweets with a topic model method. Based on identified tweets' viewpoints, we then build a credibility propagation network of tweets linked with supporting or opposing relations. Finally, with iterative deduction, the credibility propagation on the network generates the final evaluation result for news. Experiments conducted on a real-world data set show that the news verification performance of our approach significantly outperforms those of the baseline approaches.
Topical Analysis of Interactions Between News and Social Media
Hua, Ting (Virginia Polytechnic Institute and State University) | Ning, Yue (Virginia Polytechnic Institute and State University) | Chen, Feng (State University of New York at Albany) | Lu, Chang-Tien (Virginia Polytechnic Institute and State University) | Ramakrishnan, Naren (Virginia Polytechnic Institute and State University)
The analysis of interactions between social media and traditional news streams is becoming increasingly relevant for a variety of applications, including: understanding the underlying factors that drive the evolution of data sources, tracking the triggers behind events, and discovering emerging trends.Researchers have explored such interactions by examining volume changes or information diffusions,however, most of them ignore the semantical and topical relationships between news and social media data.Our work is the first attempt to study how news influences social media, or inversely, based on topical knowledge.We propose a hierarchical Bayesian model that jointly models the news and social media topics and their interactions.We show that our proposed model can capture distinct topics for individual datasets as well as discover the topic influences among multiple datasets.By applying our model to large sets of news and tweets, we demonstrate its significant improvement over baseline methods and explore its power in the discovery of interesting patterns for real world cases.
Infinite Plaid Models for Infinite Bi-Clustering
Ishiguro, Katsuhiko (NTT Corporation) | Sato, Issei (The University of Tokyo) | Nakano, Masahiro (NTT Corporation) | Kimura, Akisato (NTT Corporation) | Ueda, Naonori (NTT Corporation)
We propose a probabilistic model for non-exhaustive and overlapping (NEO) bi-clustering. Our goal is to extract a few sub-matrices from the given data matrix, where entries of a sub-matrix are characterized by a specific distribution or parameters. Existing NEO biclustering methods typically require the number of sub-matrices to be extracted, which is essentially difficult to fix a priori. In this paper, we extend the plaid model, known as one of the best NEO bi-clustering algorithms, to allow infinite bi-clustering; NEO bi-clustering without specifying the number of sub-matrices. Our model can represent infinite sub-matrices formally. We develop a MCMC inference without the finite truncation, which potentially addresses all possible numbers of sub-matrices. Experiments quantitatively and qualitatively verify the usefulness of the proposed model. The results reveal that our model can offer more precise and in-depth analysis of sub-matrices.
Optimal Discrete Matrix Completion
Huo, Zhouyuan (University of Texas at Arlington) | Liu, Ji (University of Rochester) | Huang, Heng (University of Texas at Arlington)
In recent years, matrix completion methods have been successfully applied to solve recommender system applications. Most of them focus on the matrix completion problem in real number domain, and produce continuous prediction values. However, these methods are not appropriate in some occasions where the entries of matrix are discrete values, such as movie ratings prediction, social network relation and interaction prediction, because their continuous outputs are not probabilities and uninterpretable. In this case, an additional step to process the continuous results with either heuristic threshold parameters or complicated mapping is necessary, while it is inefficient and may diverge from the optimal solution. There are a few matrix completion methods working on discrete number domain, however, they are not applicable to sparse and large-scale data set. In this paper, we propose a novel optimal discrete matrix completion model, which is able to learn optimal thresholds automatically and also guarantees an exact low-rank structure of the target matrix. We use stochastic gradient descent algorithm with momentum method to optimize the new objective function and speed up optimization. In the experiments, it is proved that our method can predict discrete values with high accuracy, very close to or even better than these values obtained by carefully tuned thresholds on Movielens and YouTube data sets. Meanwhile, our model is able to handle online data and easy to parallelize.
Collective Noise Contrastive Estimation for Policy Transfer Learning
Zhang, Weinan (University College London) | Paquet, Ulrich (Microsoft Research) | Hofmann, Katja (Microsoft Research)
We address the problem of learning behaviour policies to optimise online metrics from heterogeneous usage data. While online metrics, e.g., click-through rate, can be optimised effectively using exploration data, such data is costly to collect in practice, as it temporarily degrades the user experience. Leveraging related data sources to improve online performance would be extremely valuable, but is not possible using current approaches. We formulate this task as a policy transfer learning problem, and propose a first solution, called collective noise contrastive estimation (collective NCE). NCE is an efficient solution to approximating the gradient of a log-softmax objective. Our approach jointly optimises embeddings of heterogeneous data to transfer knowledge from the source domain to the target domain. We demonstrate the effectiveness of our approach by learning an effective policy for an online radio station jointly from user-generated playlists, and usage data collected in an exploration bucket.
Expressive Recommender Systems through Normalized Nonnegative Models
Stark, Cyril J. (Massachusetts Institute of Technology)
We introduce normalized nonnegative models (NNM) for explorative data analysis. NNMs are partial convexifications of models from probability theory. We demonstrate their value at the example of item recommendation. We show that NNM-based recommender systems satisfy three criteria that all recommender systems should ideally satisfy: high predictive power, computational tractability, and expressive representations of users and items. Expressive user and item representations are important in practice to succinctly summarize the pool of customers and the pool of items. In NNMs, user representations are expressive because each user's preference can be regarded as normalized mixture of preferences of stereotypical users. The interpretability of item and user representations allow us to arrange properties of items (e.g., genres of movies or topics of documents) or users (e.g., personality traits) hierarchically.
Learning the Preferences of Ignorant, Inconsistent Agents
Evans, Owain (University of Oxford) | Stuhlmueller, Andreas (Stanford University) | Goodman, Noah (Stanford University)
An important use of machine learning is to learn what people value. What posts or photos should a user be shown? Which jobs or activities would a person find rewarding? In each case, observations of people's past choices can inform our inferences about their likes and preferences. If we assume that choices are approximately optimal according to some utility function, we can treat preference inference as Bayesian inverse planning. That is, given a prior on utility functions and some observed choices, we invert an optimal decision-making process to infer a posterior distribution on utility functions. However, people often deviate from approximate optimality. They have false beliefs, their planning is sub-optimal, and their choices may be temporally inconsistent due to hyperbolic discounting and other biases. We demonstrate how to incorporate these deviations into algorithms for preference inference by constructing generative models of planning for agents who are subject to false beliefs and time inconsistency. We explore the inferences these models make about preferences, beliefs, and biases. We present a behavioral experiment in which human subjects perform preference inference given the same observations of choices as our model. Results show that human subjects (like our model) explain choices in terms of systematic deviations from optimal behavior and suggest that they take such deviations into account when inferring preferences.
Semantic Community Identification in Large Attribute Networks
Wang, Xiao (Tianjin University) | Jin, Di (Tianjin University) | Cao, Xiaochun (Chinese Academy of Sciences) | Yang, Liang (Chinese Academy of Sciences) | Zhang, Weixiong (Washington University in St. Louis)
Identification of modular or community structures of a network is a key to understanding the semantics and functions of the network. While many network community detection methods have been developed, which primarily explore network topologies, they provide little semantic information of the communities discovered. Although structures and semantics are closely related, little effort has been made to discover and analyze these two essential network properties together. By integrating network topology and semantic information on nodes, e.g., node attributes, we study the problems of detection of communities and inference of their semantics simultaneously. We propose a novel nonnegative matrix factorization (NMF) model with two sets of parameters, the community membership matrix and community attribute matrix, and present efficient updating rules to evaluate the parameters with a convergence guarantee. The use of node attributes improves upon community detection and provides a semantic interpretation to the resultant network communities. Extensive experimental results on synthetic and real-world networks not only show the superior performance of the new method over the state-of-the-art approaches, but also demonstrate its ability to semantically annotate the communities.