Representation Of Examples
Change Point Methods on a Sequence of Graphs
Zambon, Daniele, Alippi, Cesare, Livi, Lorenzo
The present paper considers a finite sequence of graphs, e.g., coming from technological, biological, and social networks, each of which is modelled as a realization of a graph-valued random variable, and proposes a methodology to identify possible changes in stationarity in its generating stochastic process. In order to cover a large class of applications, we consider a general family of attributed graphs, chatacterized by a possible variable topology (edges and vertices) also in the stationary case. A Change Point Method (CPM) approach is proposed, that (i) maps graphs into a vector domain; (ii) applies a suitable statistical test; (iii) detects the change --if any-- according to a confidence level and provides an estimate for its time of occurrence. Two specific CPMs are proposed: one detecting shifts in the distribution mean, the other addressing generic changes affecting the distribution. We ground our proposal with theoretical results showing how to relate the inference attained in the numerical vector space to the graph domain, and vice versa. Finally, simulations on epileptic-seizure detection problems are conducted on real-world data providing evidence for the CPMs effectiveness.
Improving Knowledge Graph Embedding Using Simple Constraints
Ding, Boyang, Wang, Quan, Wang, Bin, Guo, Li
Embedding knowledge graphs (KGs) into continuous vector spaces is a focus of current research. Early works performed this task via simple models developed over KG triples. Recent attempts focused on either designing more complicated triple scoring models, or incorporating extra information beyond triples. This paper, by contrast, investigates the potential of using very simple constraints to improve KG embedding. We examine non-negativity constraints on entity representations and approximate entailment constraints on relation representations. The former help to learn compact and interpretable representations for entities. The latter further encode regularities of logical entailment between relations into their distributed representations. These constraints impose prior beliefs upon the structure of the embedding space, without negative impacts on efficiency or scalability. Evaluation on WordNet, Freebase, and DBpedia shows that our approach is simple yet surprisingly effective, significantly and consistently outperforming competitive baselines. The constraints imposed indeed improve model interpretability, leading to a substantially increased structuring of the embedding space. Code and data are available at https://github.com/iieir-km/ComplEx-NNE_AER.
Extrofitting: Enriching Word Representation and its Vector Space with Semantic Lexicons
Jo, Hwiyeol, Choi, Stanley Jungkyu
We propose post-processing method for enriching not only word representation but also its vector space using semantic lexicons, which we call extrofitting. The method consists of 3 steps as follows: (i) Expanding 1 or more dimension(s) on all the word vectors, filling with their representative value. (ii) Transferring semantic knowledge by averaging each representative values of synonyms and filling them in the expanded dimension(s). These two steps make representations of the synonyms close together. (iii) Projecting the vector space using Linear Discriminant Analysis, which eliminates the expanded dimension(s) with semantic knowledge. When experimenting with GloVe, we find that our method outperforms Faruqui's retrofitting on some of word similarity task. We also report further analysis on our method in respect to word vector dimensions, vocabulary size as well as other well-known pretrained word vectors (e.g., Word2Vec, Fasttext).
Exploring Partially Observed Networks with Nonparametric Bandits
Madhawa, Kaushalya, Murata, Tsuyoshi
Real-world networks such as social and communication networks are too large to be observed entirely. Such networks are often partially observed such that network size, network topology, and nodes of the original network are unknown. In this paper we formalize the Adaptive Graph Exploring problem. We assume that we are given an incomplete snapshot of a large network and additional nodes can be discovered by querying nodes in the currently observed network. The goal of this problem is to maximize the number of observed nodes within a given query budget. Querying which set of nodes maximizes the size of the observed network? We formulate this problem as an exploration-exploitation problem and propose a novel nonparametric multi-arm bandit (MAB) algorithm for identifying which nodes to be queried. Our contributions include: (1) $i$KNN-UCB, a novel nonparametric MAB algorithm, applies $k$-nearest neighbor UCB to the setting when the arms are presented in a vector space, (2) provide theoretical guarantee that $i$KNN-UCB algorithm has sublinear regret, and (3) applying $i$KNN-UCB algorithm on synthetic networks and real-world networks from different domains, we show that our method discovers up to 40% more nodes compared to existing baselines.
Structural query-by-committee
Tosh, Christopher, Dasgupta, Sanjoy
We introduce interactive structure learning, an abstract problem that encompasses many interactive learning tasks that have traditionally been studied in isolation, including active learning of binary classifiers, interactive clustering, interactive embedding, and active learning of structured output predictors. These problems include variants of both supervised and unsupervised tasks, and allow many different types of feedback, from binary labels to must-link/cannot-link constraints to similarity assessments to structured outputs. Despite these surface differences, they conform to a common template that allows them to be fruitfully unified. In interactive structure learning, there is a space of items X --for instance, an input space on which a classifier is to be learned, or points to cluster, or points to embed in a metric space--and the goal is to learn a structure on X, chosen from a family G. This set G could consist, for example, of all linear classifiers on X, or all hierarchical clusterings of X, or all knowledge graphs on X.
Revisiting the Vector Space Model: Sparse Weighted Nearest-Neighbor Method for Extreme Multi-Label Classification
Aoshima, Tatsuhiro, Kobayashi, Kei, Minami, Mihoko
Machine learning has played an important role in information retrieval (IR) in recent times. In search engines, for example, query keywords are accepted and documents are returned in order of relevance to the given query; this can be cast as a multi-label ranking problem in machine learning. Generally, the number of candidate documents is extremely large (from several thousand to several million); thus, the classifier must handle many labels. This problem is referred to as extreme multi-label classification (XMLC). In this paper, we propose a novel approach to XMLC termed the Sparse Weighted Nearest-Neighbor Method. This technique can be derived as a fast implementation of state-of-the-art (SOTA) one-versus-rest linear classifiers for very sparse datasets. In addition, we show that the classifier can be written as a sparse generalization of a representer theorem with a linear kernel. Furthermore, our method can be viewed as the vector space model used in IR. Finally, we show that the Sparse Weighted Nearest-Neighbor Method can process data points in real time on XMLC datasets with equivalent performance to SOTA models, with a single thread and smaller storage footprint. In particular, our method exhibits superior performance to the SOTA models on a dataset with 3 million labels.
Investigating Inner Properties of Multimodal Representation and Semantic Compositionality With Brain-Based Componential Semantics
Wang, Shaonan (Institute of Automation, Chinese Academy of Sciences) | Zhang, Jiajun (Institute of Automation, Chinese Academy of Sciences) | Lin, Nan (Institute of Psychology, Chinese Academy of Sciences) | Zong, Chengqing (Institute of Automation, Chinese Academy of Sciences)
Multimodal models have been proven to outperform text-based approaches on learning semantic representations. However, it still remains unclear what properties are encoded in multimodal representations, in what aspects do they outperform the single-modality representations, and what happened in the process of semantic compositionality in different input modalities. Considering that multimodal models are originally motivated by human concept representations, we assume that correlating multimodal representations with brain-based semantics would interpret their inner properties to answer the above questions. To that end, we propose simple interpretation methods based on brain-based componential semantics. First we investigate the inner properties of multimodal representations by correlating them with corresponding brain-based property vectors. Then we map the distributed vector space to the interpretable brain-based componential space to explore the inner properties of semantic compositionality. Ultimately, the present paper sheds light on the fundamental questions of natural language understanding, such as how to represent the meaning of words and how to combine word meanings into larger units.
Diagnosing University Student Subject Proficiency and Predicting Degree Completion in Vector Space
Luo, Yuetian (UW-Madison) | Pardos, Zachary A. (UC Berkeley)
We investigate the issues of undergraduate on-time graduation with respect to subject proficiencies through the lens of representation learning, training a student vector embeddings from a dataset of 8 years of course enrollments. We compare the per-semester student representations of a cohort of undergraduate Integrative Biology majors to those of graduated students in subject areas involved in their degree requirements. The result is an embedding rich in information about the relationships between majors and pathways taken by students which encoded enough information to improve prediction accuracy of on-time graduation to 95%, up from a baseline of 87.3%. Challenges to preparation of the data for student vectorization and sourcing of validation sets for optimization are discussed.
Generative Adversarial Network Based Heterogeneous Bibliographic Network Representation for Personalized Citation Recommendation
Cai, Xiaoyan (School of Automation, Northwestern Polytechnical University) | Han, Junwei (School of Automation, Northwestern Polytechnical University) | Yang, Libin (School of Automation, Northwestern Polytechnical University)
Network representation has been recently exploited for many applications, such as citation recommendation, multi-label classification and link prediction. It learns low-dimensional vector representation for each vertex in networks. Existing network representation methods only focus on incomplete aspects of vertex information (i.e., vertex content, network structure or partial integration), moreover they are commonly designed for homogeneous information networks where all the vertices of a network are of the same type. In this paper, we propose a deep network representation model that integrates network structure and the vertex content information into a unified framework by exploiting generative adversarial network, and represents different types of vertices in the heterogeneous network in a continuous and common vector space. Based on the proposed model, we can obtain heterogeneous bibliographic network representation for efficient citation recommendation. The proposed model also makes personalized citation recommendation possible, which is a new issue that a few papers addressed in the past. When evaluated on the AAN and DBLP datasets, the performance of the proposed heterogeneous bibliographic network based citation recommendation approach is comparable with that of the other network representation based citation recommendation approaches. The results also demonstrate that the personalized citation recommendation approach is more effective than the non-personalized citation recommendation approach.
Dictionary Learning in Optimal Metric Space
Yan, Jiexi (Xidian University) | Deng, Cheng (Xidian University) | Liu, Xianglong (Beihang University)
Dictionary learning has been widely used in machine learning field to address many real-world applications, such as classification and denoising. In recent years, many new dictionary learning methods have been proposed. Most of them are designed to solve unsupervised problem without any prior information or supervised problem with the label information. But in real world, as usual, we can only obtain limited side information as prior information rather than label information. The existing methods don’t take into account the side information, let alone learning a good dictionary through using the side information. To tackle it, we propose a new unified unsupervised model which naturally integrates metric learning to enhance dictionary learning model with fully utilizing the side information. The proposed method updates metric space and dictionary adaptively and alternatively, which ensures learning optimal metric space and dictionary simultaneously. Besides, our method can also deal well with highdimensional data. Extensive experiments show the efficiency of our proposed method, and a better performance can be derived in real-world image clustering applications.