AITopics

1805.07113

Country:

North America > United States (0.46)
Europe > United Kingdom (0.46)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

arXiv.org Artificial IntelligenceMay-7-2018

Improving Knowledge Graph Embedding Using Simple Constraints

Ding, Boyang, Wang, Quan, Wang, Bin, Guo, Li

Embedding knowledge graphs (KGs) into continuous vector spaces is a focus of current research. Early works performed this task via simple models developed over KG triples. Recent attempts focused on either designing more complicated triple scoring models, or incorporating extra information beyond triples. This paper, by contrast, investigates the potential of using very simple constraints to improve KG embedding. We examine non-negativity constraints on entity representations and approximate entailment constraints on relation representations. The former help to learn compact and interpretable representations for entities. The latter further encode regularities of logical entailment between relations into their distributed representations. These constraints impose prior beliefs upon the structure of the embedding space, without negative impacts on efficiency or scalability. Evaluation on WordNet, Freebase, and DBpedia shows that our approach is simple yet surprisingly effective, significantly and consistently outperforming competitive baselines. The constraints imposed indeed improve model interpretability, leading to a substantially increased structuring of the embedding space. Code and data are available at https://github.com/iieir-km/ComplEx-NNE_AER.

artificial intelligence, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

1805.02408

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Jo, Hwiyeol, Choi, Stanley Jungkyu

Extrofitting: Enriching Word Representation and its Vector Space with Semantic Lexicons

arXiv.org Artificial IntelligenceApr-21-2018

We propose post-processing method for enriching not only word representation but also its vector space using semantic lexicons, which we call extrofitting. The method consists of 3 steps as follows: (i) Expanding 1 or more dimension(s) on all the word vectors, filling with their representative value. (ii) Transferring semantic knowledge by averaging each representative values of synonyms and filling them in the expanded dimension(s). These two steps make representations of the synonyms close together. (iii) Projecting the vector space using Linear Discriminant Analysis, which eliminates the expanded dimension(s) with semantic knowledge. When experimenting with GloVe, we find that our method outperforms Faruqui's retrofitting on some of word similarity task. We also report further analysis on our method in respect to word vector dimensions, vocabulary size as well as other well-known pretrained word vectors (e.g., Word2Vec, Fasttext).

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

1804.07946

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.85)

Madhawa, Kaushalya, Murata, Tsuyoshi

Exploring Partially Observed Networks with Nonparametric Bandits

arXiv.org Machine LearningApr-19-2018

Real-world networks such as social and communication networks are too large to be observed entirely. Such networks are often partially observed such that network size, network topology, and nodes of the original network are unknown. In this paper we formalize the Adaptive Graph Exploring problem. We assume that we are given an incomplete snapshot of a large network and additional nodes can be discovered by querying nodes in the currently observed network. The goal of this problem is to maximize the number of observed nodes within a given query budget. Querying which set of nodes maximizes the size of the observed network? We formulate this problem as an exploration-exploitation problem and propose a novel nonparametric multi-arm bandit (MAB) algorithm for identifying which nodes to be queried. Our contributions include: (1) $i$KNN-UCB, a novel nonparametric MAB algorithm, applies $k$-nearest neighbor UCB to the setting when the arms are presented in a vector space, (2) provide theoretical guarantee that $i$KNN-UCB algorithm has sublinear regret, and (3) applying $i$KNN-UCB algorithm on synthetic networks and real-world networks from different domains, we show that our method discovers up to 40% more nodes compared to existing baselines.

data mining, machine learning, node, (19 more...)

1804.07059

Genre: Research Report (0.40)

Industry:

Information Technology (0.48)
Energy (0.34)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.52)
(2 more...)

Tosh, Christopher, Dasgupta, Sanjoy

Structural query-by-committee

arXiv.org Machine LearningMar-17-2018

We introduce interactive structure learning, an abstract problem that encompasses many interactive learning tasks that have traditionally been studied in isolation, including active learning of binary classifiers, interactive clustering, interactive embedding, and active learning of structured output predictors. These problems include variants of both supervised and unsupervised tasks, and allow many different types of feedback, from binary labels to must-link/cannot-link constraints to similarity assessments to structured outputs. Despite these surface differences, they conform to a common template that allows them to be fruitfully unified. In interactive structure learning, there is a space of items X --for instance, an input space on which a classifier is to be learned, or points to cluster, or points to embed in a metric space--and the goal is to learn a structure on X, chosen from a family G. This set G could consist, for example, of all linear classifiers on X, or all hierarchical clusterings of X, or all knowledge graphs on X.

artificial intelligence, machine learning, structural qbc, (15 more...)

1803.06586

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)

Genre: Research Report (0.63)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Aoshima, Tatsuhiro, Kobayashi, Kei, Minami, Mihoko

Revisiting the Vector Space Model: Sparse Weighted Nearest-Neighbor Method for Extreme Multi-Label Classification

arXiv.org Machine LearningFeb-12-2018

Machine learning has played an important role in information retrieval (IR) in recent times. In search engines, for example, query keywords are accepted and documents are returned in order of relevance to the given query; this can be cast as a multi-label ranking problem in machine learning. Generally, the number of candidate documents is extremely large (from several thousand to several million); thus, the classifier must handle many labels. This problem is referred to as extreme multi-label classification (XMLC). In this paper, we propose a novel approach to XMLC termed the Sparse Weighted Nearest-Neighbor Method. This technique can be derived as a fast implementation of state-of-the-art (SOTA) one-versus-rest linear classifiers for very sparse datasets. In addition, we show that the classifier can be written as a sparse generalization of a representer theorem with a linear kernel. Furthermore, our method can be viewed as the vector space model used in IR. Finally, we show that the Sparse Weighted Nearest-Neighbor Method can process data points in real time on XMLC datasets with equivalent performance to SOTA models, with a single thread and smaller storage footprint. In particular, our method exhibits superior performance to the SOTA models on a dataset with 3 million labels.

artificial intelligence, data entry, machine learning, (14 more...)

1802.03938

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.73)

Wang, Shaonan (Institute of Automation, Chinese Academy of Sciences) | Zhang, Jiajun (Institute of Automation, Chinese Academy of Sciences) | Lin, Nan (Institute of Psychology, Chinese Academy of Sciences) | Zong, Chengqing (Institute of Automation, Chinese Academy of Sciences)

Investigating Inner Properties of Multimodal Representation and Semantic Compositionality With Brain-Based Componential Semantics

Multimodal models have been proven to outperform text-based approaches on learning semantic representations. However, it still remains unclear what properties are encoded in multimodal representations, in what aspects do they outperform the single-modality representations, and what happened in the process of semantic compositionality in different input modalities. Considering that multimodal models are originally motivated by human concept representations, we assume that correlating multimodal representations with brain-based semantics would interpret their inner properties to answer the above questions. To that end, we propose simple interpretation methods based on brain-based componential semantics. First we investigate the inner properties of multimodal representations by correlating them with corresponding brain-based property vectors. Then we map the distributed vector space to the interpretable brain-based componential space to explore the inner properties of semantic compositionality. Ultimately, the present paper sheds light on the fundamental questions of natural language understanding, such as how to represent the meaning of words and how to combine word meanings into larger units.

composition model, representation, vector, (14 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.47)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.35)

Diagnosing University Student Subject Proficiency and Predicting Degree Completion in Vector Space

Luo, Yuetian (UW-Madison) | Pardos, Zachary A. (UC Berkeley)

We investigate the issues of undergraduate on-time graduation with respect to subject proficiencies through the lens of representation learning, training a student vector embeddings from a dataset of 8 years of course enrollments. We compare the per-semester student representations of a cohort of undergraduate Integrative Biology majors to those of graduated students in subject areas involved in their degree requirements. The result is an embedding rich in information about the relationships between majors and pathways taken by students which encoded enough information to improve prediction accuracy of on-time graduation to 95%, up from a baseline of 87.3%. Challenges to preparation of the data for student vectorization and sourcing of validation sets for optimization are discussed.

artificial intelligence, machine learning, natural language, (18 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country:

Europe (0.68)
North America > United States > California (0.28)

Genre:

Instructional Material > Course Syllabus & Notes (0.89)
Instructional Material > Online (0.66)

Industry:

Education > Educational Setting > Higher Education (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Cai, Xiaoyan (School of Automation, Northwestern Polytechnical University) | Han, Junwei (School of Automation, Northwestern Polytechnical University) | Yang, Libin (School of Automation, Northwestern Polytechnical University)

Generative Adversarial Network Based Heterogeneous Bibliographic Network Representation for Personalized Citation Recommendation

Network representation has been recently exploited for many applications, such as citation recommendation, multi-label classification and link prediction. It learns low-dimensional vector representation for each vertex in networks. Existing network representation methods only focus on incomplete aspects of vertex information (i.e., vertex content, network structure or partial integration), moreover they are commonly designed for homogeneous information networks where all the vertices of a network are of the same type. In this paper, we propose a deep network representation model that integrates network structure and the vertex content information into a unified framework by exploiting generative adversarial network, and represents different types of vertices in the heterogeneous network in a continuous and common vector space. Based on the proposed model, we can obtain heterogeneous bibliographic network representation for efficient citation recommendation. The proposed model also makes personalized citation recommendation possible, which is a new issue that a few papers addressed in the past. When evaluated on the AAN and DBLP datasets, the performance of the proposed heterogeneous bibliographic network based citation recommendation approach is comparable with that of the other network representation based citation recommendation approaches. The results also demonstrate that the personalized citation recommendation approach is more effective than the non-personalized citation recommendation approach.

artificial intelligence, machine learning, representation, (17 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States > California (0.28)

Industry:

Health & Medicine (0.48)
Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Dictionary Learning in Optimal Metric Space

Yan, Jiexi (Xidian University) | Deng, Cheng (Xidian University) | Liu, Xianglong (Beihang University)

Dictionary learning has been widely used in machine learning field to address many real-world applications, such as classification and denoising. In recent years, many new dictionary learning methods have been proposed. Most of them are designed to solve unsupervised problem without any prior information or supervised problem with the label information. But in real world, as usual, we can only obtain limited side information as prior information rather than label information. The existing methods don’t take into account the side information, let alone learning a good dictionary through using the side information. To tackle it, we propose a new unified unsupervised model which naturally integrates metric learning to enhance dictionary learning model with fully utilizing the side information. The proposed method updates metric space and dictionary adaptively and alternatively, which ensures learning optimal metric space and dictionary simultaneously. Besides, our method can also deal well with highdimensional data. Extensive experiments show the efficiency of our proposed method, and a better performance can be derived in real-world image clustering applications.

artificial intelligence, learning, machine learning, (15 more...)

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.85)