Jia, Yantao
OntoZSL: Ontology-enhanced Zero-shot Learning
Geng, Yuxia, Chen, Jiaoyan, Chen, Zhuo, Pan, Jeff Z., Ye, Zhiquan, Yuan, Zonggang, Jia, Yantao, Chen, Huajun
Zero-shot Learning (ZSL), which aims to make predictions for classes that have never appeared in the training data, has attracted considerable research interest. The key to implementing ZSL is to leverage prior knowledge of classes that builds semantic relationships between classes and enables the transfer of learned models (e.g., features) from training classes (i.e., seen classes) to unseen classes. However, the priors adopted by existing methods are relatively limited, with incomplete semantics. In this paper, we explore richer and more competitive prior knowledge to model the inter-class relationship for ZSL via ontology-based knowledge representation and semantic embedding. Meanwhile, to address the data imbalance between seen and unseen classes, we develop a generative ZSL framework based on Generative Adversarial Networks (GANs). Our main findings include: (i) an ontology-enhanced ZSL framework that can be applied to different domains, such as image classification (IMGC) and knowledge graph completion (KGC); and (ii) a comprehensive evaluation on multiple zero-shot datasets from different domains, where our method often achieves better performance than state-of-the-art models. In particular, on four representative ZSL baselines for IMGC, the ontology-based class semantics outperform previous priors (e.g., the word embeddings of classes) by an average of 12.4 accuracy points in standard ZSL across two example datasets (see Figure 4).
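To make the generative side of such a framework concrete, below is a minimal sketch of conditional feature synthesis for ZSL: a GAN generator produces visual features conditioned on a semantic class embedding (e.g., one derived from an ontology encoder), so features for unseen classes can be synthesized from their semantics alone. All dimensions, layer sizes, and names here are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of GAN-based feature generation conditioned on
# class semantics; sizes and layers are assumptions, not OntoZSL's config.
import torch
import torch.nn as nn

SEM_DIM, NOISE_DIM, FEAT_DIM = 128, 64, 2048  # assumed sizes

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(SEM_DIM + NOISE_DIM, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, FEAT_DIM), nn.ReLU(),
        )
    def forward(self, sem, noise):
        # Condition feature synthesis on the class semantic embedding.
        return self.net(torch.cat([sem, noise], dim=-1))

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM + SEM_DIM, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, 1),
        )
    def forward(self, feat, sem):
        return self.net(torch.cat([feat, sem], dim=-1))

G, D = Generator(), Discriminator()
sem = torch.randn(8, SEM_DIM)      # class embeddings (e.g., ontology-derived)
noise = torch.randn(8, NOISE_DIM)
fake_feat = G(sem, noise)          # synthesized features for (unseen) classes
score = D(fake_feat, sem)          # adversarial score; train with a GAN loss
```

Once trained, features generated for unseen classes can be fed to an ordinary supervised classifier, turning the zero-shot problem into a standard one.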
The Devil is the Classifier: Investigating Long Tail Relation Classification with Decoupling Analysis
Yu, Haiyang, Zhang, Ningyu, Deng, Shumin, Yuan, Zonggang, Jia, Yantao, Chen, Huajun
Long-tailed relation classification is a challenging problem because the head classes may dominate the training phase, leading to deteriorated performance on the tail. Existing solutions usually address this issue via class-balancing strategies, e.g., data re-sampling and loss re-weighting, but all these methods adhere to the schema of entangled learning of the representation and the classifier. In this study, we conduct an in-depth empirical investigation into the long-tailed problem and find that pre-trained models with instance-balanced sampling already capture well-learned representations for all classes; moreover, better long-tailed classification can be achieved at low cost by adjusting only the classifier. Inspired by this observation, we propose a robust classifier with attentive relation routing, which assigns soft weights by automatically aggregating the relations. Extensive experiments on two datasets demonstrate the effectiveness of our proposed approach. Code and datasets are available at https://github.com/zjunlp/deepke.
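A hedged sketch of the decoupling idea follows: the pre-trained encoder is kept frozen and only a lightweight classifier is fitted on top. The attentive routing shown (soft weights over learned relation embeddings) is an illustrative reading of the abstract, not the authors' exact layer.

```python
# Minimal sketch: frozen representations + a trainable routing classifier.
# HID and NUM_REL are assumed sizes, not the paper's configuration.
import torch
import torch.nn as nn

HID, NUM_REL = 768, 40

class AttentiveRoutingClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.rel_emb = nn.Parameter(torch.randn(NUM_REL, HID) * 0.02)
    def forward(self, h):
        # Soft routing weights from similarity to each relation embedding.
        att = torch.softmax(h @ self.rel_emb.t(), dim=-1)
        routed = att @ self.rel_emb            # attention-weighted aggregation
        return (h + routed) @ self.rel_emb.t() # class logits

clf = AttentiveRoutingClassifier()
h = torch.randn(16, HID)   # representations from a frozen pre-trained encoder
logits = clf(h)            # only clf's parameters are updated during training
```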
Path-Based Attention Neural Model for Fine-Grained Entity Typing
Zhang, Denghui (Institute of Computing Technology, Chinese Academy of Sciences) | Li, Manling (Institute of Computing Technology, Chinese Academy of Sciences) | Cai, Pengshan (University of Massachusetts Amherst) | Jia, Yantao (Institute of Computing Technology, Chinese Academy of Sciences) | Wang, Yuanzhuo (Institute of Computing Technology, Chinese Academy of Sciences)
Fine-grained entity typing aims to assign types arranged in a hierarchical structure to entity mentions in free text. It suffers from label noise in training data generated by distant supervision. Although recent studies use many features to prune wrong labels before training, they suffer from error propagation and introduce much complexity. In this paper, we propose an end-to-end typing model, called the path-based attention neural model (PAN), to learn noise-robust typing by leveraging the hierarchical structure of types. Experiments on two datasets demonstrate its effectiveness.
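As a rough illustration of attending over a type hierarchy: each candidate type can be represented by its root-to-leaf path of ancestors, and the mention representation attends over the embeddings of the nodes on that path. The type names, dimensions, and scoring rule below are assumptions for the sketch, not PAN's exact formulation.

```python
# Hypothetical path-based attention over a type hierarchy.
import torch
import torch.nn as nn

DIM = 100
type_paths = {  # leaf type -> ids of the nodes on its root-to-leaf path
    "/person/artist/singer": [0, 1, 2],
    "/person/athlete": [0, 3],
}
node_emb = nn.Embedding(4, DIM)

def score_type(mention_vec, path_ids):
    path = node_emb(torch.tensor(path_ids))         # (path_len, DIM)
    att = torch.softmax(path @ mention_vec, dim=0)  # attention over path nodes
    type_vec = att @ path                           # path-aware type vector
    return mention_vec @ type_vec                   # typing score

mention = torch.randn(DIM)
scores = {t: score_type(mention, p).item() for t, p in type_paths.items()}
```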
Efficient Parallel Translating Embedding For Knowledge Graphs
Zhang, Denghui, Li, Manling, Jia, Yantao, Wang, Yuanzhuo, Cheng, Xueqi
Knowledge graph embedding aims to embed the entities and relations of a knowledge graph into low-dimensional vector spaces. Translating embedding methods regard relations as translations from head entities to tail entities and achieve state-of-the-art results among knowledge graph embedding methods. However, a major limitation of these methods is the time-consuming training process, which may take several days or even weeks for large knowledge graphs, creating great difficulty in practical applications. In this paper, we propose an efficient parallel framework for translating embedding methods, called ParTrans-X, which enables these methods to be parallelized without locks by exploiting the distinctive structures of knowledge graphs. Experiments on two datasets with three typical translating embedding methods, i.e., TransE [3], TransH [17], and a more efficient variant TransE-AdaGrad [10], validate that ParTrans-X can speed up the training process by more than an order of magnitude.
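A lock-free (Hogwild-style) sketch of parallel TransE training in this spirit is shown below: workers update shared embedding buffers without locks, relying on the sparsity of triples to keep write collisions rare. ParTrans-X's own scheduling is not reproduced here; this assumes a fork-based multiprocessing start (e.g., Linux), and all sizes are illustrative.

```python
# Hypothetical lock-free parallel TransE with shared-memory embeddings.
import numpy as np
from multiprocessing import Process, RawArray

N_ENT, N_REL, DIM, LR, MARGIN = 1000, 50, 50, 0.01, 1.0
ent_raw = RawArray('d', N_ENT * DIM)   # shared across forked workers
rel_raw = RawArray('d', N_REL * DIM)

def worker(triples, seed):
    rng = np.random.default_rng(seed)
    # Views over the shared buffers: updates are visible to all workers.
    E = np.frombuffer(ent_raw).reshape(N_ENT, DIM)
    R = np.frombuffer(rel_raw).reshape(N_REL, DIM)
    for h, r, t in triples:
        t_neg = rng.integers(N_ENT)             # corrupt the tail entity
        g_pos = E[h] + R[r] - E[t]
        g_neg = E[h] + R[r] - E[t_neg]
        if MARGIN + np.linalg.norm(g_pos) - np.linalg.norm(g_neg) > 0:
            d_pos = g_pos / (np.linalg.norm(g_pos) + 1e-9)
            d_neg = g_neg / (np.linalg.norm(g_neg) + 1e-9)
            E[h] -= LR * (d_pos - d_neg)        # lock-free writes
            R[r] -= LR * (d_pos - d_neg)
            E[t] += LR * d_pos
            E[t_neg] -= LR * d_neg

if __name__ == "__main__":
    E = np.frombuffer(ent_raw).reshape(N_ENT, DIM)
    R = np.frombuffer(rel_raw).reshape(N_REL, DIM)
    E[:] = np.random.randn(N_ENT, DIM) * 0.1
    R[:] = np.random.randn(N_REL, DIM) * 0.1
    triples = [(np.random.randint(N_ENT), np.random.randint(N_REL),
                np.random.randint(N_ENT)) for _ in range(4000)]
    procs = [Process(target=worker, args=(triples[i::4], i)) for i in range(4)]
    for p in procs: p.start()
    for p in procs: p.join()
```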
Learning Knowledge Representation Across Knowledge Graphs
Cai, Pengshan (Institute of Computing Technology, Chinese Academy of Sciences) | Li, Wei (Institute of Computing Technology, Chinese Academy of Sciences) | Feng, Yansong (Peking University) | Wang, Yuanzhuo (Institute of Computing Technology, Chinese Academy of Sciences) | Jia, Yantao (Institute of Computing Technology, Chinese Academy of Sciences)
Distributed knowledge representation learning (KRL) methods encode both the entities and relations of knowledge graphs (KGs) in a lower-dimensional semantic space; they model relatively dense knowledge graphs well and greatly improve the performance of knowledge graph completion and knowledge reasoning. However, existing KRL methods, including Trans(E, H, R, D and Sparse), hardly obtain comparable performance on sparse KGs, where most entities and relations have very low frequencies. Furthermore, all existing methods target KRL on a single knowledge graph independently, so the embeddings of different KGs are independent of each other. In this paper, we propose a novel cross-knowledge-graph (cross-KG) KRL method that learns embeddings for two different KGs simultaneously. By projecting semantically related entities and relations of the two KGs into a uniform semantic space, our method learns better embeddings for sparse KGs by incorporating information from another, relatively larger and denser KG. The learned embeddings are also helpful for downstream cross-KG or cross-lingual tasks such as ontology alignment. The experimental results show that our method significantly outperforms the corresponding baseline methods on single-KG knowledge graph completion and on cross-KG entity prediction and mapping tasks.
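An illustrative sketch of the cross-KG idea follows: embed two KGs jointly and pull seed-aligned entities together in a shared space via projection matrices, with this alignment term added to each KG's own translation loss. The specific projections and loss are assumptions made for the sketch, not the paper's exact formulation.

```python
# Hypothetical alignment loss tying two KG embedding spaces together.
import torch
import torch.nn as nn

DIM = 100
E1, E2 = nn.Embedding(500, DIM), nn.Embedding(800, DIM)  # KG1 / KG2 entities
P1 = nn.Linear(DIM, DIM, bias=False)  # projection into the shared space
P2 = nn.Linear(DIM, DIM, bias=False)

def alignment_loss(pairs):
    # pairs: (i, j) with entity i in KG1 known to match entity j in KG2
    i = torch.tensor([p[0] for p in pairs])
    j = torch.tensor([p[1] for p in pairs])
    return (P1(E1(i)) - P2(E2(j))).norm(dim=-1).mean()

seed_pairs = [(0, 3), (7, 42)]     # a handful of known alignments
loss = alignment_loss(seed_pairs)  # added to each KG's own translation loss
loss.backward()
```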
Predicting Links and Their Building Time: A Path-Based Approach
Li, Manling (Institute of Computing Technology, Chinese Academy of Sciences) | Jia, Yantao (Institute of Computing Technology, Chinese Academy of Sciences) | Wang, Yuanzhuo (Institute of Computing Technology, Chinese Academy of Sciences) | Zhao, Zeya (Institute of Computing Technology, Chinese Academy of Sciences) | Cheng, Xueqi (Institute of Computing Technology, Chinese Academy of Sciences)
Predicting links and their building time in a knowledge network has been extensively studied in recent years. Most structure-based predictive methods consider the structure and the time information of edges separately and thus fail to characterize the correlation between them. In this paper, we propose a structure called the time-difference-labeled path and a corresponding link prediction method (TDLP). Experiments show that TDLP outperforms state-of-the-art methods.
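A minimal sketch of what a time-difference-labeled path might look like: a path between two nodes is labeled by its edge types together with the time gaps between consecutive edges, so structure and temporal order are encoded jointly. The feature construction below is an assumed simplification, not TDLP's exact definition.

```python
# Hypothetical time-difference labeling of a path; edge = (src, dst, relation, year).
edges = [("a", "b", "advises", 2010), ("b", "c", "coauthor", 2013)]

def td_label(path):
    # Label each edge with its relation and the gap to the previous edge,
    # e.g., (('advises', 0), ('coauthor', 3)) for the path a -> b -> c.
    label, prev_t = [], path[0][3]
    for _, _, rel, t in path:
        label.append((rel, t - prev_t))
        prev_t = t
    return tuple(label)

print(td_label(edges))  # (('advises', 0), ('coauthor', 3))
```

Counts of such labeled paths between a node pair can then serve as features for predicting both whether a link forms and when.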
Locally Adaptive Translation for Knowledge Graph Embedding
Jia, Yantao (Institute of Computing Technology, Chinese Academy of Sciences) | Wang, Yuanzhuo (Institute of Computing Technology, Chinese Academy of Sciences) | Lin, Hailun (Institute of Information Engineering, Chinese Academy of Sciences) | Jin, Xiaolong (Institute of Computing Technology, Chinese Academy of Sciences) | Cheng, Xueqi (Institute of Computing Technology, Chinese Academy of Sciences)
Knowledge graph embedding aims to represent the entities and relations of a large-scale knowledge graph as elements of a continuous vector space. Existing methods, e.g., TransE and TransH, learn embedding representations by defining a global margin-based loss function over the data. However, the optimal loss function is determined experimentally, with its parameters examined over a closed set of candidates. Moreover, embeddings over two knowledge graphs with different entities and relations share the same set of candidate loss functions, ignoring the locality of each graph. This limits the performance of embedding-related applications. In this paper, we propose a locally adaptive translation method for knowledge graph embedding, called TransA, which finds the optimal loss function by adaptively determining its margin over different knowledge graphs. Experiments on two benchmark datasets demonstrate the superiority of the proposed method over state-of-the-art ones.
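A hedged illustration of the adaptive-margin idea: instead of fixing the margin of the hinge loss up front, derive it from the observed score distributions of positive and corrupted triples on the graph at hand. The specific rule below (separating the two score means) is an assumption for the sketch, not TransA's actual derivation.

```python
# Hypothetical data-driven margin for a translation-based hinge loss.
import numpy as np

def transe_score(E, R, triples):
    h, r, t = triples.T
    return np.linalg.norm(E[h] + R[r] - E[t], axis=-1)

def adaptive_margin(pos_scores, neg_scores):
    # Place the margin between typical positive and negative scores.
    return max(0.0, float(neg_scores.mean() - pos_scores.mean()))

rng = np.random.default_rng(0)
E, R = rng.normal(size=(100, 50)), rng.normal(size=(10, 50))
pos = rng.integers(0, [100, 10, 100], size=(256, 3))   # (head, rel, tail)
neg = pos.copy()
neg[:, 2] = rng.integers(100, size=256)                # corrupt the tails
margin = adaptive_margin(transe_score(E, R, pos), transe_score(E, R, neg))
loss = np.maximum(0, margin + transe_score(E, R, pos)
                  - transe_score(E, R, neg)).mean()
```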
Content-Structural Relation Inference in Knowledge Base
Zhao, Zeya (Chinese Academy of Sciences) | Jia, Yantao (Chinese Academy of Sciences) | Wang, Yuanzhuo (Chinese Academy of Sciences)
Relation inference between concepts in a knowledge base has been extensively studied in recent years. Previous methods mostly exploit the relations in the knowledge base without fully utilizing its contents, i.e., the attributes of the concepts. In this paper, we propose a content-structural relation inference method (CSRI) that integrates the content and structural information between concepts for relation inference. Experiments show that CSRI obtains a 15% improvement over state-of-the-art methods.
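At a high level, combining content and structure for relation inference could look like the following sketch: score a candidate relation by mixing an attribute (content) similarity with a structural signal from the graph. The linear mixture and its weight are illustrative assumptions, not CSRI's actual model.

```python
# Hypothetical content + structure scoring for a candidate relation.
def content_sim(attrs_a, attrs_b):
    a, b = set(attrs_a), set(attrs_b)
    return len(a & b) / len(a | b) if a | b else 0.0  # Jaccard over attributes

def infer_score(attrs_a, attrs_b, structural_score, alpha=0.5):
    # alpha balances content similarity against the structural signal.
    return alpha * content_sim(attrs_a, attrs_b) + (1 - alpha) * structural_score

print(infer_score(["city", "capital"], ["city", "port"], structural_score=0.7))
```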
LSDH: A Hashing Approach for Large-Scale Link Prediction in Microblogs
Liu, Dawei (Chinese Academy of Sciences) | Wang, Yuanzhuo (Chinese Academy of Sciences) | Jia, Yantao (Chinese Academy of Sciences) | Li, Jingyuan (Chinese Academy of Sciences) | Yu, Zhihua (Chinese Academy of Sciences)
One challenge of link prediction in online social networks is the large scale of many such networks. The measures used by existing work lack computational efficiency in the large-scale setting. We introduce the notion of social distance in a multi-dimensional form to measure the closeness among a group of people in microblogs, and we propose a fast hashing approach called Locality-Sensitive Social Distance Hashing (LSDH), which works in an unsupervised setup and performs approximate near-neighbor search without high-dimensional distance computation. Experiments on a Twitter dataset yielded preliminary results that testify to the effectiveness of LSDH in predicting the likelihood of future associations between people.
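A hedged sketch of locality-sensitive hashing over multi-dimensional social-distance vectors follows: random-hyperplane signatures bucket similar users together, so candidate pairs are found without all-pairs high-dimensional distance computation. The signature size and the distance features are assumptions; LSDH's own hash family is not reproduced here.

```python
# Hypothetical random-hyperplane LSH for approximate near-neighbor search.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
DIM, BITS = 16, 12                   # social-distance dims, signature bits
planes = rng.normal(size=(BITS, DIM))

def signature(v):
    # One sign bit per hyperplane; nearby vectors tend to share signatures.
    return tuple((planes @ v > 0).astype(int))

users = {f"u{i}": rng.normal(size=DIM) for i in range(1000)}
buckets = defaultdict(list)
for uid, vec in users.items():
    buckets[signature(vec)].append(uid)   # similar users tend to collide

# Candidate future links: pairs sharing a bucket (verified exactly afterwards).
candidates = [b for b in buckets.values() if len(b) > 1]
```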