Semantic Networks
Functional Task Tree Generation from a Knowledge Graph to Solve Unseen Problems
Sakib, Md. Sadman, Paulius, David, Sun, Yu
A major component for developing intelligent and autonomous robots is a suitable knowledge representation, from which a robot can acquire knowledge about its actions or world. However, unlike humans, robots cannot creatively adapt to novel scenarios, as their knowledge and environment are rigidly defined. To address the problem of producing novel and flexible task plans called task trees, we explore how we can derive plans with concepts not originally in the robot's knowledge base. Existing knowledge in the form of a knowledge graph is used as a base of reference to create task trees that are modified with new object or state combinations. To demonstrate the flexibility of our method, we randomly selected recipes from the Recipe1M+ dataset and generated their task trees. The task trees were then thoroughly checked with a visualization tool that portrays how each ingredient changes with each action to produce the desired meal. Our results indicate that the proposed method can produce task plans with high accuracy even for never-before-seen ingredient combinations.
Survey on English Entity Linking on Wikidata
Möller, Cedric, Lehmann, Jens, Usbeck, Ricardo
Wikidata is a frequently updated, community-driven, and multilingual knowledge graph. Hence, Wikidata is an attractive basis for Entity Linking, as is evident from the recent increase in published papers. This survey focuses on four subjects: (1) Which Wikidata Entity Linking datasets exist, how widely used are they and how are they constructed? (2) Do the characteristics of Wikidata matter for the design of Entity Linking datasets and if so, how? (3) How do current Entity Linking approaches exploit the specific characteristics of Wikidata? (4) Which Wikidata characteristics are unexploited by existing Entity Linking approaches? This survey reveals that current Wikidata-specific Entity Linking datasets do not differ in their annotation scheme from schemes for other knowledge graphs like DBpedia. Thus, the potential for multilingual and time-dependent datasets, naturally suited for Wikidata, remains untapped. Furthermore, we show that most Entity Linking approaches use Wikidata in the same way as any other knowledge graph, missing the chance to leverage Wikidata-specific characteristics to increase quality. Almost all approaches employ specific properties like labels and sometimes descriptions but ignore characteristics such as the hyper-relational structure. Hence, there is still room for improvement, for example, by including hyper-relational graph embeddings or type information. Many approaches also include information from Wikipedia, which is easily combinable with Wikidata and provides valuable textual information, which Wikidata lacks.
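To make the "hyper-relational structure" mentioned above concrete, here is a minimal Python sketch of a statement that carries qualifiers in addition to its main triple. The class name `Statement`, its fields, and the entities "Person A", "University X" and "University Y" are hypothetical placeholders chosen for illustration; they are not taken from the survey or from Wikidata's actual data.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """A hyper-relational statement: a main triple plus qualifier pairs."""
    subject: str
    predicate: str
    obj: str
    # Qualifiers add context to the main triple (hypothetical keys and values).
    qualifiers: dict = field(default_factory=dict)

    def to_triple(self):
        """Flatten to a plain (subject, predicate, object) triple, dropping qualifiers."""
        return (self.subject, self.predicate, self.obj)


# Toy data: two statements that differ only in object and qualifiers.
statements = [
    Statement("Person A", "educated at", "University X",
              {"academic degree": "bachelor", "end time": "2001"}),
    Statement("Person A", "educated at", "University Y",
              {"academic degree": "doctorate", "end time": "2006"}),
]

# A triple-only view keeps the main facts but loses which degree was earned where.
triples = [s.to_triple() for s in statements]

for s in statements:
    print(s.to_triple(), s.qualifiers)
```

An approach that only consumes `triples` sees two "educated at" facts and nothing else. The qualifier dictionaries hold the extra context that, according to the survey, most Entity Linking approaches ignore.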
EngineKGI: Closed-Loop Knowledge Graph Inference
Niu, Guanglin, Li, Bo, Zhang, Yongfei, Pu, Shiliang
Knowledge Graph (KG) inference is a vital technique to address the natural incompleteness of KGs. The existing KG inference approaches can be classified into rule learning-based and KG embedding-based models. However, these approaches cannot simultaneously balance accuracy, generalization, interpretability, and efficiency. Besides, these models always rely on pure triples and neglect additional information. Therefore, both KG embedding (KGE) and rule learning KG inference approaches face challenges due to the sparse entities and the limited semantics. Based on these observations, we propose a novel and effective closed-loop KG inference framework, EngineKGI, which operates like an engine. EngineKGI combines KGE and rule learning to complement each other in a closed-loop pattern while taking advantage of semantics in paths and concepts. The KGE module exploits paths to enhance the semantic association between entities and introduces rules for interpretability. A novel rule pruning mechanism is proposed in the rule learning module by leveraging paths as initial candidate rules and employing KG embeddings together with concepts for extracting more high-quality rules. Experimental results on four real-world datasets show that our model outperforms other baselines on link prediction tasks, demonstrating the effectiveness and superiority of our model on KG inference in a joint logic and data-driven fashion with a closed-loop mechanism.
MDistMult: A Multiple Scoring Functions Model for Link Prediction on Antiviral Drugs Knowledge Graph
Wang, Weichuan, Xie, Zhiwen, Liu, Jin, Duan, Yucong, Huang, Bo, Zhang, Junsheng
Knowledge graphs (KGs) on COVID-19 have been constructed to accelerate COVID-19 research. However, KGs are always incomplete, especially the newly constructed COVID-19 KGs. The link prediction task aims to predict missing entities for (e, r, t) or (h, r, e), where h and t are known entities, e is an entity that needs to be predicted, and r is a relation. This task also has the potential to address the incompleteness of COVID-19-related KGs. Although various knowledge graph embedding (KGE) approaches have been proposed for the link prediction task, these existing methods suffer from the limitation of using a single scoring function, which fails to capture the rich features of COVID-19 KGs. In this work, we propose the MDistMult model, which leverages multiple scoring functions to extract more features from existing triples. We conduct experiments on the CCKS2020 COVID-19 Antiviral Drugs Knowledge Graph (CADKG). The experimental results demonstrate that MDistMult achieves state-of-the-art performance in the link prediction task on the CADKG dataset.
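As a rough illustration of the link prediction setup described above, the following Python sketch scores candidate tails for a query (h, r, e) with the standard single DistMult scoring function, i.e. the sum over dimensions of h_i * r_i * t_i. This is plain DistMult, not the MDistMult model from the abstract, whose multiple scoring functions are not specified here. The entity names (`drug_a`, `drug_b`, `virus_x`), the relation name `inhibits`, and all embedding values are hand-picked toy assumptions, not data from CADKG.

```python
def distmult_score(h, r, t):
    """Standard DistMult score: sum over dimensions of h_i * r_i * t_i."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


def rank_tails(head, relation, entity_emb, relation_emb):
    """Rank every entity as a candidate tail e for the query (head, relation, e)."""
    h = entity_emb[head]
    r = relation_emb[relation]
    scores = {name: distmult_score(h, r, vec) for name, vec in entity_emb.items()}
    # Highest score first: the top entry is the predicted missing entity.
    return sorted(scores, key=scores.get, reverse=True)


# Toy, hand-picked embeddings (hypothetical; real models learn these from triples).
entity_emb = {
    "drug_a": [1.0, 0.0, 1.0],
    "virus_x": [2.0, 0.5, 1.0],
    "drug_b": [0.0, 1.0, 0.0],
}
relation_emb = {"inhibits": [1.0, 0.0, 1.0]}

ranking = rank_tails("drug_a", "inhibits", entity_emb, relation_emb)
print(ranking)
```

Because the score is a symmetric product, swapping head and tail gives the same value. That is one known limitation of relying on a single DistMult scoring function.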
Improving Machine Learning: How Knowledge Graphs Bring Deeper Meaning to Data
Enterprise machine learning deployments are limited by two consequences of outdated data management practices widely used today. The first is the protracted time-to-insight that stems from antiquated data replication approaches. The second is the lack of unified, contextualized data that spans the organization horizontally. Excessive data replication and the resulting "second-order effects" are creating enormous inefficiencies and waste for data scientists in most organizations. According to IDC, over 60 zettabytes of data were produced last year, and this is forecast to increase at a CAGR of 23 percent until 2025.
Construct A Biomedical Knowledge Graph With NLP
I have already demonstrated how to create a knowledge graph out of a Wikipedia page. However, since the post got a lot of attention, I've decided to explore other domains where using NLP techniques to construct a knowledge graph makes sense. In my opinion, the biomedical field is a prime example where representing the data as a graph makes sense as you are often analyzing interactions and relations between genes, diseases, drugs, proteins, and more. In the above visualization, we have ascorbic acid, also known as vitamin C, and some of its relations to other concepts. For example, it shows that vitamin C could be used to treat chronic gastritis.
Neuro-Symbolic AI: The Peak of Artificial Intelligence
Neuro-Symbolic AI, which is alternatively called composite AI, is a relatively new term for a well-established concept with enormous significance for almost any enterprise application of Artificial Intelligence. By combining AI's statistical foundation (exemplified by machine learning) with its knowledge foundation (exemplified by knowledge graphs and rules), organizations get the most effective cognitive analytics results with the least amount of headaches--and cost. Pairing these two historical pillars of AI is essential to maximizing investments in these technologies and in data themselves. By itself, rules-based symbolic reasoning doesn't improve over time. Together, these AI approaches create total machine intelligence with logic-based systems that get better with each application.
Building a Knowledge Graph for Job Search using BERT Transformer
While the natural language processing (NLP) field has been growing at an exponential rate for the last two years -- thanks to the development of transfer-based models -- their applications have been limited in scope for the job search field. LinkedIn, the leading company in job search and recruitment, is a good example. While I hold a Ph.D. in Material Science and a Master's in Physics, I am receiving job recommendations such as Technical Program Manager at MongoDB and a Go Developer position at Toptal, which are both web development companies that are not relevant to my background. This feeling of irrelevancy is shared by many users and is a cause of big frustration. Job seekers should have access to the best tools to help them find the perfect match to their profile without wasting time on irrelevant recommendations and manual searches...
Knowledge Graphs @ EMNLP 2021
If you are an experienced reader of such digests (or previous posts), then you know pretty well the abundance of KG-augmented LMs published at every conference and uploaded to arXiv weekly. If you feel lost -- I can assure you that you're not the only one. This year, we finally have a sound framework and taxonomy of various KG LM approaches! The authors define 3 big families: (1) no KG supervision, probing knowledge encoded in LM params with cloze-style prompts; (2) KG supervision with entities and IDs; (3) KG supervision with relation templates and surface forms. Each family has a few branches. For instance, let's have a look at 4 entity-aware models illustrated below.
Auto-Encoding Knowledge Graph for Unsupervised Medical Report Generation
Liu, Fenglin, You, Chenyu, Wu, Xian, Ge, Shen, Wang, Sheng, Sun, Xu
Medical report generation, which aims to automatically generate a long and coherent report of a given medical image, has been receiving growing research interest. Existing approaches mainly adopt a supervised manner and heavily rely on coupled image-report pairs. However, in the medical domain, building a large-scale image-report paired dataset is both time-consuming and expensive. To relax the dependency on paired data, we propose an unsupervised model, Knowledge Graph Auto-Encoder (KGAE), which accepts independent sets of images and reports in training. KGAE consists of a pre-constructed knowledge graph, a knowledge-driven encoder and a knowledge-driven decoder. The knowledge graph works as the shared latent space to bridge the visual and textual domains; the knowledge-driven encoder projects medical images and reports to the corresponding coordinates in this latent space, and the knowledge-driven decoder generates a medical report given a coordinate in this space. Since the knowledge-driven encoder and decoder can be trained with independent sets of images and reports, KGAE is unsupervised. The experiments show that the unsupervised KGAE generates desirable medical reports without using any image-report training pairs. Moreover, KGAE can also work in both semi-supervised and supervised settings, and accept paired images and reports in training. By further fine-tuning with image-report pairs, KGAE consistently outperforms the current state-of-the-art models on two datasets.