Semantic Networks
Google starts displaying contextual info in image searches
The next time you search for and tap on an image on Google, you may see some helpful information related to what's on your screen. The company is now more deeply integrating its Knowledge Graph with pictures that it finds online. Say you're paging through photos of famous buildings: you'll see a new element of the interface that highlights people, places or things related to the current picture. You can then tap on these to find out more information about them. As usual, you'll also see prompts for related searches. If you've ever searched for something and seen a panel to the side of the main interface that displays some facts related to your query, then you've seen the Knowledge Graph in action.
Coronavirus Knowledge Graph: A Case Study
Chen, Chongyan, Ebeid, Islam Akef, Bu, Yi, Ding, Ying
The emergence of the novel COVID-19 pandemic has had a significant impact on global healthcare and the economy over the past few months. The virus's rapid spread has led to a proliferation in biomedical research addressing the pandemic and its related topics. One of the essential Knowledge Discovery tools that could help the biomedical research community understand and eventually find a cure for COVID-19 is the Knowledge Graph. The CORD-19 dataset is a collection of publicly available full-text research articles that have been recently published on COVID-19 and coronavirus topics. Here, we use several Machine Learning, Deep Learning, and Knowledge Graph construction and mining techniques to formalize and extract insights from the PubMed dataset and the CORD-19 dataset to identify COVID-19 related experts and bio-entities. In addition, we suggest possible techniques to predict related diseases, drug candidates, genes, gene mutations, and related compounds as part of a systematic effort to apply Knowledge Discovery methods to help biomedical researchers tackle the pandemic.
COVID-KG uses AI to scan thousands of studies to answer doctors' coronavirus questions
The number of studies about COVID-19 has risen exponentially from the start of the pandemic, from around 20,000 in early March to over 30,000 as of late June. In an effort to help clinicians digest the vast amount of biomedical knowledge in the literature, researchers affiliated with Columbia, Brandeis, DARPA, UCLA, and UIUC developed a framework -- COVID-KG -- that draws on papers to answer natural language questions about drug repurposing and more. The sheer volume of COVID-19 research makes it difficult to sort the wheat from the chaff. Some false information has been promoted on social media and in publication venues like journals. And many results about the virus from different labs and sources are redundant, complementary, or would appear to conflict.
TransINT: Embedding Implication Rules in Knowledge Graphs with Isomorphic Intersections of Linear Subspaces
Min, So Yeon, Raghavan, Preethi, Szolovits, Peter
Knowledge Graphs (KG), composed of entities and relations, provide a structured representation of knowledge. For easy access to statistical approaches on relational data, multiple methods to embed a KG into $f(KG) \in \mathbb{R}^d$ have been introduced. We propose TransINT, a novel and interpretable KG embedding method that isomorphically preserves the implication ordering among relations in the embedding space. Given implication rules, TransINT maps sets of entities (tied by a relation) to continuous sets of vectors that are inclusion-ordered isomorphically to relation implications. With a novel parameter sharing scheme, TransINT enables automatic training on missing but implied facts without rule grounding. On a benchmark dataset, we outperform the best existing rule integration embedding methods by significant margins in link prediction and triple classification. The angles between the continuous sets embedded by TransINT provide an interpretable way to mine semantic relatedness and implication rules among relations.
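The core idea in this abstract, that a relation corresponds to a set of vectors and that an implication between relations corresponds to set inclusion, can be illustrated with a deliberately simplified toy. The sketch below is not the TransINT parameterization: it assumes, purely for illustration, that each relation fixes a few coordinates of the difference vector (tail - head), so that a relation with more fixed coordinates describes a smaller set. The names `Constraint`, `satisfies`, and `implies`, and the example relations, are hypothetical.

```python
from typing import Dict, Sequence

# Assumption: a relation is a dict mapping a coordinate index to the value
# that (tail - head) must take at that coordinate. Unlisted coordinates are free.
Constraint = Dict[int, float]


def satisfies(head: Sequence[float], tail: Sequence[float],
              relation: Constraint, tol: float = 1e-9) -> bool:
    """True if the difference vector (tail - head) lies in the relation's set."""
    return all(abs((tail[i] - head[i]) - value) <= tol
               for i, value in relation.items())


def implies(stronger: Constraint, weaker: Constraint) -> bool:
    """True if every vector allowed by `stronger` is also allowed by `weaker`.

    In this toy setting that holds exactly when `weaker` only repeats
    constraints that `stronger` already imposes.
    """
    return all(i in stronger and stronger[i] == value
               for i, value in weaker.items())


# Hypothetical relations: is_father_of should imply is_parent_of.
is_parent_of: Constraint = {0: 1.0}
is_father_of: Constraint = {0: 1.0, 1: 0.5}

head = [0.0, 0.0, 0.0]
tail = [1.0, 0.5, 7.0]

print(implies(is_father_of, is_parent_of))    # set inclusion mirrors the rule
print(satisfies(head, tail, is_father_of))
print(satisfies(head, tail, is_parent_of))
```

Because the set for `is_father_of` is contained in the set for `is_parent_of`, any pair embedded as a father-child pair automatically satisfies the parent relation, which is the kind of "implied fact without rule grounding" behaviour the abstract describes.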
Mobile Link Prediction: Automated Creation and Crowd-sourced Validation of Knowledge Graphs
Ballandies, Mark Christopher, Pournaras, Evangelos
Building trustworthy knowledge graphs for cyber-physical social systems (CPSS) is a challenge. In particular, current approaches relying on human experts have limited scalability, while automated approaches are often not accountable to users, resulting in knowledge graphs of questionable quality. This paper introduces a novel pervasive knowledge graph builder that brings together automation, experts' and crowd-sourced citizens' knowledge. The knowledge graph grows via automated link predictions using genetic programming that are validated by humans for improving transparency and calibrating accuracy. The knowledge graph builder is designed for pervasive devices such as smartphones and preserves privacy by localizing all computations. The accuracy, practicality, and usability of the knowledge graph builder are evaluated in a real-world social experiment that involves a smartphone implementation and a Smart City application scenario. The proposed knowledge graph building methodology outperforms the baseline method in terms of accuracy while demonstrating its efficient calculations on smartphones and the feasibility of the pervasive human supervision process in terms of high interaction throughput. These findings promise new opportunities to crowd-source and operate pervasive reasoning systems for cyber-physical social systems in Smart Cities.
Building Rule Hierarchies for Efficient Logical Rule Learning from Knowledge Graphs
Gu, Yulong, Guan, Yu, Missier, Paolo
Many systems have been developed in recent years to mine logical rules from large-scale Knowledge Graphs (KGs), on the grounds that representing regularities as rules enables both the interpretable inference of new facts and the explanation of known facts. Among these systems, the walk-based methods that generate the instantiated rules containing constants by abstracting sampled paths in KGs demonstrate strong predictive performance and expressivity. However, due to the large volume of possible rules, these systems do not scale well, and computational resources are often wasted on generating and evaluating unpromising rules. In this work, we address such scalability issues by proposing new methods for pruning unpromising rules using rule hierarchies. The approach consists of two phases. Firstly, since rule hierarchies are not readily available in walk-based methods, we have built a Rule Hierarchy Framework (RHF), which leverages a collection of subsumption frameworks to build a proper rule hierarchy from a set of learned rules. Secondly, we adapt RHF to an existing rule learner where we design and implement two methods for Hierarchical Pruning (HPMs), which utilize the generated hierarchies to remove irrelevant and redundant rules. Through experiments over four public benchmark datasets, we show that the application of HPMs is effective in removing unpromising rules, which leads to significant reductions in the runtime as well as in the number of learned rules, without compromising the predictive performance.
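The two phases described above (build a hierarchy from learned rules, then prune with it) can be sketched in a few lines. The abstract does not specify the subsumption test or the pruning criteria, so both are assumptions here: a rule is taken to subsume another when it has the same head and a strict subset of the body atoms, and a specialised rule is treated as redundant when it is no more confident than a more general rule above it. `Rule`, `subsumes`, `build_hierarchy`, and `prune_redundant` are hypothetical names, not the RHF or HPM implementations.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    head: str                 # predicted atom, e.g. "livesIn(X,Y)"
    body: FrozenSet[str]      # conjunction of body atoms
    confidence: float         # quality score assigned by some rule learner


def subsumes(general: Rule, specific: Rule) -> bool:
    """Assumed subsumption test: same head, strictly smaller body."""
    return general.head == specific.head and general.body < specific.body


def build_hierarchy(rules: List[Rule]) -> Dict[int, List[int]]:
    """Map each rule index to the indices of the rules it subsumes."""
    return {i: [j for j, spec in enumerate(rules) if subsumes(gen, spec)]
            for i, gen in enumerate(rules)}


def prune_redundant(rules: List[Rule]) -> List[Rule]:
    """Drop specialised rules that do not beat a more general ancestor."""
    hierarchy = build_hierarchy(rules)
    redundant = set()
    for parent, children in hierarchy.items():
        for child in children:
            if rules[child].confidence <= rules[parent].confidence:
                redundant.add(child)
    return [rule for k, rule in enumerate(rules) if k not in redundant]


# Hypothetical learned rules sharing one head.
general = Rule("livesIn(X,Y)", frozenset({"worksIn(X,Y)"}), 0.60)
weaker = Rule("livesIn(X,Y)", frozenset({"worksIn(X,Y)", "bornIn(X,Y)"}), 0.55)
better = Rule("livesIn(X,Y)", frozenset({"worksIn(X,Y)", "marriedTo(X,Z)"}), 0.80)

kept = prune_redundant([general, weaker, better])
print([sorted(rule.body) for rule in kept])
```

In this toy, the rule with the extra `bornIn` atom is dropped because the added condition does not raise confidence, while the `marriedTo` specialisation survives.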
Multi-Partition Embedding Interaction with Block Term Format for Knowledge Graph Completion
Tran, Hung Nghiep, Takasu, Atsuhiro
Knowledge graph completion is an important task that aims to predict the missing relational link between entities. Knowledge graph embedding methods perform this task by representing entities and relations as embedding vectors and modeling their interactions to compute the matching score of each triple. Previous work has usually treated each embedding as a whole and has modeled the interactions between these whole embeddings, potentially making the model excessively expensive or requiring specially designed interaction mechanisms. In this work, we propose the multi-partition embedding interaction (MEI) model with block term format to systematically address this problem. MEI divides each embedding into a multi-partition vector to efficiently restrict the interactions. Each local interaction is modeled with the Tucker tensor format and the full interaction is modeled with the block term tensor format, enabling MEI to control the trade-off between expressiveness and computational cost, learn the interaction mechanisms from data automatically, and achieve state-of-the-art performance on the link prediction task. In addition, we theoretically study the parameter efficiency problem and derive a simple empirically verified criterion for optimal parameter trade-off. We also apply the framework of MEI to provide a new generalized explanation for several specially designed interaction mechanisms in previous models.
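As a rough illustration of the scoring scheme this abstract describes, the sketch below splits each embedding into K equal partitions, scores each partition with a three-way (Tucker-style) contraction against a core tensor, and sums the local scores. It is a minimal pure-Python sketch under stated assumptions: one core tensor per partition, equal partition sizes, and the names `partition`, `tucker_local_score`, and `mei_score` are all hypothetical rather than taken from the authors' code.

```python
from typing import List, Sequence

Vector = Sequence[float]
# Assumption: core[i][j][k] weights the product head[i] * tail[j] * relation[k].
Core = Sequence[Sequence[Sequence[float]]]


def partition(vec: Vector, k: int) -> List[Vector]:
    """Split an embedding into k contiguous partitions of equal size."""
    if k <= 0 or len(vec) % k != 0:
        raise ValueError("embedding length must be a positive multiple of k")
    size = len(vec) // k
    return [vec[p * size:(p + 1) * size] for p in range(k)]


def tucker_local_score(core: Core, h: Vector, t: Vector, r: Vector) -> float:
    """Three-way contraction of a core tensor with one partition of h, t, r."""
    return sum(core[i][j][m] * h[i] * t[j] * r[m]
               for i in range(len(h))
               for j in range(len(t))
               for m in range(len(r)))


def mei_score(cores: Sequence[Core], head: Vector, tail: Vector,
              relation: Vector, k: int) -> float:
    """Sum of local interactions; partitions never interact with each other."""
    if len(cores) != k:
        raise ValueError("expected one core tensor per partition")
    parts = zip(cores, partition(head, k), partition(tail, k),
                partition(relation, k))
    return sum(tucker_local_score(c, h, t, r) for c, h, t, r in parts)


# Toy example: two partitions of size 1, so each core is a 1x1x1 tensor.
cores = [[[[2.0]]], [[[3.0]]]]
score = mei_score(cores, head=[1.0, 2.0], tail=[3.0, 4.0],
                  relation=[5.0, 6.0], k=2)
print(score)  # 2*1*3*5 + 3*2*4*6 = 174.0
```

With embedding size d per side and K partitions, each core holds (d/K)^3 weights, so the K cores together hold d^3/K^2 weights instead of the d^3 of a single full core. That is the trade-off between expressiveness and cost the abstract refers to.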
Adversarial Learning for Debiasing Knowledge Graph Embeddings
Arduini, Mario, Noci, Lorenzo, Pirovano, Federico, Zhang, Ce, Shrestha, Yash Raj, Paudel, Bibek
Knowledge Graphs (KG) are gaining increasing attention in both academia and industry. Despite their diverse benefits, recent research has identified social and cultural biases embedded in the representations learned from KGs. Such biases can have detrimental consequences for different population and minority groups as applications of KGs begin to intersect and interact with social spheres. This paper aims at identifying and mitigating such biases in Knowledge Graph (KG) embeddings. As a first step, we explore popularity bias -- the relationship between node popularity and link prediction accuracy. In the case of node2vec graph embeddings, we find that the prediction accuracy of the embedding is negatively correlated with the degree of the node. However, in the case of knowledge-graph embeddings (KGE), we observe the opposite trend. As a second step, we explore gender bias in KGE, and a careful examination of popular KGE algorithms suggests that sensitive attributes like the gender of a person can be predicted from the embedding. This implies that such biases in popular KGs are captured by the structural properties of the embedding. As a preliminary solution to debiasing KGs, we introduce a novel framework to filter out the sensitive attribute information from the KG embeddings, which we call FAN (Filtering Adversarial Network). We also suggest the applicability of FAN for debiasing other network embeddings, which could be explored in future work.
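The popularity-bias analysis mentioned above amounts to relating each node's degree to how accurately links involving it are predicted. A minimal sketch of such a measurement follows. The choice of Pearson correlation, the toy triples, and the per-node accuracy values are all assumptions made for illustration, and `node_degrees`, `pearson`, and `popularity_bias` are hypothetical helper names, not the authors' code.

```python
from collections import Counter
from math import sqrt
from typing import Dict, Iterable, Sequence, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def node_degrees(triples: Iterable[Triple]) -> Counter:
    """Count how often each entity appears as a head or a tail."""
    degrees: Counter = Counter()
    for head, _relation, tail in triples:
        degrees[head] += 1
        degrees[tail] += 1
    return degrees


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


def popularity_bias(triples: Iterable[Triple],
                    accuracy_by_node: Dict[str, float]) -> float:
    """Correlation between node degree and per-node prediction accuracy."""
    degrees = node_degrees(triples)
    nodes = sorted(accuracy_by_node)
    return pearson([degrees[n] for n in nodes],
                   [accuracy_by_node[n] for n in nodes])


# Made-up data: accuracy falls as degree rises, so the correlation is negative.
toy_triples = [("a", "r", "b"), ("a", "r", "c"), ("a", "r", "d"), ("b", "r", "c")]
toy_accuracy = {"a": 0.3, "b": 0.5, "c": 0.5, "d": 0.7}
print(popularity_bias(toy_triples, toy_accuracy))
```

A negative value corresponds to the trend the abstract reports for node2vec, and a positive value to the trend it reports for KGE methods.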
Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings
Chang, David, Balazevic, Ivana, Allen, Carl, Chawla, Daniel, Brandt, Cynthia, Taylor, Richard Andrew
Much of biomedical and healthcare data is encoded in discrete, symbolic form such as text and medical codes. There is a wealth of expert-curated biomedical domain knowledge stored in knowledge bases and ontologies, but the lack of reliable methods for learning knowledge representation has limited their usefulness in machine learning applications. While text-based representation learning has significantly improved in recent years through advances in natural language processing, attempts to learn biomedical concept embeddings so far have been lacking. A recent family of models called knowledge graph embeddings has shown promising results on general domain knowledge graphs, and we explore their capabilities in the biomedical domain. We train several state-of-the-art knowledge graph embedding models on the SNOMED-CT knowledge graph, provide a benchmark with comparison to existing methods and in-depth discussion on best practices, and make a case for the importance of leveraging the multi-relational nature of knowledge graphs for learning biomedical knowledge representation. The embeddings, code, and materials will be made available to the community.
AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types
Dong, Xin Luna, He, Xiang, Kan, Andrey, Li, Xian, Liang, Yan, Ma, Jun, Xu, Yifan Ethan, Zhang, Chenwei, Zhao, Tong, Saldana, Gabriel Blanco, Deshpande, Saurabh, Manduca, Alexandre Michetti, Ren, Jay, Singh, Surender Pal, Xiao, Fan, Chang, Haw-Shiuan, Karamanolakis, Giannis, Mao, Yuning, Wang, Yaqing, Faloutsos, Christos, McCallum, Andrew, Han, Jiawei
Can one build a knowledge graph (KG) for all products in the world? Knowledge graphs have firmly established themselves as valuable sources of information for search and question answering, and it is natural to wonder if a KG can contain information about products offered at online retail sites. There have been several successful examples of generic KGs, but organizing information about products poses many additional challenges, including sparsity and noise of structured data for products, complexity of the domain with millions of product types and thousands of attributes, heterogeneity across a large number of categories, as well as a large and constantly growing number of products. We describe AutoKnow, our automatic (self-driving) system that addresses these challenges. The system includes a suite of novel techniques for taxonomy construction, product property identification, knowledge extraction, anomaly detection, and synonym discovery. AutoKnow is (a) automatic, requiring little human intervention, (b) multi-scalable, scalable in multiple dimensions (many domains, many products, and many attributes), and (c) integrative, exploiting rich customer behavior logs. AutoKnow has been operational in collecting product knowledge for over 11K product types.