Semantic Networks


Towards a Path Dependent Account of Category Fluency

arXiv.org Artificial Intelligence

Category fluency is a widely studied cognitive phenomenon, yet two conflicting accounts have been proposed as the underlying retrieval mechanism: an optimal foraging process deliberately searching through memory (Hills et al., 2012) and a random walk sampling from a semantic network (Abbott et al., 2015). Evidence for both accounts has centered on predicting human patch switches, where both existing models of category fluency produce paradoxically identical results. We begin by peeling back the assumptions made by existing models, namely that each named example depends only on the previous example, by (i) adding an additional bias to model the category transition probability directly and (ii) relying on a large language model to predict based on the entire existing sequence. Then, we present evidence towards resolving the disagreement between the two accounts of foraging by reformulating the models as sequence generators. To evaluate, we compare generated category fluency runs to a bank of human-written sequences using a proposed metric based on n-gram overlap. We find that category switch predictors do not necessarily produce human-like sequences. In fact, the additional biases used by the Hills et al. (2012) model are required to improve generation quality, which is further improved by our category modification. Even generating exclusively with an LLM requires an additional global cue to trigger the patch switching behavior during production. Further tests of the search process alone, on top of the semantic network, highlight the importance of deterministic search for replicating human behavior.
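To make the idea of an n-gram overlap metric concrete, here is a minimal sketch. It is not the metric defined in the paper, whose exact form the abstract does not give; the function names `ngrams` and `ngram_overlap`, the BLEU-style clipping of counts, and the toy animal sequences are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(seq, n):
    """Return a Counter of the contiguous n-grams (as tuples) in seq."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def ngram_overlap(generated, human_runs, n=2):
    """Fraction of n-grams in `generated` that also occur in a human run.

    Assumption: counts are clipped against the maximum count of each
    n-gram in any single human run, as in BLEU-style precision.
    """
    gen = ngrams(generated, n)
    total = sum(gen.values())
    if total == 0:
        return 0.0
    ref = Counter()
    for run in human_runs:
        for gram, count in ngrams(run, n).items():
            ref[gram] = max(ref[gram], count)
    matched = sum(min(count, ref[gram]) for gram, count in gen.items())
    return matched / total


# Hypothetical toy data: two human fluency runs and one generated run.
human_runs = [["dog", "cat", "lion", "tiger"], ["cow", "horse", "dog", "cat"]]
generated = ["dog", "cat", "horse", "lion", "tiger"]
score = ngram_overlap(generated, human_runs, n=2)
```

In this toy case two of the four generated bigrams, (dog, cat) and (lion, tiger), appear in the human runs, so the score is 0.5.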


CleanGraph: Human-in-the-loop Knowledge Graph Refinement and Completion

arXiv.org Artificial Intelligence

This paper presents CleanGraph, an interactive web-based tool designed to facilitate the refinement and completion of knowledge graphs. Maintaining the reliability of knowledge graphs, which are grounded in high-quality and error-free facts, is crucial for real-world applications such as question-answering and information retrieval systems. These graphs are often automatically assembled from textual sources by extracting semantic triples via information extraction. However, assuring the quality of these extracted triples, especially when dealing with large or low-quality datasets, can pose a significant challenge and adversely affect the performance of downstream applications. CleanGraph allows users to perform Create, Read, Update, and Delete (CRUD) operations on their graphs, as well as apply models in the form of plugins for graph refinement and completion tasks. These functionalities enable users to enhance the integrity and reliability of their graph data. A demonstration of CleanGraph and its source code can be accessed at https://github.com/nlp-tlp/CleanGraph under the MIT License.


Towards Continual Knowledge Graph Embedding via Incremental Distillation

arXiv.org Artificial Intelligence

Traditional knowledge graph embedding (KGE) methods typically require preserving the entire knowledge graph (KG) with significant training costs when new knowledge emerges. To address this issue, the continual knowledge graph embedding (CKGE) task has been proposed to train the KGE model by learning emerging knowledge efficiently while simultaneously preserving decent old knowledge. However, the explicit graph structure in KGs, which is critical for the above goal, has been heavily ignored by existing CKGE methods. On the one hand, existing methods usually learn new triples in a random order, destroying the inner structure of new KGs. On the other hand, old triples are preserved with equal priority, failing to alleviate catastrophic forgetting effectively. In this paper, we propose a competitive method for CKGE based on incremental distillation (IncDE), which considers the full use of the explicit graph structure in KGs. First, to optimize the learning order, we introduce a hierarchical strategy, ranking new triples for layer-by-layer learning. By employing the inter- and intra-hierarchical orders together, new triples are grouped into layers based on the graph structure features. Secondly, to preserve the old knowledge effectively, we devise a novel incremental distillation mechanism, which facilitates the seamless transfer of entity representations from the previous layer to the next one, promoting old knowledge preservation. Finally, we adopt a two-stage training paradigm to avoid the over-corruption of old knowledge influenced by under-trained new knowledge. Experimental results demonstrate the superiority of IncDE over state-of-the-art baselines. Notably, the incremental distillation mechanism contributes to improvements of 0.2%-6.5% in the mean reciprocal rank (MRR) score.
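For readers unfamiliar with the mean reciprocal rank (MRR) score mentioned above, the following sketch shows the standard definition: the average of 1/rank over all test queries, where rank is the position of the correct entity in the model's ranked candidate list. The function name and the example ranks are illustrative and are not taken from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over queries; ranks are 1-based positions
    of the correct answer in each ranked candidate list."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks of the correct entity for three link-prediction queries.
example_mrr = mean_reciprocal_rank([1, 2, 4])  # (1 + 0.5 + 0.25) / 3
```

An MRR of 1.0 means the correct entity is always ranked first, and values closer to 0 mean it tends to appear far down the list.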


Multi-level Shared Knowledge Guided Learning for Knowledge Graph Completion

arXiv.org Artificial Intelligence

In the task of Knowledge Graph Completion (KGC), the existing datasets and their inherent subtasks carry a wealth of shared knowledge that can be utilized to enhance the representation of knowledge triplets and overall performance. However, no current studies specifically address the shared knowledge within KGC. To bridge this gap, we introduce a multi-level Shared Knowledge Guided learning method (SKG) that operates at both the dataset and task levels. On the dataset level, SKG-KGC broadens the original dataset by identifying shared features within entity sets via text summarization. On the task level, for the three typical KGC subtasks - head entity prediction, relation prediction, and tail entity prediction - we present an innovative multi-task learning architecture with dynamically adjusted loss weights. This approach allows the model to focus on more challenging and underperforming tasks, effectively mitigating the imbalance of knowledge sharing among subtasks. Experimental results demonstrate that SKG-KGC outperforms existing text-based methods significantly on three well-known datasets, with the most notable improvement on WN18RR.


Relations Prediction for Knowledge Graph Completion using Large Language Models

arXiv.org Artificial Intelligence

Knowledge Graphs (KGs) are used to store semantic data in the form of entity nodes and edges. The edges represent the directed relations between the entities. For instance, the fact that James Cameron produced Avatar can be stored in a KG by linking the James Cameron node to the Avatar node with the relation (edge) produced. This knowledge representation structure is called a triple; it consists of a head node, a relation, and a tail node. The direction of the triple matters: swapping the nodes around the edge can invalidate the fact.
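The triple structure and the role of direction can be illustrated with a small sketch. The `Triple` type and the `holds` helper are hypothetical names introduced here, and the one-fact graph simply reuses the example from the text.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A directed fact: head --relation--> tail."""
    head: str
    relation: str
    tail: str


# A tiny knowledge graph stored as a set of triples.
kg = {Triple("James Cameron", "produced", "Avatar")}


def holds(graph, head, relation, tail):
    """Return True if the directed fact is stored in the graph."""
    return Triple(head, relation, tail) in graph


forward = holds(kg, "James Cameron", "produced", "Avatar")
swapped = holds(kg, "Avatar", "produced", "James Cameron")
```

Here `forward` is true and `swapped` is false: the reversed triple is a different fact that the graph does not contain.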


Knowledge Graph Extension by Entity Type Recognition

arXiv.org Artificial Intelligence

Knowledge graphs have emerged as a sophisticated advancement and refinement of semantic networks, and their deployment is one of the critical methodologies in contemporary artificial intelligence. The construction of knowledge graphs is a multifaceted process involving various techniques, where researchers aim to extract the knowledge from existing resources for the construction since building from scratch entails significant labor and time costs. However, due to the pervasive issue of heterogeneity, the description diversity across different knowledge graphs can lead to mismatches between concepts, thereby impacting the efficacy of knowledge extraction. This Ph.D. study focuses on automatic knowledge graph extension, i.e., properly extending the reference knowledge graph by extracting and integrating concepts from one or more candidate knowledge graphs. We propose a novel knowledge graph extension framework based on entity type recognition. The framework aims to achieve high-quality knowledge extraction by aligning the schemas and entities across different knowledge graphs, thereby enhancing the performance of the extension. This paper elucidates three major contributions: (i) we propose an entity type recognition method exploiting machine learning and property-based similarities to enhance knowledge extraction; (ii) we introduce a set of assessment metrics to validate the quality of the extended knowledge graphs; (iii) we develop a platform for knowledge graph acquisition, management, and extension to benefit knowledge engineers practically. Our evaluation comprehensively demonstrated the feasibility and effectiveness of the proposed extension framework and its functionalities through quantitative experiments and case studies.


Subgraph2vec: A random walk-based algorithm for embedding knowledge graphs

arXiv.org Artificial Intelligence

Graphs are an important data representation that occurs naturally in real-world applications \cite{goyal2018graph}. Analyzing graphs therefore gives users better insight in areas such as anomaly detection \cite{ma2021comprehensive}, decision making \cite{fan2023graph}, clustering \cite{tsitsulin2023graph}, and classification \cite{wang2021mixup}. However, most of these methods require substantial computation time and memory. Embedding is one way to reduce these costs. Knowledge graph (KG) embedding is a technique that aims to obtain a vector representation of a KG. It represents the entities and relations of a KG in a low-dimensional space while preserving their semantic meaning. There are different methods for embedding graphs, including random walk-based methods such as node2vec, metapath2vec and regpattern2vec. However, most of these methods bias the walks based on a rigid pattern that is usually hard-coded in the algorithm. In this work, we introduce subgraph2vec for embedding KGs, where walks are run inside a user-defined subgraph. We use this embedding for link prediction and show that our method performs better than previous ones in most cases.
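As a rough illustration of running walks inside a user-defined subgraph, the sketch below performs a uniform random walk that may only step to nodes in an allowed set. This is not the subgraph2vec algorithm itself, whose details the abstract does not give; the function name, the adjacency-dict representation, and the toy graph are assumptions.

```python
import random


def restricted_random_walk(adj, allowed, start, length, rng=None):
    """Uniform random walk that never leaves the `allowed` node set.

    adj: dict mapping each node to a list of neighbor nodes.
    Stops early if the current node has no allowed neighbors.
    """
    rng = rng or random.Random()
    if start not in allowed:
        raise ValueError("start node must be inside the subgraph")
    walk = [start]
    while len(walk) < length:
        candidates = [v for v in adj.get(walk[-1], []) if v in allowed]
        if not candidates:
            break
        walk.append(rng.choice(candidates))
    return walk


# Hypothetical toy graph; node "x" lies outside the user-defined subgraph.
adj = {"a": ["b", "x"], "b": ["a", "c"], "c": ["b"], "x": ["a"]}
allowed = {"a", "b", "c"}
walk = restricted_random_walk(adj, allowed, "a", 10, random.Random(0))
```

Walks produced this way could then be fed to a skip-gram style model to learn embeddings, as in other random walk-based methods.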


Arbitrary Time Information Modeling via Polynomial Approximation for Temporal Knowledge Graph Embedding

arXiv.org Artificial Intelligence

Distinguished from traditional knowledge graphs (KGs), temporal knowledge graphs (TKGs) must explore and reason over temporally evolving facts adequately. However, existing TKG approaches still face two main challenges, i.e., the limited capability to model arbitrary timestamps continuously and the lack of rich inference patterns under temporal constraints. In this paper, we propose an innovative TKGE method (PTBox) via polynomial decomposition-based temporal representation and box embedding-based entity representation to tackle the above-mentioned problems. Specifically, we decompose time information by polynomials and then enhance the model's capability to represent arbitrary timestamps flexibly by incorporating the learnable temporal basis tensor. In addition, we model every entity as a hyperrectangle box and define each relation as a transformation on the head and tail entity boxes. The entity boxes can capture complex geometric structures and learn robust representations, improving the model's inductive capability for rich inference patterns. Theoretically, our PTBox can encode arbitrary time information or even unseen timestamps while capturing rich inference patterns and higher-arity relations of the knowledge base. Extensive experiments on real-world datasets demonstrate the effectiveness of our method.


A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications

arXiv.org Artificial Intelligence

External knowledge graphs (KGs) can be used to augment large language models (LLMs), while simultaneously providing an explainable knowledge base of facts that can be inspected by a human. This approach may be particularly valuable in domains where explainability is critical, like human trafficking data analysis. However, creating KGs can pose challenges. KGs parsed from documents may comprise explicit connections (those directly stated by a document) but miss implicit connections (those obvious to a human although not directly stated). To address these challenges, this preliminary research introduces the GAME-KG framework, standing for "Gaming for Augmenting Metadata and Enhancing Knowledge Graphs." GAME-KG is a federated approach to modifying explicit as well as implicit connections in KGs by using crowdsourced feedback collected through video games. GAME-KG is shown through two demonstrations: a Unity test scenario from Dark Shadows, a video game that collects feedback on KGs parsed from US Department of Justice (DOJ) Press Releases on human trafficking, and a following experiment where OpenAI's GPT-4 is prompted to answer questions based on a modified and unmodified KG. Initial results suggest that GAME-KG can be an effective framework for enhancing KGs, while simultaneously providing an explainable set of structured facts verified by humans.


History repeats Itself: A Baseline for Temporal Knowledge Graph Forecasting

arXiv.org Artificial Intelligence

Temporal Knowledge Graph (TKG) Forecasting aims at predicting links in Knowledge Graphs for future timesteps based on a history of Knowledge Graphs. To this day, standardized evaluation protocols and rigorous comparison across TKG models are available, but the importance of simple baselines is often neglected in the evaluation, which prevents researchers from discerning actual and fictitious progress. We propose to close this gap by designing an intuitive baseline for TKG Forecasting based on predicting recurring facts. Compared to most TKG models, it requires little hyperparameter tuning and no iterative training. Further, it can help to identify failure modes in existing approaches. The empirical findings are quite unexpected: compared to 11 methods on five datasets, our baseline ranks first or third in three of them, painting a radically different picture of the predictive quality of the state of the art.
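The idea of a baseline that predicts recurring facts can be sketched as follows. This is a deliberately simplified illustration, not the scoring used in the paper: it assumes facts are (subject, relation, object, timestep) quadruples and ranks candidate objects by how often they previously appeared with the queried subject and relation. The function names and the toy history are hypothetical.

```python
from collections import Counter


def recurrency_scores(history, subject, relation):
    """Count how often each object occurred with (subject, relation)
    in the quadruples observed before the query time."""
    return Counter(
        o for s, r, o, _t in history if s == subject and r == relation
    )


def predict(history, subject, relation, k=3):
    """Return up to k candidate objects, most frequently seen first."""
    scores = recurrency_scores(history, subject, relation)
    return [obj for obj, _count in scores.most_common(k)]


# Hypothetical history of (subject, relation, object, timestep) facts.
history = [
    ("A", "visits", "B", 1),
    ("A", "visits", "C", 2),
    ("A", "visits", "B", 3),
    ("D", "visits", "C", 3),
]
top = predict(history, "A", "visits")
```

Such a predictor needs no iterative training: it only counts past facts, which is consistent with the low-tuning property the abstract describes.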