AITopics | rdf2vec

Collaborating Authors

rdf2vec

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

gpuRDF2vec -- Scalable GPU-based RDF2vec

Böckling, Martin, Paulheim, Heiko

arXiv.org Artificial IntelligenceAug-5-2025

Generating Knowledge Graph (KG) embeddings at web scale remains challenging. Among existing techniques, RDF2vec combines effectiveness with strong scalability. We present gpuRDF2vec, an open source library that harnesses modern GPUs and supports multi-node execution to accelerate every stage of the RDF2vec pipeline. Extensive experiments on both synthetically generated graphs and real-world benchmarks show that gpuRDF2vec achieves up to a substantial speedup over the currently fastest alternative, i.e., jRDF2vec. In a single-node setup, our walk-extraction phase alone outperforms pyRDF2vec, SparkKGML, and jRDF2vec by a substantial margin using random walks on large/ dense graphs, and scales very well to longer walks, which typically lead to better quality embeddings. Our implementation of gpuRDF2vec enables practitioners and researchers to train high-quality KG embeddings on large-scale graphs within practical time budgets and builds on top of Pytorch Lightning for the scalable word2vec implementation.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.01073

Country: Europe (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

RDF-star2Vec: RDF-star Graph Embeddings for Data Mining

Egami, Shusaku, Ugai, Takanori, Oota, Masateru, Matsushita, Kyoumoto, Kawamura, Takahiro, Kozaki, Kouji, Fukuda, Ken

arXiv.org Artificial IntelligenceDec-25-2023

Knowledge Graphs (KGs) such as Resource Description Framework (RDF) data represent relationships between various entities through the structure of triples (). Knowledge graph embedding (KGE) is crucial in machine learning applications, specifically in node classification and link prediction tasks. KGE remains a vital research topic within the semantic web community. RDF-star introduces the concept of a quoted triple (QT), a specific form of triple employed either as the subject or object within another triple. Moreover, RDF-star permits a QT to act as compositional entities within another QT, thereby enabling the representation of recursive, hyper-relational KGs with nested structures. However, existing KGE models fail to adequately learn the semantics of QTs and entities, primarily because they do not account for RDF-star graphs containing multi-leveled nested QTs and QT-QT relationships. This study introduces RDF-star2Vec, a novel KGE model specifically designed for RDF-star graphs. RDF-star2Vec introduces graph walk techniques that enable probabilistic transitions between a QT and its compositional entities. Feature vectors for QTs, entities, and relations are derived from generated sequences through the structured skip-gram model. Additionally, we provide a dataset and a benchmarking framework for data mining tasks focused on complex RDF-star graphs. Evaluative experiments demonstrated that RDF-star2Vec yielded superior performance compared to recent extensions of RDF2Vec in various tasks including classification, clustering, entity relatedness, and QT similarity.

dataset, graph, rdf-star2vec, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ACCESS.2023.3341029

2312.15626

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
Europe > Switzerland (0.05)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.05)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback

Putting RDF2vec in Order

Portisch, Jan, Paulheim, Heiko

arXiv.org Artificial IntelligenceAug-11-2021

The RDF2vec method for creating node embeddings on knowledge graphs is based on word2vec, which, in turn, is agnostic towards the position of context words. In this paper, we argue that this might be a shortcoming when training RDF2vec, and show that using a word2vec variant which respects order yields considerable performance gains especially on tasks where entities of different classes are involved.

knowledge graph, rdf2vec, representation, (15 more...)

arXiv.org Artificial Intelligence

2108.0528

Country: Europe > Germany > Hamburg (0.04)

Genre: Research Report (0.50)

Industry: Government (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.42)

Add feedback

Walk Extraction Strategies for Node Embeddings with RDF2Vec in Knowledge Graphs

Vandewiele, Gilles, Steenwinckel, Bram, Bonte, Pieter, Weyns, Michael, Paulheim, Heiko, Ristoski, Petar, De Turck, Filip, Ongenae, Femke

arXiv.org Artificial IntelligenceSep-9-2020

As KGs are symbolic constructs, specialized techniques have to be applied in order to make them compatible with data mining techniques. RDF2Vec is an unsupervised technique that can create task-agnostic numerical representations of the nodes in a KG by extending successful language modelling techniques. The original work proposed the Weisfeiler-Lehman (WL) kernel to improve the quality of the representations. However, in this work, we show both formally and empirically that the WL kernel does little to improve walk embeddings in the context of a single KG. As an alternative to the WL kernel, we propose five different strategies to extract information complementary to basic random walks. We compare these walks on several benchmark datasets to show that the \emph{n-gram} strategy performs best on average on node classification tasks and that tuning the walk strategy can result in improved predictive performances.

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2009.04404

Country:

Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
North America > United States (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

More is not Always Better: The Negative Impact of A-box Materialization on RDF2vec Knowledge Graph Embeddings

Iana, Andreea, Paulheim, Heiko

arXiv.org Artificial IntelligenceSep-1-2020

RDF2vec is an embedding technique for representing knowledge graph entities in a continuous vector space. In this paper, we investigate the effect of materializing implicit A-box axioms induced by subproperties, as well as symmetric and transitive properties. While it might be a reasonable assumption that such a materialization before computing embeddings might lead to better embeddings, we conduct a set of experiments on DBpedia which demonstrate that the materialization actually has a negative effect on the performance of RDF2vec. In our analysis, we argue that despite the huge body of work devoted on completing missing information in knowledge graphs, such missing implicit information is actually a signal, not a defect, and we show examples illustrating that assumption.

artificial intelligence, graph, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2009.00318

Country:

Europe > Ireland > Connaught > County Galway > Galway (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.64)

Industry:

Media (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback

Towards Exploiting Implicit Human Feedback for Improving RDF2vec Embeddings

Taweel, Ahmad Al, Paulheim, Heiko

arXiv.org Artificial IntelligenceApr-9-2020

RDF2vec is a technique for creating vector space embeddings from an RDF knowledge graph, i.e., representing each entity in the graph as a vector. It first creates sequences of nodes by performing random walks on the graph. In a second step, those sequences are processed by the word2vec algorithm for creating the actual embeddings. In this paper, we explore the use of external edge weights for guiding the random walks. As edge weights, transition probabilities between pages in Wikipedia are used as a proxy for the human feedback for the importance of an edge. We show that in some scenarios, RDF2vec utilizing those transition probabilities can outperform both RDF2vec based on random walks as well as the usage of graph internal edge weights.

graph, knowledge graph, rdf2vec, (15 more...)

arXiv.org Artificial Intelligence

2004.04423

Country:

North America > Canada > Ontario > Middlesex County > London (0.04)
Europe > United Kingdom > England (0.04)
Europe > Germany (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.35)

Add feedback

Concept2vec: Metrics for Evaluating Quality of Embeddings for Ontological Concepts

Alshargi, Faisal, Shekarpour, Saeedeh, Soru, Tommaso, Sheth, Amit

arXiv.org Artificial IntelligenceJul-26-2018

Although there is an emerging trend towards generating embeddings for primarily unstructured data, and recently for structured data, there is not yet any systematic suite for measuring the quality of embeddings. This deficiency is further sensed with respect to embeddings generated for structured data because there are no concrete evaluation metrics measuring the quality of encoded structure as well as semantic patterns in the embedding space. In this paper, we introduce a framework containing three distinct tasks concerned with the individual aspects of ontological concepts: (i) the categorization aspect, (ii) the hierarchical aspect, and (iii) the relational aspect. Then, in the scope of each task, a number of intrinsic metrics are proposed for evaluating the quality of the embeddings. Furthermore, w.r.t. this framework multiple experimental studies were run to compare the quality of the available embedding models. Employing this framework in future research can reduce misjudgment and provide greater insight about quality comparisons of embeddings for ontological concepts.

data mining, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

1803.04488

Country:

Europe > Germany > Saxony > Leipzig (0.04)
North America > United States > Nevada (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(3 more...)

Add feedback