
The KEEN Universe: An Ecosystem for Knowledge Graph Embeddings with a Focus on Reproducibility and Transferability

There is an emerging trend of embedding knowledge graphs (KGs) in continuous vector spaces in order to use them for machine learning tasks. Recently, many knowledge graph embedding (KGE) models have been proposed that learn low-dimensional representations while trying to maintain the structural properties of the KGs, such as the similarity of nodes depending on their edges to other nodes. KGEs can be used to address tasks within KGs such as the prediction of novel links and the disambiguation of entities. They can also be used for downstream tasks like question answering and fact-checking. Overall, these tasks are relevant for the semantic web community. Despite the popularity of KGEs, the reproducibility of KGE experiments and the transferability of proposed KGE models to research fields outside the machine learning community can be a major challenge. Therefore, we present the KEEN Universe, an ecosystem for knowledge graph embeddings that we have developed with a strong focus on reproducibility and transferability. The KEEN Universe currently consists of the Python packages PyKEEN (Python KnowlEdge EmbeddiNgs), BioKEEN (Biological KnowlEdge EmbeddiNgs), and the KEEN Model Zoo for sharing trained KGE models with the community.

PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings

Recently, knowledge graph embeddings (KGEs) have received significant attention, and several software libraries have been developed for training and evaluating KGEs. While each of them addresses specific needs, we re-designed and re-implemented PyKEEN, one of the first KGE libraries, in a community effort. PyKEEN 1.0 enables users to compose knowledge graph embedding models (KGEMs) from a wide range of interaction models, training approaches, and loss functions, and it permits the explicit modeling of inverse relations. In addition, automatic memory optimization has been implemented to make optimal use of the available hardware, and extensive hyper-parameter optimization (HPO) functionality is provided through the integration of Optuna.
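The explicit modeling of inverse relations mentioned above can be illustrated with a minimal sketch in plain Python. It does not use PyKEEN's API. The helper name add_inverse_triples, the "_inverse" suffix, and the toy triples are assumptions made for illustration only. The idea shown is that every triple (head, relation, tail) is paired with a triple (tail, relation_inverse, head), so that predicting a missing head can be recast as predicting a tail for the inverse relation.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Hypothetical naming convention for inverse relations (not taken from PyKEEN).
INVERSE_SUFFIX = "_inverse"


def add_inverse_triples(triples: List[Triple]) -> List[Triple]:
    """Return the original triples followed by one inverse triple per original."""
    inverse = [(tail, relation + INVERSE_SUFFIX, head) for head, relation, tail in triples]
    return list(triples) + inverse


# Toy triples, for illustration only.
triples = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
]

augmented = add_inverse_triples(triples)
for triple in augmented:
    print(triple)
```

With the augmented set, a query such as (?, treats, headache) can be answered as the tail-prediction query (headache, treats_inverse, ?), at the cost of learning a separate representation for each inverse relation.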

OpenBioLink: A resource and benchmarking framework for large-scale biomedical link prediction

Summary: Recently, novel machine-learning algorithms have shown potential for predicting undiscovered links in biomedical knowledge networks. However, dedicated benchmarks for measuring algorithmic progress have not yet emerged. With OpenBioLink, we introduce a large-scale, high-quality and highly challenging biomedical link prediction benchmark to transparently and reproducibly evaluate such algorithms.



PyKEEN (Python KnowlEdge EmbeddiNgs) is a Python package designed to train and evaluate knowledge graph embedding models (incorporating multi-modal information). It is part of the KEEN Universe. The latest stable version of PyKEEN can be downloaded and installed from PyPI on Python 3.7 with pip install pykeen. The development version of PyKEEN can be downloaded and installed from GitHub on Python 3.7. Optional extras can be included with installation using the bracket notation, as in pip install pykeen[docs] or pip install -e .[docs]. Several extras can be listed, comma-delimited, as in pip install pykeen[docs,plotting]. Contributions, whether filing an issue, making a pull request, or forking, are appreciated.



PyKEEN (Python KnowlEdge EmbeddiNgs) is a package for training and evaluating knowledge graph embeddings. Currently, it provides implementations of 10 knowledge graph embedding models and can be run either in training mode, in which users provide their own set of hyper-parameter values, or in hyper-parameter optimization mode, which finds suitable hyper-parameter values from a set of user-defined values. PyKEEN can also be used without programming experience through its interactive command line interface, which is started with the command pykeen from a terminal. We are currently working on PyKEEN 1.0, which will provide additional features such as several negative sampling approaches and further evaluation metrics. Furthermore, we are integrating additional KGE models.
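The hyper-parameter optimization mode described above, which picks values from user-defined sets, can be sketched in plain Python. The text does not say which search strategy PyKEEN uses, so random search is swapped in here as the simplest option. The names random_search and toy_objective, the search space, and the scoring formula are all hypothetical. The toy objective stands in for "train a model with this configuration and return a validation loss".

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Config = Dict[str, float]
SearchSpace = Dict[str, List[float]]


def random_search(
    space: SearchSpace,
    objective: Callable[[Config], float],
    n_trials: int,
    seed: int = 0,
) -> Tuple[Optional[Config], float]:
    """Sample configurations from user-defined value sets and keep the lowest-scoring one."""
    rng = random.Random(seed)  # seeded so that repeated runs sample the same configurations
    best_config: Optional[Config] = None
    best_score = float("inf")
    for _ in range(n_trials):
        # Draw one candidate value per hyper-parameter.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_objective(config: Config) -> float:
    """Hypothetical stand-in for a validation loss; lower is better."""
    return abs(config["embedding_dim"] - 100) / 100 + abs(config["learning_rate"] - 0.01)


# User-defined candidate values (illustrative, not PyKEEN defaults).
space: SearchSpace = {
    "embedding_dim": [50, 100, 200],
    "learning_rate": [0.001, 0.01, 0.1],
}

best_config, best_score = random_search(space, toy_objective, n_trials=20)
print(best_config, best_score)
```

In training mode, by contrast, the user would supply a single configuration directly and no search loop would run.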