

Make Literature-Based Discovery Great Again through Reproducible Pipelines

Cestnik, Bojan, Kastrin, Andrej, Koloski, Boshko, Lavrač, Nada

arXiv.org Artificial Intelligence

By connecting disparate sources of scientific literature, literature-based discovery (LBD) methods help to uncover new knowledge and generate new research hypotheses that cannot be found from domain-specific documents alone. Our work focuses on bisociative LBD methods that combine bisociative reasoning with LBD techniques. The paper presents LBD through the lens of reproducible science to ensure the reproducibility of LBD experiments, overcome the inconsistent use of benchmark datasets and methods, trigger collaboration, and advance the LBD field toward more robust and impactful scientific discoveries. The main novelty of this study is a collection of Jupyter Notebooks that illustrate the steps of the bisociative LBD process, including data acquisition, text preprocessing, hypothesis formulation, and evaluation. The contributed notebooks implement a selection of traditional LBD approaches, as well as our own ensemble-based, outlier-based, and link prediction-based approaches. The reader can benefit from hands-on experience with LBD through open access to benchmark datasets, code reuse, and a ready-to-run Docker recipe that ensures reproducibility of the selected LBD methods.
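As a concrete taste of what a traditional LBD step looks like, here is a minimal sketch of Swanson's ABC (open discovery) model: a start term A is linked to candidate terms C through bridging B-terms that co-occur with both, even when A and C never co-occur directly. The function name and the toy documents are illustrative assumptions, not code from the contributed notebooks.

```python
from collections import defaultdict

def open_discovery(docs, a_term):
    """Sketch of Swanson's ABC model: given documents as sets of terms,
    return candidate C-terms reachable from a_term via a bridging B-term,
    excluding terms that already co-occur directly with a_term."""
    cooc = defaultdict(set)
    for doc in docs:
        terms = set(doc)
        for t in terms:
            cooc[t] |= terms - {t}
    b_terms = cooc[a_term]                 # terms co-occurring with A
    direct = b_terms | {a_term}            # known links, to be excluded
    candidates = set()
    for b in b_terms:
        candidates |= cooc[b] - direct     # C-terms reached only via B
    return candidates

# Toy corpus echoing the classic fish oil / Raynaud's example:
docs = [
    {"fish_oil", "blood_viscosity"},
    {"blood_viscosity", "raynaud"},
    {"fish_oil", "omega3"},
]
# open_discovery(docs, "fish_oil") -> {"raynaud"}
```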


Train and deploy a FairMOT model with Amazon SageMaker

#artificialintelligence

Multi-object tracking (MOT) in video analysis is increasingly in demand in many industries, such as live sports, manufacturing, surveillance, and traffic monitoring. For example, in live sports, MOT can track soccer players in real time to analyze physical performance such as instantaneous speed and distance covered. Previously, most methods separated MOT into two tasks: object detection and association. The object detection task detects objects first. The association task then extracts re-identification (re-ID) features from the image region of each detected object and uses them either to link the object to an existing track or to create a new track.
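The association step described above can be sketched with a simple greedy matcher over re-ID embeddings. This is an illustrative toy (greedy cosine-similarity matching with hypothetical names), not FairMOT's actual association logic, which is more sophisticated:

```python
import numpy as np

def associate(tracks, detections, threshold=0.5):
    """Greedy re-ID association sketch: match each detection embedding to
    the most similar free track embedding (cosine similarity); detections
    below the threshold start new tracks. Returns (det_idx, track_idx)
    pairs, where track_idx >= len(tracks) denotes a newly created track."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    assignments = []
    free = set(range(len(tracks)))   # tracks not yet claimed this frame
    next_id = len(tracks)
    for i, det in enumerate(detections):
        best, best_sim = None, threshold
        for t in free:
            sim = cos(det, tracks[t])
            if sim > best_sim:
                best, best_sim = t, sim
        if best is None:
            assignments.append((i, next_id))   # no match: new track
            next_id += 1
        else:
            assignments.append((i, best))      # link to existing track
            free.discard(best)
    return assignments
```

Production trackers typically replace the greedy loop with Hungarian (optimal) assignment and fuse motion cues (e.g. a Kalman filter) with the re-ID similarity.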


How to create a real-time Face Detector

#artificialintelligence

In this article, I will show you how to write a real-time face detector using Python, TensorFlow/Keras and OpenCV. All code is available in this repo. You can also read this tutorial directly on GitLab, where the Python code is highlighted and more convenient to read. First, in the Theoretical Part, I will tell you a little about the concepts that will be useful to us (Transfer Learning and Data Augmentation), and then I will move on to the code analysis in the Practical Part section. Note that you must have the tensorflow and opencv libraries installed to run this code.
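The data augmentation idea mentioned above, synthesizing extra training variety by randomly perturbing each image, can be illustrated with a minimal NumPy sketch. This is a hypothetical helper for intuition only, not the tutorial's actual Keras pipeline (which would typically use Keras preprocessing layers or `ImageDataGenerator`):

```python
import numpy as np

def augment(image, rng):
    """Minimal data-augmentation sketch: random horizontal flip plus
    brightness jitter on an HxWxC float image with values in [0, 1]."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]              # horizontal flip
    image = image * rng.uniform(0.8, 1.2)      # brightness jitter
    return np.clip(image, 0.0, 1.0)            # keep valid pixel range
```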


Customizing the SentenceDetector in Spark NLP

#artificialintelligence

There are many Natural Language Processing (NLP) tasks that require text to be split into chunks of varying granularity: document, sentence, token, etc. This post focuses on splitting text into sentences in order to facilitate downstream tasks such as Named Entity Recognition (NER), Text Classification, or Sentiment Analysis. Splitting text into sentences correctly can be crucial for the success of the downstream task, as we can see in the following example. Suppose we (wrongly) split a German legal reference like: "Schütze ZPO 4. Aufl. Now you might say this is special subject stuff and there are always exotic cases. But this issue also occurs in daily life when you want to extract common things. Consider, for example, (an invented) German address (with correct syntax for zip code and so forth): "Dr.
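The failure mode above comes from splitting naively on every period. A minimal sketch of the remedy that a customizable sentence detector applies, keeping a list of known abbreviations and refusing to break after them, might look like this (the abbreviation list and function are illustrative assumptions, not Spark NLP's API):

```python
import re

# Hypothetical minimal exception list; a real detector would ship a much
# larger, language-specific one ("Aufl." is the German "edition" abbreviation).
ABBREVIATIONS = {"Dr.", "Aufl.", "Nr."}

def split_sentences(text):
    """Split on '.', '!', '?' followed by whitespace, but merge a fragment
    back when it ends in a known abbreviation rather than a true boundary."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    sentences, buf = [], ""
    for part in parts:
        buf = f"{buf} {part}".strip() if buf else part
        tokens = buf.split()
        if tokens and tokens[-1] not in ABBREVIATIONS:
            sentences.append(buf)   # genuine sentence boundary
            buf = ""
    if buf:
        sentences.append(buf)       # trailing fragment
    return sentences
```

For example, `split_sentences("Dr. Schmidt lives here. He works.")` keeps "Dr." attached to its sentence instead of producing a bogus one-word sentence.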


WeightWatcher: Empirical Quality Metrics for Deep Neural Networks

#artificialintelligence

We introduce weightwatcher (ww), a python tool for computing quality metrics of trained and pretrained Deep Neural Networks. This blog describes how to use the tool in practice; see our most recent paper for even more details. The summary contains the Power Law exponent, as well as several log norm metrics, as explained in our papers and below. Each value is an empirical quality metric that can be used to gauge the gross effectiveness of the model compared to similar models. We can use these metrics to compare models across a common architecture series, such as the VGG series, the ResNet series, etc. They can be applied to trained models, pretrained models, and even fine-tuned models.
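The spectral idea behind such metrics can be sketched in a few lines: fit a power-law-style tail exponent to the eigenvalue spectrum of WᵀW for a layer weight matrix W. The following is only an illustrative Hill-estimator sketch with hypothetical names, not the weightwatcher implementation; in practice you would call the tool's own analysis API rather than writing code like this.

```python
import numpy as np

def alpha_hat(weights, k=None):
    """Illustrative sketch: estimate a power-law tail exponent for the
    eigenvalue spectrum of W^T W using a Hill estimator over the
    top-k eigenvalues (squared singular values of W)."""
    evals = np.sort(np.linalg.svd(weights, compute_uv=False) ** 2)
    k = k or max(2, len(evals) // 2)       # heuristic tail size
    tail = evals[-k:]                      # largest eigenvalues
    # Hill estimator: alpha = 1 + k / sum(log(x_i / x_min))
    return 1.0 + k / np.sum(np.log(tail / tail[0]))
```

A heavier spectral tail (smaller exponent) is the kind of signal these quality metrics summarize per layer and then average across the model.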


Implementing Object Detection and Instance Segmentation for Data Scientists

#artificialintelligence

Object detection is a helpful tool to have in your coding repository. It forms the backbone of many fantastic industrial applications. In my last post on object detection, I talked about how object detection models evolved. But what good is theory if we can't implement it? This post is about implementing and running an object detector on our custom dataset of weapons.