 Inductive Learning


Online Disease Self-diagnosis with Inductive Heterogeneous Graph Convolutional Networks

arXiv.org Artificial Intelligence

We propose a Healthcare Graph Convolutional Network (HealGCN) to offer a disease self-diagnosis service for online users, based on Electronic Healthcare Records (EHRs). This paper focuses on two main challenges in online disease self-diagnosis: (1) serving cold-start users via graph convolutional networks and (2) handling scarce clinical descriptions via a symptom retrieval system. To this end, we first organize the EHR data into a heterogeneous graph that is capable of modeling complex interactions among users, symptoms and diseases, and tailor the graph representation learning towards disease diagnosis with an inductive learning paradigm. Then, we build a disease self-diagnosis system with a corresponding EHR Graph-based Symptom Retrieval System (GraphRet) that can search and provide a list of relevant alternative symptoms by tracing the predefined meta-paths. GraphRet helps enrich the seed symptom set through the EHR graph, which improves the reasoning ability of our HealGCN model when it confronts users with scarce descriptions. Finally, we validate our model on a large-scale EHR dataset; its superior performance confirms the model's effectiveness in practice.


The Illustrated SimCLR Framework

#artificialintelligence

In recent years, numerous self-supervised learning methods have been proposed for learning image representations, each improving on the last. However, their performance still fell below that of their supervised counterparts. This changed when Chen et al. proposed SimCLR. The SimCLR paper not only improves upon the previous state-of-the-art self-supervised learning methods but also beats the supervised learning method on ImageNet classification when scaling up the architecture. In this article, I will explain the key ideas of the framework proposed in the research paper using diagrams.


Generalized vec trick for fast learning of pairwise kernel models

arXiv.org Machine Learning

Pairwise learning corresponds to the supervised learning setting where the goal is to make predictions for pairs of objects. Prominent applications include predicting drug-target or protein-protein interactions, or customer-product preferences. Several kernel functions have been proposed for incorporating prior knowledge about the relationship between the objects when training kernel-based learning methods. However, the number of training pairs n is often very large, making the O(n^2) cost of constructing the pairwise kernel matrix infeasible. If each training pair x = (d, t) consists of a drug d and a target t, let m and q denote the number of unique drugs and targets appearing in the training pairs. In many real-world applications m, q << n, which can be used to develop computational shortcuts. Recently, an O(nm+nq) time algorithm, which we refer to as the generalized vec trick, was introduced for training kernel methods with the Kronecker kernel. In this work, we show that a large class of pairwise kernels can be expressed as a sum of product matrices, which generalizes the result to the most commonly used pairwise kernels. This includes symmetric and anti-symmetric, metric-learning, Cartesian, ranking, as well as linear, polynomial and Gaussian kernels. In the experiments, we demonstrate how the introduced approach allows scaling pairwise kernels to much larger data sets than previously feasible, and compare the kernels on a number of biological interaction prediction tasks.
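To make the O(nm+nq) shortcut concrete, here is a minimal sketch of a Kronecker-kernel matrix-vector product that never builds the n x n pairwise kernel matrix. It is an illustration of the idea, not the authors' implementation: the function name `kron_kernel_matvec`, the index arrays `d_idx` and `t_idx`, and the dense kernel matrices `D` (drugs, m x m) and `T` (targets, q x q) are all assumptions made for this example.

```python
import numpy as np

def kron_kernel_matvec(D, T, d_idx, t_idx, v):
    """Compute p = K v where K[i, j] = D[d_idx[i], d_idx[j]] * T[t_idx[i], t_idx[j]],
    without forming the n x n matrix K (hypothetical helper for illustration)."""
    m, q = D.shape[0], T.shape[0]
    n = len(v)
    # Step 1, O(n*m): S[a, b] = sum over pairs j with target b of v[j] * D[a, d_idx[j]].
    S = np.zeros((m, q))
    for j in range(n):
        S[:, t_idx[j]] += v[j] * D[:, d_idx[j]]
    # Step 2, O(n*q): p[i] = sum over b of S[d_idx[i], b] * T[t_idx[i], b].
    return np.einsum("ib,ib->i", S[d_idx, :], T[t_idx, :])
```

Because iterative solvers only need products of the kernel matrix with a vector, a routine of this shape can stand in for the explicit matrix when m and q are much smaller than n.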


Learning Adaptive Embedding Considering Incremental Class

arXiv.org Artificial Intelligence

Class-Incremental Learning (CIL) aims to train a reliable model on streaming data in which unknown classes emerge sequentially. Unlike traditional closed-set learning, CIL poses two main challenges: 1) Novel class detection. The initial training data contains only a subset of the classes, and the streaming test data will include unknown classes. Therefore, the model needs not only to classify known classes accurately but also to detect unknown classes effectively; 2) Model expansion. After the novel classes are detected, the model needs to be updated without re-training on the entire previous data. However, traditional CIL methods have not fully considered these two challenges. First, they are restricted to detecting a single novel class in each phase and suffer from embedding confusion caused by unknown classes. Second, they ignore the catastrophic forgetting of known categories during model update. To this end, we propose a Class-Incremental Learning without Forgetting (CILF) framework, which aims to learn an adaptive embedding that handles novel class detection and model update in a unified framework. In detail, CILF regularizes classification with a decoupled prototype-based loss, which significantly improves the intra-class and inter-class structure and yields a compact embedding representation for novel class detection. Then, CILF employs a learnable curriculum clustering operator to estimate the number of semantic clusters by fine-tuning the learned network, in which the curriculum operator adaptively learns the embedding in a self-taught form. Therefore, CILF can detect multiple novel classes and mitigate the embedding confusion problem. Finally, with the labeled streaming test data, CILF can update the network with robust regularization to mitigate catastrophic forgetting. Consequently, CILF is able to iteratively perform novel class detection and model update.


Machine Learning For Absolute Beginners

#artificialintelligence

Let's talk about the types of machine learning algorithms. Supervised learning, as the name indicates, involves a supervisor acting as a teacher. In supervised learning, we teach or train the machine using well-labeled data, meaning that each training example is already tagged with the correct answer. The supervised learning algorithm analyses this training data (the set of training examples); after that, the machine is given a new set of examples (data) and produces an outcome for them based on what it learned from the labeled data. Unsupervised learning is the training of machines using information that is neither classified nor labeled, allowing the algorithm to act on that information without guidance. Here the task of the machine is to group unsorted information according to similarities, patterns, and differences without any prior training on the data.


Unifying supervised learning and VAEs -- automating statistical inference in high-energy physics

arXiv.org Machine Learning

A KL-divergence objective of the joint distribution of data and labels makes it possible to unify supervised learning, variational autoencoders (VAEs) and semi-supervised learning under one umbrella of variational inference. This viewpoint has several advantages. For VAEs, it clarifies the interpretation of encoder and decoder parts. For supervised learning, it re-iterates that the training procedure approximates the true posterior over labels and can always be viewed as approximate likelihood-free inference. This is typically not discussed, even though the derivation is well-known in the literature. In the context of semi-supervised learning, it motivates an extended supervised scheme which makes it possible to calculate a goodness-of-fit p-value using posterior predictive simulations. Flow-based networks with a standard normal base distribution are crucial. We discuss how they make it possible to rigorously define coverage for arbitrary joint posteriors on $\mathbb{R}^n \times \mathcal{S}^m$, which encompasses posteriors over directions. Finally, systematic uncertainties are naturally included in the variational viewpoint. With the three ingredients of (1) systematics, (2) coverage and (3) goodness-of-fit, flow-based neural networks have the potential to replace a large part of the statistical toolbox of the contemporary high-energy physicist.
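As a hedged illustration of the supervised case, the short derivation below writes out a joint KL-divergence objective. The notation is an assumption made here and is not taken from the abstract: $p(x, y)$ denotes the true joint distribution of data $x$ and labels $y$, $q_\phi(y \mid x)$ the conditional distribution produced by the network, and $H_p(Y \mid X)$ the conditional entropy under $p$.

```latex
\begin{aligned}
D_{\mathrm{KL}}\big(p(x,y)\,\|\,q_\phi(y\mid x)\,p(x)\big)
  &= \mathbb{E}_{p(x,y)}\left[\log\frac{p(y\mid x)\,p(x)}{q_\phi(y\mid x)\,p(x)}\right] \\
  &= \mathbb{E}_{p(x,y)}\left[-\log q_\phi(y\mid x)\right] - H_p(Y\mid X)
\end{aligned}
```

The second term does not depend on $\phi$, so under these assumptions minimizing the joint KL divergence is equivalent to minimizing the usual negative log-likelihood loss, and the minimum is reached when $q_\phi(y \mid x)$ matches the true posterior $p(y \mid x)$.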


Teaching a Machine to Diagnose a Heart Disease; Beginning from digitizing scanned ECGs to detecting the Brugada Syndrome (BrS)

arXiv.org Artificial Intelligence

Medical diagnoses can shape and change the life of a person drastically. Therefore, it is always advisable to collect as much evidence as possible to be certain about the diagnosis. Unfortunately, in the case of the Brugada Syndrome (BrS), a rare and inherited heart disease, only one diagnostic criterion exists, namely, a typical pattern in the Electrocardiogram (ECG). In the following treatise, we question whether the investigation of ECG strips by means of machine learning methods improves the detection of BrS positive cases and hence, the diagnostic process. We propose a pipeline that reads in scanned images of ECGs and transforms the captured signals into digital time-voltage data after several processing steps. Then, we present a long short-term memory (LSTM) classifier that is built on the previously extracted data and that makes the diagnosis. The proposed pipeline distinguishes between three major types of ECG images and recreates each recorded lead signal. Features and quality are retained during the digitization of the data, although some of the issues encountered are not fully resolved (Part I). Nevertheless, the results of the aforesaid program are suitable for further investigation of the ECG by a computational method such as the proposed classifier, which proves the concept and could be the architectural basis for future research (Part II). This thesis is divided into two parts, which belong to the same process but are conceptually different. It is hoped that this work builds a new foundation for computational investigations in the case of the BrS and its diagnosis.


Semi-supervised Learning with the EM Algorithm: A Comparative Study between Unstructured and Structured Prediction

arXiv.org Machine Learning

Semi-supervised learning aims to learn prediction models from both labeled and unlabeled samples. There has been extensive research in this area. Among existing work, generative mixture models trained with Expectation-Maximization (EM) are a popular method due to their clear statistical properties. However, existing literature on EM-based semi-supervised learning largely focuses on unstructured prediction, assuming that samples are independent and identically distributed. Studies on EM-based semi-supervised approaches in structured prediction are limited. This paper aims to fill the gap through a comparative study between unstructured and structured methods in EM-based semi-supervised learning. Specifically, we compare their theoretical properties and find that both methods can be considered as a generalization of self-training with soft class assignment of unlabeled samples, but the structured method additionally considers structural constraints in soft class assignment. We conducted a case study on real-world flood mapping datasets to compare the two methods. Results show that structured EM is more robust to class confusion caused by noise and obstacles in features in the context of the flood mapping application.
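The unstructured (i.i.d.) case can be illustrated with a minimal sketch of semi-supervised EM for a one-dimensional Gaussian mixture, in which labeled samples keep fixed one-hot class assignments and unlabeled samples receive soft assignments in each E-step. This is a generic textbook-style example, not the paper's method or its flood mapping setup; the function name `semi_supervised_em` and all parameter names are assumptions, and every class is assumed to have at least one labeled sample.

```python
import numpy as np

def semi_supervised_em(x_lab, y_lab, x_unl, n_classes=2, n_iter=50):
    """EM for a 1-D Gaussian mixture with one component per class (illustrative sketch)."""
    x_all = np.concatenate([x_lab, x_unl])
    R_lab = np.eye(n_classes)[y_lab]  # fixed one-hot assignments for labeled samples
    # Initialise the parameters from the labeled samples only.
    pi = R_lab.mean(axis=0)
    mu = np.array([x_lab[y_lab == k].mean() for k in range(n_classes)])
    var = np.array([x_lab[y_lab == k].var() + 1e-6 for k in range(n_classes)])
    for _ in range(n_iter):
        # E-step: soft class assignment of unlabeled samples (computed in log space).
        log_p = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (x_unl[:, None] - mu) ** 2 / var)
        log_p -= log_p.max(axis=1, keepdims=True)
        R_unl = np.exp(log_p)
        R_unl /= R_unl.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from labeled (hard) and unlabeled (soft) samples.
        R = np.vstack([R_lab, R_unl])
        Nk = R.sum(axis=0)
        pi = Nk / Nk.sum()
        mu = (R * x_all[:, None]).sum(axis=0) / Nk
        var = (R * (x_all[:, None] - mu) ** 2).sum(axis=0) / Nk + 1e-6
    return pi, mu, var, R_unl
```

Replacing the soft assignments in `R_unl` with hard argmax labels would turn this loop into plain self-training, which is the sense in which EM generalizes it. A structured variant would additionally couple the assignments of related samples, which this sketch does not attempt.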


Accelerating Federated Learning in Heterogeneous Data and Computational Environments

arXiv.org Artificial Intelligence

There are situations where data relevant to a machine learning problem are distributed among multiple locations that cannot share the data due to regulatory, competitiveness, or privacy reasons. Examples include data on users' cellphones, manufacturing data of companies in a given industrial sector, or medical records located at different hospitals. Moreover, participating sites often have different data distributions and computational capabilities. Federated Learning provides an approach to learn a joint model over all the available data in these environments. In this paper, we introduce a novel distributed validation weighting scheme (DVW), which evaluates the performance of a learner in the federation against a distributed validation set. Each learner reserves a small portion (e.g., 5%) of its local training examples as a validation dataset and allows other learners' models to be evaluated against it. We empirically show that DVW results in better performance compared to established methods, such as FedAvg, under both synchronous and asynchronous communication protocols in data and computationally heterogeneous environments.
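The following is a minimal sketch of how validation-based weighting could feed into model aggregation. The abstract does not give the exact weighting formula, so the choice made here (weights proportional to each learner's score on the union of all validation shards) is an assumption, as are the function names `dvw_weights` and `aggregate`. Scores are assumed to be non-negative, such as accuracies, with at least one positive value.

```python
import numpy as np

def dvw_weights(scores, val_sizes):
    """scores[i, k]: score (e.g. accuracy) of learner i's model on learner k's
    held-out validation shard; val_sizes[k]: number of examples in shard k."""
    scores = np.asarray(scores, dtype=float)
    val_sizes = np.asarray(val_sizes, dtype=float)
    # Score on the distributed validation set = size-weighted mean over shards.
    global_score = scores @ val_sizes / val_sizes.sum()
    # Assumed rule: normalise the scores so that the weights sum to one.
    return global_score / global_score.sum()

def aggregate(params, weights):
    """Weighted average of the learners' parameter vectors (1-D arrays of equal length)."""
    return np.average(np.stack(params), axis=0, weights=weights)
```

For contrast, FedAvg weights each learner by its number of local training examples; the sketch above replaces that with a measured validation score, so a learner whose model generalizes poorly to other sites contributes less to the joint model.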


InstanceFlow: Visualizing the Evolution of Classifier Confusion on the Instance Level

arXiv.org Machine Learning

Classification is one of the most important supervised machine learning tasks. During the training of a classification model, the training instances are fed to the model multiple times (during multiple epochs) in order to iteratively increase the classification performance. The increasing complexity of models has led to a growing demand for model interpretability through visualizations. Existing approaches mostly focus on the visual analysis of the final model performance after training and are often limited to aggregate performance measures. In this paper, we introduce InstanceFlow, a novel dual-view visualization tool that allows users to analyze the learning behavior of classifiers over time at the instance level. A Sankey diagram visualizes the flow of instances throughout epochs, with on-demand detailed glyphs and traces for individual instances. A tabular view allows users to locate interesting instances by ranking and filtering. In this way, InstanceFlow bridges the gap between class-level and instance-level performance evaluation while enabling users to perform a full temporal analysis of the training process.