Atmospheric turbulence removal using convolutional neural network

arXiv.org Machine Learning

This paper describes a novel deep learning-based method for mitigating the effects of atmospheric distortion. We have built an end-to-end supervised convolutional neural network (CNN) to reconstruct turbulence-corrupted video sequences. Our framework is built on the residual learning concept, where the spatio-temporal distortions are learnt and predicted. Our experiments demonstrate that the proposed method can simultaneously deblur, remove the ripple effect, and enhance the contrast of video sequences. Our model was trained and tested with both simulated and real distortions. Experimental results on the real distortions show that our method outperforms existing ones by up to 3.8% in terms of the quality of restored images, and that its GPU implementation runs up to 23 times faster than state-of-the-art methods.


Business Process Variant Analysis based on Mutual Fingerprints of Event Logs

arXiv.org Machine Learning

Comparing business process variants using event logs is a common use case in process mining. Existing techniques for process variant analysis detect statistically significant differences between variants at the level of individual entities (such as process activities) and their relationships (e.g. directly-follows relations between activities). This may lead to a proliferation of differences due to the low level of granularity at which such differences are captured. This paper presents a novel approach to detect statistically significant differences between variants at the level of entire process traces (i.e. sequences of directly-follows relations). The cornerstone of this approach is a technique to learn a directly-follows graph, called a mutual fingerprint, from the event logs of the two variants. A mutual fingerprint is a lossless encoding of a set of traces and their durations using discrete wavelet transformation. This structure facilitates the understanding of statistical differences along the control-flow and performance dimensions. The approach has been evaluated using real-life event logs against two baselines. The results show that, at the trace level, the baselines cannot always reveal the differences discovered by our approach, or can detect spurious differences.
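
As a rough illustration of the directly-follows relations mentioned above, the sketch below counts them for two toy variants and lists the relations whose frequency differs. It shows only the low-level, relation-by-relation comparison that the abstract attributes to existing techniques; it does not implement the mutual fingerprint, the wavelet encoding, or any statistical test. The function name, activity labels, and toy logs are hypothetical.

```python
from collections import Counter


def directly_follows_counts(traces):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in traces:
        # Pair each event with its successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical toy event logs, one list of activities per trace.
variant_a = [["register", "check", "approve"], ["register", "check", "reject"]]
variant_b = [["register", "approve"], ["register", "check", "approve"]]

dfg_a = directly_follows_counts(variant_a)
dfg_b = directly_follows_counts(variant_b)

# Relations whose raw frequency differs between the two variants.
# A Counter returns 0 for relations it has never seen.
differing = {
    rel: (dfg_a[rel], dfg_b[rel])
    for rel in set(dfg_a) | set(dfg_b)
    if dfg_a[rel] != dfg_b[rel]
}

for rel, (count_a, count_b) in sorted(differing.items()):
    print(rel, count_a, count_b)
```

Even on this tiny example, three of the four observed relations differ, which hints at how quickly relation-level differences can multiply on real logs.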


The Labeling Distribution Matrix (LDM): A Tool for Estimating Machine Learning Algorithm Capacity

arXiv.org Machine Learning

Keywords: Machine Learning, Model Complexity, Algorithm Capacity, VC Dimension, Label Autoencoder

Abstract: Algorithm performance in supervised learning is a combination of memorization, generalization, and luck. By estimating how much information an algorithm can memorize from a dataset, we can set a lower bound on the amount of performance due to other factors such as generalization and luck. With this goal in mind, we introduce the Labeling Distribution Matrix (LDM) as a tool for estimating the capacity of learning algorithms. The method attempts to characterize the diversity of possible outputs by an algorithm for different training datasets, using this to measure algorithm flexibility and responsiveness to data. We test the method on several supervised learning algorithms, and find that while the results are not conclusive, the LDM does allow us to gain potentially valuable insight into the prediction behavior of algorithms. We also introduce the Label Autoencoder as an additional tool for estimating algorithm capacity, with more promising initial results.

1 INTRODUCTION

Determining the representational complexity of a learning algorithm is a longstanding problem in machine learning.


Efficient Parameter Sampling for Neural Network Construction

arXiv.org Machine Learning

The customizable nature of deep learning models has allowed them to be successful predictors in various disciplines. These models are often trained on thousands or millions of instances for complicated problems, but gathering such an immense collection may be infeasible and expensive. Moreover, many of these instances often carry redundant information that pollutes the deep learning models. This paper outlines an algorithm that dynamically selects and appends instances to a training dataset from uncertain regions of the parameter space based on differences in predictions from multiple convolutional neural networks (CNNs). These CNNs are simultaneously trained on this growing dataset to construct more accurate and knowledgeable models. The methodology presented has reduced training dataset sizes by almost 90% while maintaining predictive power in two diagnostics of high energy density physics.
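
The selection step described above, picking instances where several models disagree, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's algorithm: disagreement is measured here as the population variance of scalar predictions, the "models" are plain functions standing in for trained CNNs, and the name `select_uncertain` is hypothetical.

```python
from statistics import pvariance


def select_uncertain(candidates, models, k):
    """Return the k candidates on which the models disagree most.

    Disagreement is the population variance of the models' predictions,
    an assumed stand-in for whatever measure the paper uses.
    """
    scored = []
    for x in candidates:
        preds = [model(x) for model in models]
        scored.append((pvariance(preds), x))
    # Highest disagreement first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [x for _, x in scored[:k]]


# Hypothetical stand-ins for independently trained networks.
models = [
    lambda x: x,
    lambda x: x * x,
    lambda x: 0.5 * x + 0.5 * x * x,
]
candidates = [0.0, 0.5, 1.0, 2.0]

picked = select_uncertain(candidates, models, k=2)
print(picked)
```

In a full loop, the picked instances would be labelled, appended to the training set, and the models retrained before the next round of selection.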


Interpreting Predictive Process Monitoring Benchmarks

arXiv.org Machine Learning

Predictive process analytics has recently gained significant attention, yet its successful adoption in organisations relies on how well users can trust the predictions of the underlying machine learning algorithms, which are often applied and regarded as a 'black box'. Without understanding the rationale of the black-box machinery, there will be a lack of trust in the predictions, a reluctance to use them, and, in the worst case, consequences of an incorrect decision based on a prediction. In this paper, we emphasise the importance of interpreting predictive models in addition to evaluating them with conventional metrics, such as accuracy, in the context of predictive process monitoring. We review existing studies on business process monitoring benchmarks for predicting process outcomes and remaining time. We derive explanations that present the behaviour of the entire predictive model as well as explanations describing a particular prediction. These explanations are used to reveal data leakages and to assess the interpretability of the features used by the model and the degree to which process knowledge is used in the existing benchmark models. Findings from this exploratory study motivate the need to incorporate interpretability in predictive process analytics.


Hierarchical Target-Attentive Diagnosis Prediction in Heterogeneous Information Networks

arXiv.org Machine Learning

We introduce HTAD, a novel model for diagnosis prediction using Electronic Health Records (EHR) represented as Heterogeneous Information Networks. Recent studies on modeling EHR have shown success in automatically learning representations of the clinical records in order to avoid the need for manual feature selection. However, these representations are often learned and aggregated without specificity for the different possible targets being predicted. Our model introduces a target-aware hierarchical attention mechanism that allows it to learn to attend to the most important clinical records when aggregating their representations for prediction of a diagnosis. We evaluate our model using a publicly available benchmark dataset and demonstrate that the use of target-aware attention significantly improves performance compared to the current state of the art. Additionally, we propose a method for incorporating non-categorical data into our predictions and demonstrate that this technique leads to further performance improvements. Lastly, we demonstrate that the predictions made by our proposed model are easily interpretable.

I. INTRODUCTION

Electronic Health Records (EHR) provide a comprehensive picture of patients' medical histories, consisting of information such as written clinician notes, medical imagery, prescriptions, and diagnoses.


Counterfactual Evaluation of Treatment Assignment Functions with Networked Observational Data

arXiv.org Machine Learning

Counterfactual evaluation of novel treatment assignment functions (e.g., advertising algorithms and recommender systems) is one of the most crucial causal inference problems for practitioners. Traditionally, randomized controlled trials (A/B tests) are performed to evaluate treatment assignment functions. However, such trials can be time-consuming, expensive, and even unethical in some cases. Therefore, offline counterfactual evaluation of treatment assignment functions becomes a pressing issue because a massive amount of observational data is available in today's big data era. Counterfactual evaluation requires handling the hidden confounders, i.e. the unmeasured features which causally influence both the treatment assignment and the outcome. To deal with the hidden confounders, most of the existing methods rely on the assumption of no hidden confounders. However, this assumption can be untenable in the context of massive observational data. When such data comes with network information, the latter can be potentially useful to correct hidden confounding bias. As such, we first formulate a novel problem, counterfactual evaluation of treatment assignment functions with networked observational data. Then, we investigate the following research questions: How can we utilize network information in counterfactual evaluation? Can network information improve the estimates in counterfactual evaluation? Toward answering these questions, first, we propose a novel framework, Counterfactual Network Evaluator (CONE), which (1) learns partial representations of latent confounders under the supervision of observed treatments and outcomes; and (2) combines them for counterfactual evaluation. Then, through extensive experiments, we corroborate the effectiveness of CONE. The results imply that incorporating network information mitigates hidden confounding bias in counterfactual evaluation.


Rapid Whole-Heart CMR with Single Volume Super-resolution

arXiv.org Machine Learning

Background: Three-dimensional, whole heart, balanced steady state free precession (WH-bSSFP) sequences provide delineation of intra-cardiac and vascular anatomy. However, they have long acquisition times. Here, we propose significant speed-ups using a deep learning single volume super-resolution reconstruction to recover high-resolution features from rapidly acquired low-resolution WH-bSSFP images. Methods: A 3D residual U-Net was trained using synthetic data, created from a library of high-resolution WH-bSSFP images by simulating 0.5 slice resolution and 0.5 phase resolution. The trained network was validated with synthetic test data, as well as prospective low-resolution data. Results: Synthetic low-resolution data had significantly better image quality after super-resolution reconstruction. Qualitative image scores showed super-resolved images had better edge sharpness, fewer residual artefacts and less image distortion than low-resolution images, with similar scores to high-resolution data. Quantitative image scores showed super-resolved images had significantly better edge sharpness than low-resolution or high-resolution images, with significantly better signal-to-noise ratio than high-resolution data. Vessel diameter measurements showed over-estimation in the low-resolution measurements, compared to the high-resolution data. No significant differences and no bias were found in the super-resolution measurements. Conclusion: This paper demonstrates the potential of using a residual U-Net for super-resolution reconstruction of rapidly acquired low-resolution whole heart bSSFP data within a clinical setting. The resulting network can be applied very quickly, making these techniques particularly appealing within a busy clinical workflow. Thus, we believe that this technique may help speed up whole heart CMR in clinical practice.


A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks

arXiv.org Machine Learning

Evaluation of Bayesian deep learning (BDL) methods is challenging. We often seek to evaluate the methods' robustness and scalability, assessing whether new tools give 'better' uncertainty estimates than old ones. These evaluations are paramount for practitioners when choosing BDL tools on top of which they build their applications. Current popular evaluations of BDL methods, such as the UCI experiments, are lacking: methods that excel in these experiments often fail when used in applications such as medical or automotive ones, suggesting a pertinent need for new benchmarks in the field. We propose a new BDL benchmark with a diverse set of tasks, inspired by a real-world medical imaging application on diabetic retinopathy diagnosis. Visual inputs (512x512 RGB images of retinas) are considered, where model uncertainty is used for medical pre-screening, i.e. to refer patients to an expert when model diagnosis is uncertain. Methods are then ranked according to metrics derived from the expert domain to reflect real-world use of model uncertainty in automated diagnosis. We develop multiple tasks that fall under this application, including out-of-distribution detection and robustness to distribution shift. We then perform a systematic comparison of well-tuned BDL techniques on the various tasks. From our comparison we conclude that some current techniques which solve benchmarks such as UCI 'overfit' their uncertainty to the dataset: when evaluated on our benchmark, these underperform in comparison to simpler baselines. The code for the benchmark, its baselines, and a simple API for evaluating new BDL tools are made available at https://github.com/oatml/bdl-benchmarks.
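
The pre-screening idea above, referring the most uncertain cases to an expert and scoring the model on the rest, can be sketched as follows. This is a minimal illustration and not the benchmark's API: the function name, the choice of accuracy as the metric, and the toy numbers are all assumptions.

```python
def accuracy_after_referral(uncertainties, correct, refer_fraction):
    """Accuracy on the cases kept after referring the most uncertain ones.

    uncertainties: one uncertainty score per case (higher = less certain).
    correct: whether the model's diagnosis was right for each case.
    refer_fraction: share of cases handed to an expert, between 0 and 1.
    """
    # Indices ordered from most certain to least certain.
    order = sorted(range(len(uncertainties)), key=lambda i: uncertainties[i])
    n_keep = len(order) - round(refer_fraction * len(order))
    kept = order[:n_keep]
    if not kept:
        return None  # Every case was referred, so there is nothing to score.
    return sum(correct[i] for i in kept) / len(kept)


# Hypothetical toy data: the two wrong diagnoses are also the least certain.
uncertainties = [0.05, 0.40, 0.10, 0.90, 0.20]
correct = [True, False, True, False, True]

for fraction in (0.0, 0.2, 0.4):
    print(fraction, accuracy_after_referral(uncertainties, correct, fraction))
```

When uncertainty estimates are informative, accuracy on the retained cases should rise as more cases are referred; a flat or falling curve suggests the uncertainty is not tracking the model's errors.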


A Regression Framework for Predicting User's Next Location using Call Detail Records

arXiv.org Machine Learning

With the growing use of cell phones and the increasing diversity of smart mobile devices, a massive volume of data is generated continuously as these devices are used. Among these data, Call Detail Records (CDR) are particularly notable. Since CDR contain both temporal and spatial labels, mobility analysis of CDR is a popular subject of study among researchers. Predicting a user's next location is one of the main problems in the field of human mobility analysis. In this paper, we propose a data processing framework to predict a user's next location. We propose domain-specific data processing strategies and design a deep neural network model that is based on recurrent neurons and performs regression tasks. With this prediction framework, the prediction error decreases from 74% to 55% in comparison to the worst- and best-performing traditional models. The methods, strategies, framework, and results of this paper can be helpful in many applications such as urban planning and digital marketing.