Goto

Collaborating Authors

 Performance Analysis


Visual Analytics and Human Involvement in Machine Learning

arXiv.org Machine Learning

The rapidly developing AI systems and applications still require human involvement in practically all parts of the analytics process. Human decisions are largely based on visualizations, providing data scientists details of data properties and the results of analytical procedures. Different visualizations are used in the different steps of the Machine Learning (ML) process. The decision which visualization to use depends on factors, such as the data domain, the data model and the step in the ML process. In this chapter, we describe the seven steps in the ML process and review different visualization techniques that are relevant for the different steps for different types of data, models and purposes.


Learning to Estimate Driver Drowsiness from Car Acceleration Sensors using Weakly Labeled Data

arXiv.org Machine Learning

This paper addresses the learning task of estimating driver drowsiness from the signals of car acceleration sensors. Since even drivers themselves cannot perceive their own drowsiness in a timely manner unless they use burdensome invasive sensors, obtaining labeled training data for each timestamp is not a realistic goal. To deal with this difficulty, we formulate the task as a weakly supervised learning. We only need to add labels for each complete trip, not for every timestamp independently. By assuming that some aspects of driver drowsiness increase over time due to tiredness, we formulate an algorithm that can learn from such weakly labeled data. We derive a scalable stochastic optimization method as a way of implementing the algorithm. Numerical experiments on real driving datasets demonstrate the advantages of our algorithm against baseline methods.


Generalized Multi-view Shared Subspace Learning using View Bootstrapping

arXiv.org Machine Learning

A key objective in multi-view learning is to model the information common to multiple parallel views of a class of objects/events to improve downstream learning tasks. In this context, two open research questions remain: How can we model hundreds of views per event? Can we learn robust multi-view embeddings without any knowledge of how these views are acquired? We present a neural method based on multi-view correlation to capture the information shared across a large number of views by subsampling them in a view-agnostic manner during training. To provide an upper bound on the number of views to subsample for a given embedding dimension, we analyze the error of the bootstrapped multi-view correlation objective using matrix concentration theory. Our experiments on spoken word recognition, 3D object classification and pose-invariant face recognition demonstrate the robustness of view bootstrapping to model a large number of views. Results underscore the applicability of our method for a view-agnostic learning setting.


Goal Recognition over Imperfect Domain Models

arXiv.org Artificial Intelligence

Goal recognition is the problem of recognizing the intended goal of autonomous agents or humans by observing their behavior in an environment. Over the past years, most existing approaches to goal and plan recognition have been ignoring the need to deal with imperfections regarding the domain model that formalizes the environment where autonomous agents behave. In this thesis, we introduce the problem of goal recognition over imperfect domain models, and develop solution approaches that explicitly deal with two distinct types of imperfect domains models: (1) incomplete discrete domain models that have possible, rather than known, preconditions and effects in action descriptions; and (2) approximate continuous domain models, where the transition function is approximated from past observations and not well-defined. We develop novel goal recognition approaches over imperfect domains models by leveraging and adapting existing recognition approaches from the literature. Experiments and evaluation over these two types of imperfect domains models show that our novel goal recognition approaches are accurate in comparison to baseline approaches from the literature, at several levels of observability and imperfections.


Microsoft and Intel develop antivirus software that turns malware into 2D images

Daily Mail - Science & tech

Microsoft and Intel have partnered up in an effort to develop a new kind of malware detection. The project, called Static Malware-as-Image Network Analysis (STAMINA), is a joint effort by the tech giants to develop a software that sniffs out malicious code by converting it into greyscale images that can be assessed by utilizing deep-learning. Specifically, STAMINA converts one-dimensional malware bits into two-dimensional greyscale images and then'looks' at the images for patterns that may indicate specific types of malicious code using computer vision software designed to analyze images. One the image is assembled, STAMINA then resizes it into a smaller dimension to make it easier to view. This compressions, according to researchers helps avoid needing the software to assess billions of pixels - which would likely slow the process - and does not negatively affect its ability to identify malware.


Microsoft and Intel turn malware into images to help spot more threats

Engadget

Microsoft and Intel have a novel approach to classifying malware: visualizing it. They're collaborating on STAMINA (Static Malware-as-Image Network Analysis), a project that turns rogue code into grayscale images so that a deep learning system can study them. The approach converts the binary form of an input file into a simple stream of pixels, and turns that into a picture with dimensions that vary depending on aspects like file size. A trained neural network then determines what (if anything) has infected the file. ZDNet noted that the AI is trained on the huge amount of data Microsoft has collected from Windows Defenders installations. The technology doesn't need full-size, pixel-by-pixel recreations of viruses, which makes sense when large malware could easily translate to gigantic pictures.


System-Level Predictive Maintenance: Review of Research Literature and Gap Analysis

arXiv.org Artificial Intelligence

This paper reviews current literature in the field of predictive maintenance from the system point of view. We differentiate the existing capabilities of condition estimation and failure risk forecasting as currently applied to simple components, from the capabilities needed to solve the same tasks for complex assets. System-level analysis faces more complex latent degradation states, it has to comprehensively account for active maintenance programs at each component level and consider coupling between different maintenance actions, while reflecting increased monetary and safety costs for system failures. As a result, methods that are effective for forecasting risk and informing maintenance decisions regarding individual components do not readily scale to provide reliable sub-system or system level insights. A novel holistic modeling approach is needed to incorporate available structural and physical knowledge and naturally handle the complexities of actively fielded and maintained assets.


Prior choice affects ability of Bayesian neural networks to identify unknowns

arXiv.org Artificial Intelligence

Deep Bayesian neural networks (BNNs) are a powerful tool, though computationally demanding, to perform parameter estimation while jointly estimating uncertainty around predictions. BNNs are typically implemented using arbitrary normal-distributed prior distributions on the model parameters. Here, we explore the effects of different prior distributions on classification tasks in BNNs and evaluate the evidence supporting the predictions based on posterior probabilities approximated by Markov Chain Monte Carlo sampling and by computing Bayes factors. We show that the choice of priors has a substantial impact on the ability of the model to confidently assign data to the correct class (true positive rates). Prior choice also affects significantly the ability of a BNN to identify out-of-distribution instances as unknown (false positive rates). When comparing our results against neural networks (NN) with Monte Carlo dropout we found that BNNs generally outperform NNs. Finally, in our tests we did not find a single best choice as prior distribution. Instead, each dataset yielded the best results under a different prior, indicating that testing alternative options can improve the performance of BNNs.


Interpretable random forest models through forward variable selection

arXiv.org Machine Learning

Random forest is a popular prediction approach for handling high dimensional covariates. However, it often becomes infeasible to interpret the obtained high dimensional and non-parametric model. Aiming for obtaining an interpretable predictive model, we develop a forward variable selection method using the continuous ranked probability score (CRPS) as the loss function. Our stepwise procedure leads to a smallest set of variables that optimizes the CRPS risk by performing at each step a hypothesis test on a significant decrease in CRPS risk. We provide mathematical motivation for our method by proving that in population sense the method attains the optimal set. Additionally, we show that the test is consistent provided that the random forest estimator of a quantile function is consistent. In a simulation study, we compare the performance of our method with an existing variable selection method, for different sample sizes and different correlation strength of covariates. Our method is observed to have a much lower false positive rate. We also demonstrate an application of our method to statistical post-processing of daily maximum temperature forecasts in the Netherlands. Our method selects about 10% covariates while retaining the same predictive power.


A Compressive Classification Framework for High-Dimensional Data

arXiv.org Machine Learning

We propose a compressive classification framework for settings where the data dimensionality is significantly higher than the sample size. The proposed method, referred to as compressive regularized discriminant analysis (CRDA) is based on linear discriminant analysis and has the ability to select significant features by using joint-sparsity promoting hard thresholding in the discriminant rule. Since the number of features is larger than the sample size, the method also uses state-of-the-art regularized sample covariance matrix estimators. Several analysis examples on real data sets, including image, speech signal and gene expression data illustrate the promising improvements offered by the proposed CRDA classifier in practise. Overall, the proposed method gives fewer misclassification errors than its competitors, while at the same time achieving accurate feature selection results. The open-source R package and MA TLAB toolbox of the proposed method (named compressiveRDA) is freely available. High-dimensional (HD) classification is at the core of numerous contemporary statistical studies. An increasingly common occurrence is the collection of large amounts of information on each individual sample point, even though the number of sample points themselves may remain relatively small. Typical examples are gene expression and protein mass spectrometry data, and other areas of computational biology. Regularization and shrinkage are commonly used tools in many applications such as regression or classification to overcome significant statistical challenges posed particularly due to the huge-dimension, low-sample-size (HDLSS) data settings in which the number of features, p, is often several magnitudes larger than the sample size, n (i.e., p null n).