 Accuracy


Goal Recognition over Imperfect Domain Models

arXiv.org Artificial Intelligence

Goal recognition is the problem of recognizing the intended goal of autonomous agents or humans by observing their behavior in an environment. Over the past years, most existing approaches to goal and plan recognition have ignored the need to deal with imperfections in the domain model that formalizes the environment where autonomous agents behave. In this thesis, we introduce the problem of goal recognition over imperfect domain models, and develop solution approaches that explicitly deal with two distinct types of imperfect domain models: (1) incomplete discrete domain models that have possible, rather than known, preconditions and effects in action descriptions; and (2) approximate continuous domain models, where the transition function is approximated from past observations and not well-defined. We develop novel goal recognition approaches over imperfect domain models by leveraging and adapting existing recognition approaches from the literature. Experiments and evaluation over these two types of imperfect domain models show that our novel goal recognition approaches are accurate in comparison to baseline approaches from the literature, at several levels of observability and imperfection.


Microsoft and Intel develop antivirus software that turns malware into 2D images

Daily Mail - Science & tech

Microsoft and Intel have partnered up in an effort to develop a new kind of malware detection. The project, called Static Malware-as-Image Network Analysis (STAMINA), is a joint effort by the tech giants to develop software that sniffs out malicious code by converting it into greyscale images that can be assessed using deep learning. Specifically, STAMINA converts one-dimensional malware bits into two-dimensional greyscale images and then 'looks' at the images for patterns that may indicate specific types of malicious code, using computer vision software designed to analyze images. Once the image is assembled, STAMINA resizes it to smaller dimensions to make it easier to process. This compression, according to the researchers, spares the software from assessing billions of pixels, which would likely slow the process, and does not negatively affect its ability to identify malware.


Microsoft and Intel turn malware into images to help spot more threats

Engadget

Microsoft and Intel have a novel approach to classifying malware: visualizing it. They're collaborating on STAMINA (Static Malware-as-Image Network Analysis), a project that turns rogue code into grayscale images so that a deep learning system can study them. The approach converts the binary form of an input file into a simple stream of pixels, and turns that into a picture with dimensions that vary depending on aspects like file size. A trained neural network then determines what (if anything) has infected the file. ZDNet noted that the AI is trained on the huge amount of data Microsoft has collected from Windows Defender installations. The technology doesn't need full-size, pixel-by-pixel recreations of viruses, which makes sense when large malware could easily translate to gigantic pictures.
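
The byte-to-image step described above can be illustrated with a minimal sketch. This is not STAMINA's actual code: the function names, the caller-supplied width, the zero padding, and the nearest-neighbour resize are all assumptions chosen for illustration. Each byte (0 to 255) is treated as one grayscale pixel, the stream is wrapped into rows, and the result is shrunk to a fixed size.

```python
import math


def bytes_to_grayscale(data, width):
    """Wrap a byte stream into rows of grayscale pixels (one byte = one pixel).

    `width` is a hypothetical parameter; the articles only say that the
    image dimensions depend on aspects such as file size.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Assumption: zero-pad the last row so that all rows have equal length.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def downsample(image, out_h, out_w):
    """Shrink an image by nearest-neighbour sampling (an assumed resize method)."""
    if not image or not image[0]:
        raise ValueError("image must be non-empty")
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


if __name__ == "__main__":
    pixels = bytes_to_grayscale(bytes(range(16)), width=4)  # 4 x 4 image
    print(downsample(pixels, 2, 2))
```

In a real pipeline the downsampled image would then be passed to a trained image classifier; that part is omitted here.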


Prior choice affects ability of Bayesian neural networks to identify unknowns

arXiv.org Artificial Intelligence

Deep Bayesian neural networks (BNNs) are a powerful, though computationally demanding, tool for performing parameter estimation while jointly estimating uncertainty around predictions. BNNs are typically implemented using arbitrary normal-distributed prior distributions on the model parameters. Here, we explore the effects of different prior distributions on classification tasks in BNNs and evaluate the evidence supporting the predictions based on posterior probabilities approximated by Markov Chain Monte Carlo sampling and by computing Bayes factors. We show that the choice of priors has a substantial impact on the ability of the model to confidently assign data to the correct class (true positive rates). Prior choice also significantly affects the ability of a BNN to identify out-of-distribution instances as unknown (false positive rates). When comparing our results against neural networks (NNs) with Monte Carlo dropout, we found that BNNs generally outperform NNs. Finally, in our tests we did not find a single best choice of prior distribution. Instead, each dataset yielded the best results under a different prior, indicating that testing alternative options can improve the performance of BNNs.


Interpretable random forest models through forward variable selection

arXiv.org Machine Learning

Random forest is a popular prediction approach for handling high-dimensional covariates. However, it often becomes infeasible to interpret the resulting high-dimensional, non-parametric model. To obtain an interpretable predictive model, we develop a forward variable selection method using the continuous ranked probability score (CRPS) as the loss function. Our stepwise procedure leads to the smallest set of variables that optimizes the CRPS risk by performing, at each step, a hypothesis test for a significant decrease in CRPS risk. We provide mathematical motivation for our method by proving that, in the population sense, the method attains the optimal set. Additionally, we show that the test is consistent provided that the random forest estimator of a quantile function is consistent. In a simulation study, we compare the performance of our method with an existing variable selection method, for different sample sizes and different correlation strengths of the covariates. Our method is observed to have a much lower false positive rate. We also demonstrate an application of our method to statistical post-processing of daily maximum temperature forecasts in the Netherlands. Our method selects about 10% of the covariates while retaining the same predictive power.
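
A rough sketch of the two ingredients named above, under stated assumptions. The first function uses the standard sample-based CRPS identity, CRPS(F, y) = E|X - y| - 0.5 E|X - X'|, for a forecast represented by samples. The second is a generic greedy forward selection loop; its stopping rule is a plain `min_gain` threshold standing in for the paper's hypothesis test, and the `risk` callable is hypothetical (in practice it would fit a model on the chosen variables and return an estimated CRPS risk).

```python
def crps_ensemble(samples, y):
    """Sample-based CRPS of a forecast given as draws, for observation y."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    spread_to_obs = sum(abs(x - y) for x in samples) / n
    pairwise = sum(abs(a - b) for a in samples for b in samples) / (2 * n * n)
    return spread_to_obs - pairwise


def forward_select(candidates, risk, min_gain=0.0):
    """Greedy forward selection driven by a risk function (lower is better).

    Assumption: stop when the best candidate improves the risk by no more
    than `min_gain`. The paper instead tests whether the decrease is
    statistically significant.
    """
    selected = []
    remaining = list(candidates)
    best = risk(selected)
    while remaining:
        new_risk, var = min(
            ((risk(selected + [v]), v) for v in remaining),
            key=lambda pair: pair[0],
        )
        if best - new_risk <= min_gain:
            break
        selected.append(var)
        remaining.remove(var)
        best = new_risk
    return selected


if __name__ == "__main__":
    print(crps_ensemble([0.0, 2.0], 1.0))
    # Toy risk: "a" and "b" help, "c" does not.
    toy = lambda s: 1.0 - 0.5 * ("a" in s) - 0.3 * ("b" in s)
    print(forward_select(["a", "b", "c"], toy, min_gain=0.01))
```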


A Compressive Classification Framework for High-Dimensional Data

arXiv.org Machine Learning

We propose a compressive classification framework for settings where the data dimensionality is significantly higher than the sample size. The proposed method, referred to as compressive regularized discriminant analysis (CRDA), is based on linear discriminant analysis and has the ability to select significant features by using joint-sparsity promoting hard thresholding in the discriminant rule. Since the number of features is larger than the sample size, the method also uses state-of-the-art regularized sample covariance matrix estimators. Several analysis examples on real data sets, including image, speech signal and gene expression data, illustrate the promising improvements offered by the proposed CRDA classifier in practice. Overall, the proposed method gives fewer misclassification errors than its competitors, while at the same time achieving accurate feature selection results. The open-source R package and MATLAB toolbox of the proposed method (named compressiveRDA) are freely available. High-dimensional (HD) classification is at the core of numerous contemporary statistical studies. An increasingly common occurrence is the collection of large amounts of information on each individual sample point, even though the number of sample points themselves may remain relatively small. Typical examples are gene expression and protein mass spectrometry data, and other areas of computational biology. Regularization and shrinkage are commonly used tools in many applications, such as regression or classification, to overcome significant statistical challenges posed particularly by huge-dimension, low-sample-size (HDLSS) data settings, in which the number of features, p, is often several orders of magnitude larger than the sample size, n (i.e., p ≫ n).
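
To make the idea of joint-sparsity hard thresholding concrete, here is a small sketch under assumptions of our own, not the compressiveRDA implementation. It assumes a p-by-G coefficient matrix with one row per feature and one column per class, ranks features by the Euclidean norm of their row, keeps the `k` strongest rows, and zeroes the rest, so a feature is either kept for all classes or dropped for all of them. The function name, the choice of norm, and the way `k` is supplied are hypothetical.

```python
import math


def hard_threshold_rows(coef, k):
    """Keep the k rows with the largest Euclidean norm; zero out the others.

    `coef` is a list of rows (one row per feature). Zeroing a whole row
    removes that feature from the discriminant rule for every class.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    norms = [math.sqrt(sum(v * v for v in row)) for row in coef]
    ranked = sorted(range(len(coef)), key=lambda i: norms[i], reverse=True)
    keep = set(ranked[:k])
    return [
        list(row) if i in keep else [0.0] * len(row)
        for i, row in enumerate(coef)
    ]


if __name__ == "__main__":
    B = [[3.0, 4.0], [0.1, 0.1], [1.0, 0.0]]
    print(hard_threshold_rows(B, k=2))
```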


Replication Markets: Results, Lessons, Challenges and Opportunities in AI Replication

arXiv.org Artificial Intelligence

The last decade saw the emergence of systematic large-scale replication projects in the social and behavioral sciences (Camerer et al., 2016, 2018; Ebersole et al., 2016; Klein et al., 2014, 2018; Collaboration, 2015). These projects were driven by theoretical and conceptual concerns about a high fraction of "false positives" in scientific publications (Ioannidis, 2005) and a high prevalence of "questionable research practices" (Simmons, Nelson, and Simonsohn, 2011). Concerns about the credibility of research findings are not unique to the behavioral and social sciences; within Computer Science, Artificial Intelligence (AI) and Machine Learning (ML) are areas of particular concern (Lucic et al., 2018; Freire, Bonnet, and Shasha, 2012; Gundersen and Kjensmo, 2018; Henderson et al., 2018). Given the pioneering role of the behavioral and social sciences in promoting novel methodologies to improve the credibility of research, a promising approach is to analyze the lessons learned from this field and adjust strategies for Computer Science, AI, and ML. In this paper, we review approaches used in the behavioral and social sciences and in the DARPA SCORE project. We particularly focus on the role of human forecasting of replication outcomes, and how forecasting can leverage the information gained from relatively labor- and resource-intensive replications. We will discuss opportunities and challenges of using these approaches to monitor and improve the credibility of research areas in Computer Science, AI, and ML.


SAIA: Split Artificial Intelligence Architecture for Mobile Healthcare System

arXiv.org Artificial Intelligence

With the advancement of deep learning (DL), the Internet of Things (IoT), and cloud computing techniques for biomedical and healthcare problems, mobile healthcare systems have received unprecedented attention. Since DL techniques usually require an enormous amount of computation, most of them cannot be directly deployed on resource-constrained mobile and IoT devices. Hence, most mobile healthcare systems leverage cloud computing infrastructure, where the data collected by the mobile and IoT devices is transmitted to cloud computing platforms for analysis. However, in contested environments, relying on the cloud might not be practical at all times. For instance, satellite communication might be denied or disrupted. We propose SAIA, a Split Artificial Intelligence Architecture for mobile healthcare systems. Unlike traditional approaches to artificial intelligence (AI), which solely exploit the computational power of the cloud server, SAIA not only relies on the cloud computing infrastructure while wireless communication is available, but also uses lightweight AI solutions that work locally on the client side; hence, it can work even when communication is impeded. In SAIA, we propose a meta-information based decision unit that decides, under different conditions, whether a sample captured by the client should be processed by the embedded AI (i.e., kept on the client) or the networked AI (i.e., sent to the server). In our experimental evaluation, extensive experiments have been conducted on two popular healthcare datasets. Our results show that SAIA consistently outperforms its baselines in terms of both effectiveness and efficiency.
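
The routing role of the decision unit can be illustrated with a deliberately simple sketch. The abstract does not say which meta-information SAIA uses or how the decision is made, so everything here is hypothetical: the `MetaInfo` fields, the confidence threshold, and the rule itself. The sketch only shows the general shape of a per-sample choice between the embedded (client-side) model and the networked (server-side) model.

```python
from dataclasses import dataclass


@dataclass
class MetaInfo:
    """Hypothetical per-sample meta-information available on the client."""
    link_available: bool      # is the wireless link to the server usable?
    local_confidence: float   # embedded model's confidence, in [0, 1]


def route(meta, threshold=0.8):
    """Return "embedded" or "networked" for one sample.

    Assumed rule: stay on the client when the link is down or when the
    embedded model is already confident; otherwise send to the server.
    """
    if not meta.link_available:
        return "embedded"
    if meta.local_confidence >= threshold:
        return "embedded"
    return "networked"


if __name__ == "__main__":
    print(route(MetaInfo(link_available=True, local_confidence=0.55)))
    print(route(MetaInfo(link_available=False, local_confidence=0.55)))
```

A fixed threshold is the simplest possible policy; a learned decision unit, as the abstract suggests, could adjust this trade-off under different conditions.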


Coronavirus Update: GOP Senators Disagree With Trump On COVID-19 Testing, 'There Are Still Shortfalls'

International Business Times

Republican senators are saying out loud that the extent of mass testing for COVID-19 in the United States isn't where it should be, not by a long shot, contradicting President Donald Trump's oft-repeated claims that the U.S. has so much testing available. "We have so much testing," claimed Trump Thursday. Mass testing is one of the few known ways to end the COVID-19 pandemic in this country. The U.S. has conducted only 8.1 million tests since February. The White House says its goal is two million tests per week per state by the end of May.


In Pursuit of Interpretable, Fair and Accurate Machine Learning for Criminal Recidivism Prediction

arXiv.org Machine Learning

In recent years, academics and investigative journalists have criticized certain commercial risk assessments for their black-box nature and failure to satisfy competing notions of fairness. Since then, the field of interpretable machine learning has created simple yet effective algorithms, while the field of fair machine learning has proposed various mathematical definitions of fairness. However, studies from these fields are largely independent, despite the fact that many applications of machine learning to social issues require both fairness and interpretability. We explore the intersection by revisiting the recidivism prediction problem using state-of-the-art tools from interpretable machine learning, and assessing the models for performance, interpretability, and fairness. Unlike previous works, we compare against two existing risk assessments (COMPAS and the Arnold Public Safety Assessment) and train models that output probabilities rather than binary predictions. We present multiple models that beat these risk assessments in performance, and provide a fairness analysis of these models. Our results imply that machine learning models should be trained separately for separate locations, and updated over time.