Accuracy
Towards Generation of Visual Attention Map for Source Code
Itoh, Takeshi D., Kubo, Takatomi, Ikeda, Kiyoka, Maruno, Yuki, Ikutani, Yoshiharu, Hata, Hideaki, Matsumoto, Kenichi, Ikeda, Kazushi
Program comprehension is a dominant process in software development and maintenance. Experts are considered to comprehend the source code efficiently by directing their gaze, or attention, to important components in it. However, reflecting importance of components is still a remaining issue in gaze behavior analysis for source code comprehension. Here we show a conceptual framework to compare the quantified importance of source code components with gaze behavior of programmers. We use "attention" in attention models (e.g., code2vec) as the importance indices for source code components and evaluate programmers' gaze locations based on the quantified importance. In this report, we introduce the idea of our gaze behavior analysis using the attention map, and the results of a preliminary experiment.
Image Evolution Trajectory Prediction and Classification from Baseline using Learning-based Patch Atlas Selection for Early Diagnosis
Patients initially diagnosed with early mild cognitive impairment (eMCI) are known to be a clinically heterogeneous group with very subtle patterns of brain atrophy. To examine the boarders between normal controls (NC) and eMCI, Magnetic Resonance Imaging (MRI) was extensively used as a non-invasive imaging modality to pin-down subtle changes in brain images of MCI patients. However, eMCI research remains limited by the number of available MRI acquisition timepoints. Ideally, one would learn how to diagnose MCI patients in an early stage from MRI data acquired at a single timepoint, while leveraging 'non-existing' follow-up observations. To this aim, we propose novel supervised and unsupervised frameworks that learn how to jointly predict and label the evolution trajectory of intensity patches, each seeded at a specific brain landmark, from a baseline intensity patch. Specifically, both strategies aim to identify the best training atlas patches at baseline timepoint to predict and classify the evolution trajectory of a given testing baseline patch. The supervised technique learns how to select the best atlas patches by training bidirectional mappings from the space of pairwise patch similarities to their corresponding prediction errors -when one patch was used to predict the other. On the other hand, the unsupervised technique learns a manifold of baseline atlas and testing patches using multiple kernels to well capture patch distributions at multiple scales. Once the best baseline atlas patches are selected, we retrieve their evolution trajectories and average them to predict the evolution trajectory of the testing baseline patch. Next, we input the predicted trajectories to an ensemble of linear classifiers, each trained at a specific landmark. Our classification accuracy increased by up to 10% points in comparison to single timepoint-based classification methods.
Mid-price Prediction Based on Machine Learning Methods with Technical and Quantitative Indicators
Ntakaris, Adamantios, Kanniainen, Juho, Gabbouj, Moncef, Iosifidis, Alexandros
Stock price prediction is a challenging task, but machine learning methods have recently been used successfully for this purpose. In this paper, we extract over 270 hand-crafted features (factors) inspired by technical and quantitative analysis and tested their validity on short-term mid-price movement prediction. We focus on a wrapper feature selection method using entropy, least-mean squares, and linear discriminant analysis. We also build a new quantitative feature based on adaptive logistic regression for online learning, which is constantly selected first among the majority of the proposed feature selection methods. This study examines the best combination of features using high frequency limit order book data from Nasdaq Nordic. Our results suggest that sorting methods and classifiers can be used in such a way that one can reach the best performance with a combination of only very few advanced hand-crafted features.
A Novel Approach for Detection and Ranking of Trendy and Emerging Cyber Threat Events in Twitter Streams
Bose, Avishek, Behzadan, Vahid, Aguirre, Carlos, Hsu, William H.
We present a new machine learning and text information extraction approach to detection of cyber threat events in Twitter that are novel (previously non-extant) and developing (marked by significance with respect to similarity with a previously detected event). While some existing approaches to event detection measure novelty and trendiness, typically as independent criteria and occasionally as a holistic measure, this work focuses on detecting both novel and developing events using an unsupervised machine learning approach. Furthermore, our proposed approach enables the ranking of cyber threat events based on an importance score by extracting the tweet terms that are characterized as named entities, keywords, or both. We also impute influence to users in order to assign a weighted score to noun phrases in proportion to user influence and the corresponding event scores for named entities and keywords. To evaluate the performance of our proposed approach, we measure the efficiency and detection error rate for events over a specified time interval, relative to human annotator ground truth.
Safe Policy Improvement with Soft Baseline Bootstrapping
Nadjahi, Kimia, Laroche, Romain, Combes, Rรฉmi Tachet des
Batch Reinforcement Learning (Batch RL) consists in training a policy using trajectories collected with another policy, called the behavioural policy. Safe policy improvement (SPI) provides guarantees with high probability that the trained policy performs better than the behavioural policy, also called baseline in this setting. Previous work shows that the SPI objective improves mean performance as compared to using the basic RL objective, which boils down to solving the MDP with maximum likelihood. Here, we build on that work and improve more precisely the SPI with Baseline Bootstrapping algorithm (SPIBB) by allowing the policy search over a wider set of policies. Instead of binarily classifying the state-action pairs into two sets (the \textit{uncertain} and the \textit{safe-to-train-on} ones), we adopt a softer strategy that controls the error in the value estimates by constraining the policy change according to the local model uncertainty. The method can take more risks on uncertain actions all the while remaining provably-safe, and is therefore less conservative than the state-of-the-art methods. We propose two algorithms (one optimal and one approximate) to solve this constrained optimization problem and empirically show a significant improvement over existing SPI algorithms both on finite MDPs and on infinite MDPs with a neural network function approximation.
Learning a Behavior Model of Hybrid Systems Through Combining Model-Based Testing and Machine Learning (Full Version)
Aichernig, Bernhard K., Bloem, Roderick, Ebrahimi, Masoud, Horn, Martin, Pernkopf, Franz, Roth, Wolfgang, Rupp, Astrid, Tappler, Martin, Tranninger, Markus
Models play an essential role in the design process of cyber-physical systems. They form the basis for simulation and analysis and help in identifying design problems as early as possible. However, the construction of models that comprise physical and digital behavior is challenging. Therefore, there is considerable interest in learning such hybrid behavior by means of machine learning which requires sufficient and representative training data covering the behavior of the physical system adequately. In this work, we exploit a combination of automata learning and model-based testing to generate sufficient training data fully automatically. Experimental results on a platooning scenario show that recurrent neural networks learned with this data achieved significantly better results compared to models learned from randomly generated data. In particular, the classification error for crash detection is reduced by a factor of five and a similar F1-score is obtained with up to three orders of magnitude fewer training samples.
Data-driven Policy on Feasibility Determination for the Train Shunting Problem
da Costa, Paulo R. de O., Rhuggenaath, J., Zhang, Y., Akcay, A., Lee, W., Kaymak, U.
Parking, matching, scheduling, and routing are common problems in train maintenance. In particular, train units are commonly maintained and cleaned at dedicated shunting yards. The planning problem that results from such situations is referred to as the Train Unit Shunting Problem (TUSP). This problem involves matching arriving train units to service tasks and determining the schedule for departing trains. The TUSP is an important problem as it is used to determine the capacity of shunting yards and arises as a sub-problem of more general scheduling and planning problems. In this paper, we consider the case of the Dutch Railways (NS) TUSP. As the TUSP is complex, NS currently uses a local search (LS) heuristic to determine if an instance of the TUSP has a feasible solution. Given the number of shunting yards and the size of the planning problems, improving the evaluation speed of the LS brings significant computational gain. In this work, we use a machine learning approach that complements the LS and accelerates the search process. We use a Deep Graph Convolutional Neural Network (DGCNN) model to predict the feasibility of solutions obtained during the run of the LS heuristic. We use this model to decide whether to continue or abort the search process. In this way, the computation time is used more efficiently as it is spent on instances that are more likely to be feasible. Using simulations based on real-life instances of the TUSP, we show how our approach improves upon the previous method on prediction accuracy and leads to computational gains for the decision-making process.
Contextual One-Class Classification in Data Streams
Moulton, Richard Hugh, Viktor, Herna L., Japkowicz, Nathalie, Gama, Joรฃo
In machine learning, the one-class classification problem occurs when training instances are only available from one class. It has been observed that making use of this class's structure, or its different contexts, may improve one-class classifier performance. Although this observation has been demonstrated for static data, a rigorous application of the idea within the data stream environment is lacking. To address this gap, we propose the use of context to guide one-class classifier learning in data streams, paying particular attention to the challenges presented by the dynamic learning environment. We present three frameworks that learn contexts and conduct experiments with synthetic and benchmark data streams. We conclude that the paradigm of contexts in data streams can be used to improve the performance of streaming one-class classifiers.
The What-If Tool: Interactive Probing of Machine Learning Models
Wexler, James, Pushkarna, Mahima, Bolukbasi, Tolga, Wattenberg, Martin, Viegas, Fernanda, Wilson, Jimbo
A key challenge in developing and deploying Machine Learning (ML) systems is understanding their performance across a wide range of inputs. To address this challenge, we created the What-If Tool, an open-source application that allows practitioners to probe, visualize, and analyze ML systems, with minimal coding. The What-If Tool lets practitioners test performance in hypothetical situations, analyze the importance of different data features, and visualize model behavior across multiple models and subsets of input data. It also lets practitioners measure systems according to multiple ML fairness metrics. We describe the design of the tool, and report on real-life usage at different organizations.
A Robust Two-Sample Test for Time Series data
Bellot, Alexis, van der Schaar, Mihaela
We develop a general framework for hypothesis testing with time series data. The problem is to distinguish between the mean functions of the underlying temporal processes of populations of times series, which are often irregularly sampled and measured with error. Such an observation pattern can result in substantial uncertainty about the underlying trajectory, quantifying it accurately is important to ensure robust tests. We propose a new test statistic that views each trajectory as a sample from a distribution on functions and considers the distributions themselves to encode the uncertainty between observations. We derive asymptotic null distributions and power functions for our test and put emphasis on computational considerations by giving an efficient kernel learning framework to prevent over-fitting in small samples and also showing how to scale our test to densely sampled time series. We conclude with performance evaluations on synthetic data and experiments on healthcare and climate change data.