Performance Analysis
Contextual Outlier Detection in Continuous-Time Event Sequences
Continuous-time event sequences represent discrete events occurring in continuous time. Such sequences arise frequently in real life and cover a wide variety of natural events, such as earthquakes, or events corresponding to human actions, such as medical administrations. Usually we expect event sequences to follow some regular pattern over time. However, these regular patterns may sometimes be interrupted by the unexpected absence or unexpected occurrence of events. Identifying these unexpected cases can be very important, as they may point to abnormal situations that need human attention. In this work, we study and develop methods for detecting outliers in continuous-time event sequences, including the unexpected absence and unexpected occurrence of events. Since the patterns that event sequences tend to follow may change in different contexts, we develop outlier detection methods based on point processes that take different contexts into account. Our outlier scoring methods are based on Bayesian decision theory and hypothesis testing with theoretical guarantees. To test the performance of the methods, we conduct experiments on both synthetic data and real-world clinical data and show the effectiveness of the proposed methods.
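As a rough illustration of how context can change whether an absence of events looks unexpected, the sketch below scores a silent window under a homogeneous Poisson process. This is a deliberately simplified stand-in, not the method of the work summarized above: the Poisson assumption, the context names, the rates, and the 0.05 threshold are all hypothetical choices made for the example.

```python
import math


def prob_no_events(rate_per_hour, hours):
    """P(no events in a window) under a homogeneous Poisson process."""
    return math.exp(-rate_per_hour * hours)


def absence_is_unexpected(rate_per_hour, hours, alpha=0.05):
    """Flag a silent window whose probability under the model is below alpha."""
    return prob_no_events(rate_per_hour, hours) < alpha


# Hypothetical context-specific event rates (events per hour).
context_rates = {"daytime": 2.0, "night": 0.2}

# The same three-hour silence is scored separately in each context.
silent_hours = 3.0
flags = {
    context: absence_is_unexpected(rate, silent_hours)
    for context, rate in context_rates.items()
}
print(flags)
```

Under these assumed rates, three silent hours are flagged in the busy daytime context but not at night, which is the kind of context dependence the abstract refers to.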
Mislabel Detection of Finnish Publication Ranks
Akusok, Anton, Saarela, Mirka, Kärkkäinen, Tommi, Björk, Kaj-Mikael, Lendasse, Amaury
Finland, in the spirit of Norway and Denmark, introduced a ranking system for academic publication channels (scientific journals, conference series, book publishers, etc.) called Jufo ("Julkaisufoorumi" in Finnish, "Publication Forum" in English) in 2010, together with the renewed university legislation. The ranking of a publication channel, ranging from 0 (non-peer-reviewed) to 3 (most distinguished academic publication forums), is decided by a specially nominated panel for a particular scientific discipline. These panels decide the rankings based on their academic expertise in regular meetings. Because the rankings are directly linked to the funding allocated to the universities, there has been, and continues to be, a lot of discussion about the fairness and objectivity of the ranks. A versatile analysis of the 2015 Jufo rankings was done in [10]. There, by using association rule mining, decision trees, and confusion matrices with respect to Norwegian and Danish ranks, it was shown that most of the expert-based rankings could be predicted and explained with machine learning methods. Moreover, it was found that those publication channels for which the Finnish expert-based rank is higher than the estimated one are characterized by higher publication activity or a recent upgrade of the rank. Hence, the outcomes of the system, the publication ranks, need to be assessed and evaluated regularly and rigorously.
Per-sample Prediction Intervals for Extreme Learning Machines
Akusok, Anton, Miche, Yoan, Björk, Kaj-Mikael, Lendasse, Amaury
Prediction intervals in supervised Machine Learning bound the region where the true outputs of new samples may fall. They are necessary for separating reliable predictions of a trained model from near-random guesses, minimizing the rate of False Positives, and other problem-specific tasks in applied Machine Learning. Many real problems have heteroscedastic stochastic outputs, which explains the need for input-dependent prediction intervals. This paper proposes to estimate the input-dependent prediction intervals with a separate Extreme Learning Machine model, using the variance of its predictions as a correction term accounting for the model uncertainty. The variance is estimated from the model's linear output layer with a weighted Jackknife method. The methodology is very fast, robust to heteroscedastic outputs, and handles both extremely large datasets and insufficient amounts of training data.
Interactive Open-Ended Learning for 3D Object Recognition
The thesis contributes in several important ways to the research area of 3D object category learning and recognition. To cope with the mentioned limitations, we look at human cognition, in particular at the fact that human beings learn to recognize object categories ceaselessly over time. This ability to refine knowledge from the set of accumulated experiences facilitates the adaptation to new environments. Inspired by this capability, we seek to create a cognitive object perception and perceptual learning architecture that can learn 3D object categories in an open-ended fashion. In this context, "open-ended" implies that the set of categories to be learned is not known in advance, and that the training instances are extracted from actual experiences of a robot and thus become gradually available, rather than being available from the beginning of the learning process. In particular, this architecture provides perception capabilities that will allow robots to incrementally learn object categories from the set of accumulated experiences and reason about how to perform complex tasks. This framework integrates detection, tracking, teaching, learning, and recognition of objects. An extensive set of systematic experiments, in multiple experimental settings, was carried out to thoroughly evaluate the described learning approaches. Experimental results show that the proposed system is able to interact with human users, learn new object categories over time, and perform complex tasks. The contributions presented in this thesis have been fully implemented and evaluated on different standard object and scene datasets and empirically evaluated on different robotic platforms.
Feature-wise change detection and robust indoor positioning using RANSAC-like approach
Fingerprinting-based positioning, one of the promising indoor positioning solutions, has been broadly explored owing to the pervasiveness of sensor-rich mobile devices, the prosperity of opportunistically measurable location-relevant signals and the progress of data-driven algorithms. One critical challenge is to control and improve the quality of the reference fingerprint map (RFM), which is built at the offline stage and applied for online positioning. The key concept concerning the quality control of the RFM is updating the RFM according to newly measured data. Though various methods have been proposed for adapting the RFM, they approach the problem by introducing extra-positioning schemes (e.g. PDR or UGV) and directly adjust the RFM without distinguishing whether critical changes have occurred. This paper aims at proposing an extra-positioning-free solution by making full use of the redundancy of measurable features. Loosely inspired by random sampling consensus (RANSAC), arbitrarily sampled subsets of features from the online measurement are used for generating multi-resamples, which are used for estimating the intermediate locations. This resampling mitigates the impact of the changed features on positioning and enables accurate location estimates to be retrieved. The user's location is robustly computed by identifying the candidate locations from these intermediate ones using the modified Jaccard index (MJI), and the feature-wise change belief is calculated according to the world model of the RFM and the estimated variability of features. In order to validate our proposed approach, two levels of experimental analysis have been carried out. On the simulated dataset, the average change detection accuracy is about 90%. Meanwhile, the improvement of positioning accuracy within 2 m is about 20% when the features detected as changed are dropped during positioning, compared with using all measured features for location estimation. On the long-term collected dataset, the average change detection accuracy is about 85%.
Comparison of Classification Methods for Very High-Dimensional Data in Sparse Random Projection Representation
Machine learning is a mature scientific field with many theoretical results, established algorithms and processes that address various supervised and unsupervised problems using the provided data. In theoretical research, such data is generated in a convenient way, or various methods are compared on standard benchmark problems, where data samples are represented as dense real-valued vectors of fixed and relatively low length. Practical applications represented by such standard datasets can successfully be solved by one of a myriad of existing machine learning methods and their implementations. However, the greatest impact of machine learning is currently in the big data field, with problems that are well explained in natural language ("Find malicious files", "Is that website safe to browse?") but are hard to encode numerically. Data samples in these problems have distinct features coming from a huge unordered set of possible features. The same approach can cover the frequent case of missing feature values [10, 28].
Neural networks and kernel ridge regression for excited states dynamics of CH$_2$NH$_2^+$: From single-state to multi-state representations and multi-property machine learning models
Westermayr, Julia, Faber, Felix A., Christensen, Anders S., von Lilienfeld, O. Anatole, Marquetand, Philipp
Excited-state dynamics simulations are a powerful tool to investigate photo-induced reactions of molecules and materials and provide complementary information to experiments. Since the applicability of these simulation techniques is limited by the costs of the underlying electronic structure calculations, we develop and assess different machine learning models for this task. The machine learning models are trained on ab initio calculations for excited electronic states, using the methylenimmonium cation (CH$_2$NH$_2^+$) as a model system. For the prediction of excited-state properties, multiple outputs are desirable, which is straightforward with neural networks but less explored with kernel ridge regression. We overcome this challenge for kernel ridge regression in the case of energy predictions by encoding the electronic states explicitly in the inputs, in addition to the molecular representation. We adopt this strategy also for our neural networks for comparison. Such a state encoding enables not only kernel ridge regression with multiple outputs but leads also to more accurate machine learning models for state-specific properties. An important goal for excited-state machine learning models is their use in dynamics simulations, which needs not only state-specific information but also couplings, i.e., properties involving pairs of states. Accordingly, we investigate the performance of different models for such coupling elements. Furthermore, we explore how combining all properties in a single neural network affects the accuracy. As an ultimate test for our machine learning models, we carry out excited-state dynamics simulations based on the predicted energies, forces and couplings and, thus, show the scopes and possibilities of machine learning for the treatment of electronically excited states.
Idiot's Guide to Precision, Recall and Confusion Matrix
Building Machine Learning models is fun, but making sure we build the best ones is what makes a difference! RMSE is a good measure to evaluate how a machine learning model is performing. If RMSE is significantly higher on the test set than on the training set, there is a good chance the model is overfitting. You must be wondering, 'Can't we just use the accuracy of the model as the holy grail metric?' Accuracy is very important, but it might not be the best metric all the time. Let's look at why with an example: take a dummy model which always predicts that a loan will not default.
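To make the dummy-model example concrete, here is a minimal sketch in plain Python. The data set (100 loans, 5 of which default) is made up for illustration, and the helper names `confusion_counts` and `safe_ratio` are hypothetical. "Default" is treated as the positive class.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 (default) as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


# Made-up data: 100 loans, 5 of which default (label 1).
y_true = [1] * 5 + [0] * 95
# The dummy model always predicts "will not default" (label 0).
y_pred = [0] * 100

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = safe_ratio(tp, tp + fp)  # undefined: the model never predicts a default
recall = safe_ratio(tp, tp + fn)     # share of actual defaults that were caught

print(f"confusion matrix: TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy} precision={precision} recall={recall}")
```

On this made-up data the dummy model scores 95% accuracy while catching none of the five defaults, so its recall is zero and its precision is undefined. That gap is why accuracy alone can mislead on imbalanced classes.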
Arithmetic, Geometric, and Harmonic Means for Machine Learning
Calculating the average of a variable or a list of numbers is a common operation in machine learning. It is an operation you may use every day either directly, such as when summarizing data, or indirectly, such as a smaller step in a larger procedure when fitting a model. The average is a synonym for the mean, a number that represents the central or expected value of a probability distribution. As such, there are multiple different ways to calculate the mean based on the type of data that you're working with. This can trip you up if you use the wrong mean for your data.
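As a small sketch of how the three means named in the title differ, the snippet below computes each from its textbook definition. The sample values are arbitrary, the function names are hypothetical, and the geometric and harmonic versions assume strictly positive inputs.

```python
import math


def arithmetic_mean(values):
    """Sum of the values divided by their count."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product, computed via logs (requires values > 0)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def harmonic_mean(values):
    """Count divided by the sum of reciprocals (requires values > 0)."""
    return len(values) / sum(1.0 / v for v in values)


# Arbitrary positive sample values chosen for illustration.
values = [1.0, 4.0, 16.0]

print("arithmetic:", arithmetic_mean(values))
print("geometric: ", geometric_mean(values))
print("harmonic:  ", harmonic_mean(values))
```

For positive values that are not all equal, the harmonic mean is the smallest of the three and the arithmetic mean the largest, which is one reason picking the wrong mean can noticeably skew a summary. Python's standard `statistics` module offers `mean`, `geometric_mean`, and `harmonic_mean` for the same calculations.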
A Heterogeneous Graphical Model to Understand User-Level Sentiments in Social Media
Iyer, Rahul Radhakrishnan, Chen, Jing, Sun, Haonan, Xu, Keyang
Social media has seen tremendous growth in the last decade and is continuing to grow at a rapid pace. With such adoption, it is increasingly becoming a rich source of data for opinion mining and sentiment analysis. The detection and analysis of sentiment in social media is thus a valuable topic and attracts a lot of research effort. Most of the earlier efforts focus on supervised learning approaches to solve this problem, which require expensive human annotations and are therefore limited in their practical use. In our work, we propose a semi-supervised approach to predict user-level sentiments for specific topics. We define and utilize a heterogeneous graph built from the social networks of the users, with the knowledge that connected users in social networks typically share similar sentiments. Compared with previous works, we have several novelties: (1) we incorporate the influence/authoritativeness of the users into the model, (2) we include comment-based and like-based user-user links in the graph, and (3) we superimpose multiple heterogeneous graphs into one, thereby allowing multiple types of links to exist between two users.