Collaborating Authors

 Schumann, Anika


From Time Series to Euclidean Spaces: On Spatial Transformations for Temporal Clustering

arXiv.org Machine Learning

Unsupervised clustering of temporal data is both challenging and crucial in machine learning. In this paper, we show that neither traditional clustering methods, time series-specific methods, nor even deep learning-based alternatives generalise well when both varying sampling rates and high dimensionality are present in the input data. We propose a novel approach to temporal clustering, in which we (1) transform the input time series into a distance-based projected representation by using similarity measures suitable for dealing with temporal data, (2) feed these projections into a multi-layer CNN-GRU autoencoder to generate meaningful domain-aware latent representations, which ultimately (3) allow for a natural separation of clusters beneficial for most important traditional clustering algorithms. We evaluate our approach on time series datasets from various domains and show that it not only outperforms existing methods in all cases, by up to 32%, but is also robust and incurs negligible computation overheads.
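
Step (1) can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so the example assumes dynamic time warping (DTW), a common choice for series of different lengths or sampling rates. The function names `dtw_distance` and `distance_projection` and the toy data are hypothetical and are not taken from the paper.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] aligned again
                                 cost[i][j - 1],      # b[j-1] aligned again
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def distance_projection(series):
    """Represent each series by its vector of distances to all series."""
    return [[dtw_distance(s, t) for t in series] for s in series]


# Toy data (assumption): two series with the same shape but different
# lengths, plus one unrelated flat series.
series = [
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0],
    [5.0, 5.0, 5.0],
]
projection = distance_projection(series)
for row in projection:
    print(row)
```

Each row of `projection` is a fixed-length vector regardless of the length of the original series, which is what makes such a representation usable as input to a downstream model or a standard clustering algorithm.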


Explainable Failure Predictions with RNN Classifiers based on Time Series Data

arXiv.org Machine Learning

Given key performance indicators collected with fine granularity as time series, our aim is to predict and explain failures in storage environments. Although explainable predictive modeling based on spiky telemetry data is key in many domains, current approaches cannot tackle this problem. Deep learning methods suitable for sequence modeling and learning temporal dependencies, such as RNNs, are effective but opaque from an explainability perspective. Our approach first extracts the anomalous spikes from time series as events and then builds an RNN classifier with attention mechanisms to embed the irregularity and frequency of these events. A preliminary evaluation on real-world storage environments shows that our approach can predict failures within a 3-day prediction window with accuracy comparable to that of traditional RNN-based classifiers. At the same time, it can explain the predictions by returning the key anomalous events that led to those failure predictions.
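
The first stage, turning spikes into events, might look like the sketch below. The abstract does not say how spikes are detected, so this assumes a robust z-score based on the median and the median absolute deviation (MAD). The function name `extract_spike_events`, the threshold of 3.0, and the sample values are hypothetical.

```python
from statistics import median


def extract_spike_events(values, threshold=3.0):
    """Return (index, value) pairs whose robust z-score exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # constant or near-constant series: no spikes by this rule
    events = []
    for i, v in enumerate(values):
        # 0.6745 rescales the MAD so the score is comparable to a z-score
        score = 0.6745 * (v - med) / mad
        if abs(score) > threshold:
            events.append((i, v))
    return events


# Hypothetical KPI samples with two obvious spikes.
kpi = [10, 11, 10, 12, 95, 11, 10, 9, 88, 10, 11, 10]
events = extract_spike_events(kpi)

# Gaps between consecutive events capture how irregularly the spikes occur.
gaps = [b[0] - a[0] for a, b in zip(events, events[1:])]
print(events, gaps)
```

The resulting event sequence, together with the gaps between events, is the kind of sparse and irregular input that a sequence classifier would then consume.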


From Semantic Models to Cognitive Buildings

AAAI Conferences

Today's operation of buildings is based either on simple dashboards that do not scale to data from thousands of sensors or on rules that provide only very limited fault information. In either case, considerable manual effort is required for diagnosing building operation problems related to energy usage or occupant comfort. We present a Cognitive Building demo that uses (i) semantic reasoning to model physical relationships of sensors and systems, (ii) machine learning to predict and detect anomalies in energy flow, occupancy and user comfort, and (iii) speech-enabled Augmented Reality interfaces for immersive interaction with thousands of devices. Our demo analyzes data from more than 3,300 sensors and shows how we can automatically diagnose building operation problems.


Minimizing User Involvement for Accurate Ontology Matching Problems

AAAI Conferences

Many types of sensors from different complex devices collect data in a city. Their underlying data representations follow manufacturer-specific specifications whose descriptions (in ontologies) may have incomplete alignments. This paper addresses the problem of determining accurate and complete matchings of ontologies given some common descriptions and their pre-determined high-level alignments. In this context, the problem of ontology matching consists of automatically determining all matchings given the latter alignments, and manually verifying the matching results. Especially for applications where it is crucial that ontologies are matched correctly, the latter can turn into a very time-consuming task for the user. This paper tackles this challenge and addresses the problem of computing the minimum number of user inputs needed to verify all matchings. We show how to represent this problem as a reasoning problem over a bipartite graph and how to encode it over pseudo-Boolean constraints. Experiments show that our approach can be successfully applied to real-world data sets.


Adaptable Fault Identification for Smart Buildings

AAAI Conferences

Malfunctioning HVAC equipment in commercial buildings wastes between 15% and 30% of energy. Many diagnosis approaches tackle this problem, but they suffer from either a lack of detailed fault information or a lack of adaptability to different buildings and equipment. Given the ever-increasing amount of sensor data available in heavily metered smart buildings, easily adaptable, self-learning, in-depth diagnosis approaches are needed. This paper addresses the challenges of developing such approaches and describes the contribution that artificial intelligence techniques such as transfer learning, ontologies, knowledge representation or diagnosis can make in overcoming these challenges.


Computing Cost-Optimal Definitely Discriminating Tests

AAAI Conferences

The goal of testing is to discriminate between multiple hypotheses about a system - for example, different fault diagnoses - by applying input patterns and verifying or falsifying the hypotheses from the observed outputs. Definitely discriminating tests (DDTs) are those input patterns that are guaranteed to discriminate between different hypotheses of non-deterministic systems. Finding DDTs is important in practice, but can be very expensive. Even more challenging is the problem of finding a DDT that minimizes the cost of the testing process, i.e., an input pattern that can be most cheaply enforced and that is a DDT. This paper addresses both problems. We show how we can transform a given problem into a Boolean structure in decomposable negation normal form (DNNF), and extract from it a Boolean formula whose models correspond to DDTs. This allows us to harness recent advances in both knowledge compilation and satisfiability for efficient and scalable DDT computation in practice. Furthermore, we show how we can generate a DNNF structure compactly encoding all DDTs of the problem and use it to obtain a cost-optimal DDT in time linear in the size of the structure. Experimental results from a real-world application show that our method can compute DDTs in less than 1 second for instances that were previously intractable, and cost-optimal DDTs in less than 20 seconds where previous approaches could not even compute an arbitrary DDT.
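
The definition of a DDT can be made concrete with a brute-force sketch. This is not the DNNF-based method of the paper: it simply enumerates all inputs of a toy system, which is feasible only for tiny examples. The two hypotheses (`healthy_or`, `faulty_or`), the cost function and all other names are hypothetical. Each hypothesis maps an input to the set of outputs it allows, and an input is treated as a DDT when those sets are pairwise disjoint.

```python
from itertools import product


def is_ddt(test_input, hypotheses):
    """True if no output is possible under two different hypotheses."""
    output_sets = [h(test_input) for h in hypotheses]
    for i in range(len(output_sets)):
        for j in range(i + 1, len(output_sets)):
            if output_sets[i] & output_sets[j]:
                return False  # a shared output could not tell them apart
    return True


def cheapest_ddt(inputs, hypotheses, cost):
    """Return the minimum-cost DDT among inputs, or None if there is none."""
    candidates = [x for x in inputs if is_ddt(x, hypotheses)]
    return min(candidates, key=cost, default=None)


def healthy_or(x):
    """Hypothesis 1 (assumed): a working OR gate, deterministic."""
    a, b = x
    return {a | b}


def faulty_or(x):
    """Hypothesis 2 (assumed): outputs 0 when a is 1, arbitrary otherwise."""
    a, _ = x
    return {0} if a == 1 else {0, 1}


inputs = list(product([0, 1], repeat=2))
hypotheses = [healthy_or, faulty_or]
ddts = [x for x in inputs if is_ddt(x, hypotheses)]

# Assumed cost model: each input line driven to 1 costs one unit.
best = cheapest_ddt(inputs, hypotheses, cost=sum)
print(ddts, best)
```

Enumeration is exponential in the number of input variables, which is the cost that a compact compiled representation of all DDTs is meant to avoid.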