Diagnosis
Diagnosing client faults using SVM-based intelligent inference from TCP packet traces
Widanapathirana, Chathuranga, Sekercioglu, Y. Ahmet, Fitzpatrick, Paul G., Ivanovich, Milosh V., Li, Jonathan C.
In recent years, technological developments in computer networking have predominantly focused on improving connection media speeds and state-of-the-art applications. In tandem with user demand for high-speed delivery of information, tolerance for performance and connectivity issues has decreased. Due to the complexity and scale of modern communications networks that include a multitude of possible client devices, traditional "expert knowledge" or "rule-based" methods of performance and fault diagnosis are increasingly inefficient and infeasible. Analysis of packet traces, especially from the Transmission Control Protocol (TCP), is a sophisticated inference-based technique used to diagnose complicated network problems in specialized cases. TCP traces contain artifacts related to behavioral characteristics of network elements that a skilled investigator can use to infer the location and root cause of a network fault.
Tracking Tetrahymena Pyriformis Cells using Decision Trees
Wang, Quan, Ou, Yan, Julius, A. Agung, Boyer, Kim L., Kim, Min Jun
Matching cells over time has long been the most difficult step in cell tracking. In this paper, we approach this problem by recasting it as a classification problem. We construct a feature set for each cell, and compute a feature difference vector between a cell in the current frame and a cell in a previous frame. Then we determine whether the two cells represent the same cell over time by training decision trees as our binary classifiers. With the output of the decision trees, we are able to formulate an assignment problem for our cell association task and solve it using a modified version of the Hungarian algorithm.
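The assignment step described above can be illustrated with a minimal sketch. It assumes the classifiers yield a same-cell probability for every (previous cell, current cell) pair; the matrix `same_prob` and the function `associate_cells` are hypothetical names, not taken from the paper. For brevity the sketch enumerates all permutations instead of running the Hungarian algorithm, and it does not reproduce the authors' modified version; it returns the same optimum on small, square inputs only.

```python
from itertools import permutations


def associate_cells(cost):
    """Return (assignment, total_cost) for a square cost matrix.

    cost[i][j] is the cost of matching previous-frame cell i to
    current-frame cell j. Brute force over all permutations: a stand-in
    for the Hungarian algorithm that is only practical for a few cells.
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total


# Hypothetical classifier outputs: probability that previous cell i
# and current cell j are the same cell.
same_prob = [
    [0.9, 0.2, 0.1],
    [0.3, 0.1, 0.8],
    [0.2, 0.7, 0.3],
]
# Turn probabilities into costs so that likely matches are cheap.
cost = [[1.0 - p for p in row] for row in same_prob]
assignment, total = associate_cells(cost)
print(assignment)  # assignment[i] is the current-frame index matched to cell i
```

With these illustrative numbers, previous cell 0 is matched to current cell 0, cell 1 to cell 2, and cell 2 to cell 1.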
Discovery of non-gaussian linear causal models using ICA
Shimizu, Shohei, Hyvarinen, Aapo, Kano, Yutaka, Hoyer, Patrik O.
In recent years, several methods have been proposed for the discovery of causal structure from non-experimental data (Spirtes et al. 2000; Pearl 2000). Such methods make various assumptions on the data generating process to facilitate its identification from purely observational data. Continuing this line of research, we show how to discover the complete causal structure of continuous-valued data, under the assumptions that (a) the data generating process is linear, (b) there are no unobserved confounders, and (c) disturbance variables have non-gaussian distributions of non-zero variances. The solution relies on the use of the statistical method known as independent component analysis (ICA), and does not require any pre-specified time-ordering of the variables. We provide a complete Matlab package for performing this LiNGAM analysis (short for Linear Non-Gaussian Acyclic Model), and demonstrate the effectiveness of the method using artificially generated data.
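The link between assumptions (a) to (c) and ICA can be written out briefly. The symbols below ($\mathbf{x}$ for the observed variables, $\mathbf{B}$ for the matrix of connection strengths, $\mathbf{e}$ for the disturbances, $\mathbf{A}$ for the mixing matrix) are notation assumed for this sketch and do not appear in the abstract.

```latex
% Linear acyclic model: each variable is a linear function of its
% causes plus a disturbance. Acyclicity means the variables can be
% ordered so that B is strictly lower triangular.
\mathbf{x} = \mathbf{B}\mathbf{x} + \mathbf{e}

% Solving for x gives a linear mixture of the disturbances:
\mathbf{x} = (\mathbf{I} - \mathbf{B})^{-1}\mathbf{e} = \mathbf{A}\mathbf{e}
```

Because the components of $\mathbf{e}$ are mutually independent (no unobserved confounders) and non-gaussian, the second equation has the form of the standard ICA mixing model, which is why ICA can be used to estimate $\mathbf{A}$ and from it $\mathbf{B}$.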
Bayesian Discovery of Linear Acyclic Causal Models
Hoyer, Patrik O., Hyttinen, Antti
Methods for automated discovery of causal relationships from non-interventional data have received much attention recently. A widely used and well understood model family is given by linear acyclic causal models (recursive structural equation models). For Gaussian data both constraint-based methods (Spirtes et al., 1993; Pearl, 2000) (which output a single equivalence class) and Bayesian score-based methods (Geiger and Heckerman, 1994) (which assign relative scores to the equivalence classes) are available. In contrast, all current methods able to utilize non-Gaussianity in the data (Shimizu et al., 2006; Hoyer et al., 2008) always return only a single graph or a single equivalence class, and so are fundamentally unable to express the degree of certainty attached to that output. In this paper we develop a Bayesian score-based approach able to take advantage of non-Gaussianity when estimating linear acyclic causal models, and we empirically demonstrate that, at least on networks of very modest size, its accuracy is as good as or better than existing methods. We provide a complete code package (in R) which implements all algorithms and performs all of the analysis provided in the paper, and hope that this will further the application of these methods to solving causal inference problems.
On the Identifiability of the Post-Nonlinear Causal Model
By taking into account the nonlinear effect of the cause, the inner noise effect, and the measurement distortion effect in the observed variables, the post-nonlinear (PNL) causal model has demonstrated its excellent performance in distinguishing the cause from effect. However, its identifiability has not been properly addressed, and how to apply it in the case of more than two variables is also a problem. In this paper, we conduct a systematic investigation on its identifiability in the two-variable case. We show that this model is identifiable in most cases; by enumerating all possible situations in which the model is not identifiable, we provide sufficient conditions for its identifiability. Simulations are given to support the theoretical results. Moreover, in the case of more than two variables, we show that the whole causal structure can be found by applying the PNL causal model to each structure in the Markov equivalent class and testing if the disturbance is independent of the direct causes for each variable. In this way the exhaustive search over all possible causal structures is avoided.
Active Diagnosis via AUC Maximization: An Efficient Approach for Multiple Fault Identification in Large Scale, Noisy Networks
Bellala, Gowtham, Stanley, Jason, Scott, Clayton, Bhavnani, Suresh K.
The problem of active diagnosis arises in several applications such as disease diagnosis, and fault diagnosis in computer networks, where the goal is to rapidly identify the binary states of a set of objects (e.g., faulty or working) by sequentially selecting, and observing, (noisy) responses to binary valued queries. Current algorithms in this area rely on loopy belief propagation for active query selection. These algorithms have an exponential time complexity, making them slow and even intractable in large networks. We propose a rank-based greedy algorithm that sequentially chooses queries such that the area under the ROC curve of the rank-based output is maximized. The AUC criterion allows us to make a simplifying assumption that significantly reduces the complexity of active query selection (from exponential to near quadratic), with little or no compromise on the performance quality.
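To make the AUC criterion concrete, the following minimal sketch computes the area under the ROC curve of a ranking when the true binary states are known, as one would when evaluating a diagnosis. It does not implement the query-selection algorithm of the paper; the function name `auc` and the example scores are assumptions made for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve for scores against binary labels.

    Computed as the fraction of (faulty, working) pairs in which the
    faulty object receives the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one object of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical fault scores for five objects; label 1 means faulty.
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 0, 0]
print(auc(scores, labels))
```

Here five of the six (faulty, working) pairs are ordered correctly, so the AUC is 5/6. A perfect ranking, with every faulty object scored above every working one, gives 1.0.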
Conflict-Based Diagnosis of Discrete Event Systems: Theory and Practice
Grastien, Alban (NICTA and Australian National University) | Haslum, Patrik (Australian National University and NICTA) | Thiébaux, Sylvie (Australian National University and NICTA)
We present a conflict-based approach to diagnosing Discrete Event Systems (DES) which generalises Reiter's Diagnose algorithm to a much broader class of problems. This approach obviates the need to explicitly reconstruct the system's behaviors that are consistent with the observation, as is typical of existing DES diagnosis algorithms. Instead, our algorithm explores the space of diagnosis hypotheses, testing hypotheses for consistency, and generating conflicts which rule out successors and other portions of the search space. Under relatively mild assumptions, our algorithm correctly computes the set of preferred diagnosis candidates. We investigate efficient symbolic representations of the hypotheses space and provide a SAT-based implementation of this framework which is used to address a real-world problem in processing alarms for a power transmission system.
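For background, in the classical setting that this work generalises, the minimal diagnoses are the minimal hitting sets of the conflicts: each diagnosis must contain at least one component from every conflict. The sketch below shows only that classical relationship, by brute-force enumeration; it is not the DES algorithm or the SAT-based implementation described in the abstract, and the function name and component labels are hypothetical.

```python
from itertools import combinations


def minimal_hitting_sets(conflicts):
    """Return all minimal sets that intersect every conflict.

    Candidates are enumerated by increasing size, so any candidate
    containing an already found hitting set is skipped as non-minimal.
    Exponential in the number of components; for illustration only.
    """
    components = sorted(set().union(*conflicts))
    found = []
    for size in range(len(components) + 1):
        for candidate in combinations(components, size):
            cand = set(candidate)
            if any(h <= cand for h in found):
                continue  # a smaller hitting set is already contained
            if all(cand & conflict for conflict in conflicts):
                found.append(cand)
    return found


# Hypothetical conflicts: in each set, at least one component is faulty.
conflicts = [{"A", "B"}, {"B", "C"}]
diagnoses = minimal_hitting_sets(conflicts)
print(diagnoses)
```

For these two conflicts the minimal diagnoses are "B is faulty" and "A and C are both faulty".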
A Theory of Abstraction for Diagnosis of Discrete-Event Systems
Grastien, Alban (NICTA and the Australian National University, Canberra) | Torta, Gianluca (Dipartimento di Informatica, Università)
We propose a theory of abstraction of discrete-event systems (DES) formulated at the semantic level, i.e., as a function that maps event traces at the original (ground) level to traces at the abstract level. We study how diagnosis of DES can be performed using an abstract model, and under which conditions this process leads to a correct solution (i.e., a set of alternative diagnoses that include the real status of the system). Finally, we study how the use of an abstract model can affect the precision of diagnosis, i.e., the presence of spurious system states in the solution. To this end, we introduce the notion of diagnosability with abstract models, which ensures the precision of abstract diagnoses, and we discuss a practical way to test it.
Reformulation for the Diagnosis of Discrete-Event Systems
Grastien, Alban (NICTA and the Australian National University, Canberra) | Torta, Gianluca (Dipartimento di Informatica, Università)
... of a system and, after detection, to determine the location and/or the type of system faults that caused the abnormal behaviour. A diagnosis hypothesis indicates which fault(s) occurred in the system, and the diagnosis is the set of alternative hypotheses that explain (i.e., are compatible) with the observed system behaviour. In this paper, we focus on Model-Based Diagnosis (MBD) of Discrete-Event Systems (DESs, see (Cassandras and Lafortune 1999)), where the diagnosis is computed by comparing a complete DES model of the system behaviour with a (partial) observation of the actual system behaviour (Sampath et al. 1995). Moreover, all of the faults that occurred within the (possibly extended) time interval during which the system has been observed must be accounted for in the diagnosis. Considering again the diagnosis of a car, for each component we could be interested in knowing whether a fault has occurred to it during the last week; in such a case, it is difficult to perform a drastic abstraction of the model without losing any precision in the discrimination among different hypotheses. In this article, we study a novel approach to reduce the complexity of DES diagnosis, based on a reformulation of ...
Adaptable Fault Identification for Smart Buildings
Schumann, Anika (IBM Research) | Hayes, Jer (IBM Research) | Pompey, Pascal (IBM Research) | Verscheure, Olivier
Malfunctioning HVAC equipment in commercial buildings wastes between 15% and 30% of energy. Many diagnosis approaches tackle this problem, but they suffer from either a lack of detailed fault information or a lack of adaptability to different buildings and equipment. Clearly, especially in light of the ever-increasing amount of sensor data available in heavily metered smart buildings, easily adaptable, self-learning, in-depth diagnosis approaches are needed. This paper addresses the challenges of developing such approaches and describes the contribution that artificial intelligence techniques such as transfer learning, ontologies, knowledge representation, or diagnosis can make in overcoming these challenges.