Learning Graphical Models
Tracking Switched Dynamic Network Topologies from Information Cascades
Baingana, Brian, Giannakis, Georgios B.
Contagions such as the spread of popular news stories, or infectious diseases, propagate in cascades over dynamic networks with unobservable topologies. However, "social signals" such as product purchase time, or blog entry timestamps are measurable, and implicitly depend on the underlying topology, making it possible to track it over time. Interestingly, network topologies often "jump" between discrete states that may account for sudden changes in the observed signals. The present paper advocates a switched dynamic structural equation model to capture the topology-dependent cascade evolution, as well as the discrete states driving the underlying topologies. Conditions under which the proposed switched model is identifiable are established. Leveraging the edge sparsity inherent to social networks, a recursive $\ell_1$-norm regularized least-squares estimator is put forth to jointly track the states and network topologies. An efficient first-order proximal-gradient algorithm is developed to solve the resulting optimization problem. Numerical experiments on both synthetic data and real cascades measured over the span of one year are conducted, and test results corroborate the efficacy of the advocated approach.
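The proximal-gradient idea behind an $\ell_1$-norm regularized least-squares estimator can be illustrated with a minimal sketch. This is plain ISTA on a static lasso problem, not the recursive switched estimator of the paper; the function names `soft_threshold` and `ista`, the synthetic data, and the value of `lam` are assumptions made only for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1: shrink each entry toward zero by tau.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(X, y, lam, n_iter=500):
    # Minimizes 0.5 * ||y - X w||_2^2 + lam * ||w||_1 by proximal gradient steps.
    # Step size 1/L, where L (squared spectral norm of X) is the Lipschitz
    # constant of the gradient of the least-squares term.
    L = np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)          # gradient of the smooth part
        w = soft_threshold(w - grad / L, lam / L)
    return w

# Hypothetical synthetic problem: 20 candidate "edges", only 2 are non-zero.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
w_true = np.zeros(20)
w_true[[2, 7]] = [1.5, -2.0]
y = X @ w_true + 0.01 * rng.standard_normal(50)
w_hat = ista(X, y, lam=1.0)
```

The soft-thresholding step is what drives most coefficients exactly to zero, which is how edge sparsity enters the estimate.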
Automatic Variational ABC
Moreno, Alexander, Adel, Tameem, Meeds, Edward, Rehg, James M., Welling, Max
Approximate Bayesian Computation (ABC) is a framework for performing likelihood-free posterior inference for simulation models. Stochastic variational inference (SVI) is an appealing alternative to the inefficient sampling approaches commonly used in ABC. However, SVI is highly sensitive to the variance of the gradient estimators, and this problem is exacerbated by approximating the likelihood. We draw upon recent advances in variance reduction for SVI [6][13] and likelihood-free inference using deterministic simulations [12] to produce low-variance gradient estimators of the variational lower bound. By then exploiting automatic differentiation libraries [8] we can avoid nearly all model-specific derivations. We demonstrate performance on three problems and compare to existing SVI algorithms. Our results demonstrate the correctness and efficiency of our algorithm.
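For context, the kind of sampling-based ABC that the abstract calls inefficient can be sketched as plain rejection ABC. This is not the variational method of the paper; the Gaussian simulator, the uniform prior on [-5, 5], the sample-mean summary statistic, and the tolerance `eps` are all assumptions chosen for illustration.

```python
import random
import statistics

def simulate(theta, n, rng):
    # Hypothetical simulator: n draws from a Gaussian with mean theta, unit variance.
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def rejection_abc(observed, n_draws, eps, seed=0):
    # Keep prior draws whose simulated summary (the sample mean) falls
    # within eps of the observed summary. No likelihood is ever evaluated.
    rng = random.Random(seed)
    obs_mean = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)            # draw from the assumed prior
        sim = simulate(theta, len(observed), rng)
        if abs(statistics.fmean(sim) - obs_mean) < eps:
            accepted.append(theta)
    return accepted

observed = simulate(2.0, 50, random.Random(1))
posterior_samples = rejection_abc(observed, n_draws=5000, eps=0.1)
```

Most prior draws are rejected, which illustrates why sampling-based ABC becomes expensive as the tolerance shrinks or the parameter dimension grows.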
Expectation propagation for continuous time stochastic processes
Cseke, Botond, Schnoerr, David, Opper, Manfred, Sanguinetti, Guido
Physical and technological processes frequently exhibit intrinsic stochasticity. The main mathematical framework to describe and reason about such systems is provided by the theory of continuous time (Markovian) stochastic processes. Such processes have been well studied in chemical physics for several decades as models of chemical reactions at very low concentrations [e.g., Gardiner, 1985]. More recently, the theory has found novel and diverse areas of application including systems biology at the single cell level [Wilkinson, 2011], ecology [Volkov et al., 2007] and performance modelling in computer systems [Hillston, 2005], to name but a few. The popularity of the approach has been greatly enhanced by the availability of efficient and accurate simulation algorithms [Gillespie, 1977, Gillespie et al., 2013], which permit a numerical solution of medium-sized systems within a reasonable time frame. As with most of science, many of the application domains of continuous time stochastic processes are becoming increasingly data-rich, creating a critical demand for inference algorithms which can use data to calibrate the models and analyse the uncertainty in the predictions. This raises new challenges and opportunities for statistics and machine learning, and has motivated the development of several algorithms for efficient inference in these systems. In this paper, we focus on the Bayesian approach, and formulate the inverse problem in terms of obtaining an approximation to a posterior distribution over the stochastic process, given observations of the system and using existing scientific information to build a prior model of the process.
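The simulation algorithms cited above can be illustrated with a minimal sketch of Gillespie-style stochastic simulation for a birth-death process. The model (constant birth rate, death rate proportional to the population), the function name, and the rate values are assumptions for illustration and are not taken from the paper.

```python
import random

def gillespie_birth_death(x0, birth, death, t_end, seed=0):
    # Simulates one trajectory of a continuous-time Markov jump process with
    # two reactions: birth (rate `birth`) and death (rate `death * x`).
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        birth_rate, death_rate = birth, death * x
        total = birth_rate + death_rate
        if total == 0.0:
            break                          # no reaction can fire
        t += rng.expovariate(total)        # exponential waiting time to next event
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to its rate.
        if rng.random() * total < birth_rate:
            x += 1
        else:
            x -= 1
        times.append(t)
        states.append(x)
    return times, states

times, states = gillespie_birth_death(x0=10, birth=5.0, death=0.5, t_end=20.0)
```

Each returned pair `(times[i], states[i])` records a jump of the process; inference methods such as the one in the abstract aim to approximate a posterior over such trajectories given observations.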
History of Data Mining
Data mining is everywhere, but its story starts many years before Moneyball and Edward Snowden. The following are major milestones and "firsts" in the history of data mining plus how it's evolved and blended with data science and big data. Data mining is the computational process of exploring and uncovering patterns in large data sets a.k.a. It is fundamental to data mining and probability, since it allows understanding of complex realities based on estimated probabilities. The goal of regression analysis is to estimate the relationships among variables, and the specific method they used in this case is the method of least squares.
Imitation neurones, genuine potential
This structural design can support calculations being made upon thousands of layers, and it was this aspect of the architecture that gave rise to the name 'deep learning'. Marchand-Maillet explains: "Each artificial neurone is assigned an input value, which it computes using a mathematical function, only firing if the output exceeds a pre-defined threshold." In this way, it reproduces the behaviour of real neurones, which only fire and transmit information when the input signal (the potential difference across the entire neural circuit) reaches a certain level. In the artificial model, the results of a single layer are weighted, added up and then sent as the input signal to the following layer, which processes that input using different functions, and so on and so forth. For example, if a system is trained with great quantities of photos of apples and watermelons, it will progressively learn to distinguish them on the basis of diameter, says Marchand-Maillet. If it cannot decide (e.g., when processing a picture of a tiny watermelon), the subsequent layers take over by analysing the colours or textures of the fruit in the photo, and so on.
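The weighted-sum-and-threshold behaviour described above can be sketched in a few lines. This is a simplified illustration with a hard step function; the function names, weights and thresholds are hypothetical, and trained networks typically use smoother activation functions.

```python
def neuron(inputs, weights, threshold):
    # Weighted sum of the inputs; the unit "fires" (outputs 1.0) only if
    # the sum exceeds the pre-defined threshold.
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 if activation > threshold else 0.0

def layer(inputs, weight_rows, thresholds):
    # One layer: every neurone sees the same inputs but has its own weights.
    return [neuron(inputs, w, th) for w, th in zip(weight_rows, thresholds)]

# Two stacked layers: the outputs of the first become the inputs of the second.
hidden = layer([1.0, 0.5], [[0.4, 0.4], [0.9, -0.2]], [0.5, 1.0])
output = layer(hidden, [[1.0, 1.0]], [0.5])
```

Here the first hidden neurone fires (0.4 + 0.2 exceeds 0.5) and the second does not (0.9 - 0.1 does not exceed 1.0), so the output neurone receives a weighted sum of 1.0 and fires.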
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems): Ian H. Witten, Eibe Frank: 9780120884070: Amazon.com: Books
This book is very easy to read and understand. Unlike Hastie's Statistical Learning book, it is not geared towards those with an expert-level knowledge of statistics, and instead takes time to explain functions and formulas for the person with a decent but not extraordinary understanding of statistical/math concepts. For example, their description of a Gaussian was the clearest I've seen. On the other hand, if your math/statistics background is considerable, you may find this book somewhat simplistic or tedious. The book has good coverage of techniques and algorithms, although I was somewhat disappointed that they do not mention Influence Diagrams, considering the amount of coverage of both decision trees and Bayesian techniques.
A Learning Algorithm for Relational Logistic Regression: Preliminary Results
Fatemi, Bahare, Kazemi, Seyed Mehran, Poole, David
Relational logistic regression (RLR) is a representation of conditional probability in terms of weighted formulae for modelling multi-relational data. In this paper, we develop a learning algorithm for RLR models. Learning an RLR model from data consists of two steps: (1) learning the set of formulae to be used in the model (a.k.a. structure learning) and (2) learning the weight of each formula (a.k.a. parameter learning). For structure learning, we deploy Schmidt and Murphy's hierarchical assumption: first we learn a model with simple formulae, then more complex formulae are added iteratively only if all their sub-formulae have proven effective in previously learned models. For parameter learning, we convert the problem into a non-relational learning problem and use an off-the-shelf logistic regression learning algorithm from Weka, an open-source machine learning tool, to learn the weights. We also indicate how hidden features about the individuals can be incorporated into RLR to boost the learning performance. We compare our learning algorithm to other structure and parameter learning algorithms in the literature, and compare the performance of RLR models to standard logistic regression and RDN-Boost on a modified version of the MovieLens data-set.
Dynamic Hierarchical Dirichlet Process for Abnormal Behaviour Detection in Video
Isupova, Olga, Kuzin, Danil, Mihaylova, Lyudmila
This paper proposes a novel dynamic Hierarchical Dirichlet Process topic model that considers the dependence between successive observations. Conventional posterior inference algorithms for this kind of model require several passes over the whole data set, which is computationally intractable for massive or sequential data. We design batch and online inference algorithms, based on Gibbs sampling, for the proposed model. This allows sequential data to be processed by incrementally updating the model with each new observation. The model is applied to abnormal behaviour detection in video sequences. A new abnormality measure is proposed for decision making. The proposed method is compared with the method based on the non-dynamic Hierarchical Dirichlet Process, for which we also derive the online Gibbs sampler and the abnormality measure. The results with synthetic and real data show that the consideration of the dynamics in a topic model improves the classification performance for abnormal behaviour detection.
Anomaly detection in video with Bayesian nonparametrics
Isupova, Olga, Kuzin, Danil, Mihaylova, Lyudmila
A novel dynamic Bayesian nonparametric topic model for anomaly detection in video is proposed in this paper. Batch and online Gibbs samplers are developed for inference. The paper introduces a new abnormality measure for decision making. The proposed method is evaluated on both synthetic and real data. The comparison with a non-dynamic model shows the superiority of the proposed dynamic one in terms of the classification performance for anomaly detection.