Genre
Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders
Bengio, Yoshua, Alain, Guillaume, Rifai, Salah
Recent work suggests that some auto-encoder variants do a good job of capturing the local manifold structure of the unknown data generating density. This paper contributes to the mathematical understanding of this phenomenon and helps define better justified sampling algorithms for deep learning based on auto-encoder variants. We consider an MCMC where each step samples from a Gaussian whose mean and covariance matrix depend on the previous state, defines through its asymptotic distribution a target density. First, we show that good choices (in the sense of consistency) for these mean and covariance functions are the local expected value and local covariance under that target density. Then we show that an auto-encoder with a contractive penalty captures estimators of these local moments in its reconstruction function and its Jacobian. A contribution of this work is thus a novel alternative to maximum-likelihood density estimation, which we call local moment matching. It also justifies a recently proposed sampling algorithm for the Contractive Auto-Encoder and extends it to the Denoising Auto-Encoder.
Distributed Robust Power System State Estimation
Kekatos, Vassilis, Giannakis, Georgios B.
Deregulation of energy markets, penetration of renewables, advanced metering capabilities, and the urge for situational awareness, all call for system-wide power system state estimation (PSSE). Implementing a centralized estimator though is practically infeasible due to the complexity scale of an interconnection, the communication bottleneck in real-time monitoring, regional disclosure policies, and reliability issues. In this context, distributed PSSE methods are treated here under a unified and systematic framework. A novel algorithm is developed based on the alternating direction method of multipliers. It leverages existing PSSE solvers, respects privacy policies, exhibits low communication load, and its convergence to the centralized estimates is guaranteed even in the absence of local observability. Beyond the conventional least-squares based PSSE, the decentralized framework accommodates a robust state estimator. By exploiting interesting links to the compressive sampling advances, the latter jointly estimates the state and identifies corrupted measurements. The novel algorithms are numerically evaluated using the IEEE 14-, 118-bus, and a 4,200-bus benchmarks. Simulations demonstrate that the attainable accuracy can be reached within a few inter-area exchanges, while largest residual tests are outperformed.
Learning High-Dimensional Mixtures of Graphical Models
Anandkumar, A., Hsu, D., Huang, F., Kakade, S. M.
We consider unsupervised estimation of mixtures of discrete graphical models, where the class variable corresponding to the mixture components is hidden and each mixture component over the observed variables can have a potentially different Markov graph structure and parameters. We propose a novel approach for estimating the mixture components, and our output is a tree-mixture model which serves as a good approximation to the underlying graphical model mixture. Our method is efficient when the union graph, which is the union of the Markov graphs of the mixture components, has sparse vertex separators between any pair of observed variables. This includes tree mixtures and mixtures of bounded degree graphs. For such models, we prove that our method correctly recovers the union graph structure and the tree structures corresponding to maximum-likelihood tree approximations of the mixture components. The sample and computational complexities of our method scale as $\poly(p, r)$, for an $r$-component mixture of $p$-variate graphical models. We further extend our results to the case when the union graph has sparse local separators between any pair of observed variables, such as mixtures of locally tree-like graphs, and the mixture components are in the regime of correlation decay.
Single parameter galaxy classification: The Principal Curve through the multi-dimensional space of galaxy properties
Taghizadeh-Popp, M., Heinis, S., Szalay, A. S.
We propose to describe the variety of galaxies from SDSS by using only one affine parameter. To this aim, we build the Principal Curve (P-curve) passing through the spine of the data point cloud, considering the eigenspace derived from Principal Component Analysis of morphological, physical and photometric galaxy properties. Thus, galaxies can be labeled, ranked and classified by a single arc length value of the curve, measured at the unique closest projection of the data points on the P-curve. We find that the P-curve has a "W" letter shape with 3 turning points, defining 4 branches that represent distinct galaxy populations. This behavior is controlled mainly by 2 properties, namely u-r and SFR. We further present the variations of several galaxy properties as a function of arc length. Luminosity functions variate from steep Schechter fits at low arc length, to double power law and ending in Log-normal fits at high arc length. Galaxy clustering shows increasing autocorrelation power at large scales as arc length increases. PCA analysis allowed to find peculiar galaxy populations located apart from the main cloud of data points, such as small red galaxies dominated by a disk, of relatively high stellar mass-to-light ratio and surface mass density. The P-curve allows not only dimensionality reduction, but also provides supporting evidence for relevant physical models and scenarios in extragalactic astronomy: 1) Evidence for the hierarchical merging scenario in the formation of a selected group of red massive galaxies. These galaxies present a log-normal r-band luminosity function, which might arise from multiplicative processes involved in this scenario. 2) Connection between the onset of AGN activity and star formation quenching, which appears in green galaxies when transitioning from blue to red populations. (Full abstract in downloadable version)
A Hybrid Method for Distance Metric Learning
Kao, Yi-Hao, Van Roy, Benjamin, Rubin, Daniel, Xu, Jiajing, Faruque, Jessica, Napel, Sandy
We consider the problem of learning a measure of distance among vectors in a feature space and propose a hybrid method that simultaneously learns from similarity ratings assigned to pairs of vectors and class labels assigned to individual vectors. Our method is based on a generative model in which class labels can provide information that is not encoded in feature vectors but yet relates to perceived similarity between objects. Experiments with synthetic data as well as a real medical image retrieval problem demonstrate that leveraging class labels through use of our method improves retrieval performance significantly.
Software Verification and Graph Similarity for Automated Evaluation of Students' Assignments
Vujosevic-Janicic, Milena, Nikolic, Mladen, Tosic, Dusan, Kuncak, Viktor
In this paper we promote introducing software verification and control flow graph similarity measurement in automated evaluation of students' programs. We present a new grading framework that merges results obtained by combination of these two approaches with results obtained by automated testing, leading to improved quality and precision of automated grading. These two approaches are also useful in providing a comprehensible feedback that can help students to improve the quality of their programs We also present our corresponding tools that are publicly available and open source. The tools are based on LLVM low-level intermediate code representation, so they could be applied to a number of programming languages. Experimental evaluation of the proposed grading framework is performed on a corpus of university students' programs written in programming language C. Results of the experiments show that automatically generated grades are highly correlated with manually determined grades suggesting that the presented tools can find real-world applications in studying and grading.
Extension of Three-Variable Counterfactual Casual Graphic Model: from Two-Value to Three-Value Random Variable
The extension of counterfactual causal graphic model with three variables of vertex set in directed acyclic graph (DAG) is discussed in this paper by extending two- value distribution to three-value distribution of the variables involved in DAG. Using the conditional independence as ancillary information, 6 kinds of extension counterfactual causal graphic models with some variables are extended from two-value distribution to three-value distribution and the sufficient conditions of identifiability are derived.
Merging Belief Propagation and the Mean Field Approximation: A Free Energy Approach
Riegler, Erwin, Kirkelund, Gunvor Elisabeth, Manchón, Carles Navarro, Badiu, Mihai-Alin, Fleury, Bernard Henry
We present a joint message passing approach that combines belief propagation and the mean field approximation. Our analysis is based on the region-based free energy approximation method proposed by Yedidia et al. We show that the message passing fixed-point equations obtained with this combination correspond to stationary points of a constrained region-based free energy approximation. Moreover, we present a convergent implementation of these message passing fixedpoint equations provided that the underlying factor graph fulfills certain technical conditions. In addition, we show how to include hard constraints in the part of the factor graph corresponding to belief propagation. Finally, we demonstrate an application of our method to iterative channel estimation and decoding in an orthogonal frequency division multiplexing (OFDM) system.
Plan-based Policies for Efficient Multiple Battery Load Management
Fox, M., Long, D., Magazzeni, D.
Efficient use of multiple batteries is a practical problem with wide and growing application. The problem can be cast as a planning problem under uncertainty. We describe the approach we have adopted to modelling and solving this problem, seen as a Markov Decision Problem, building effective policies for battery switching in the face of stochastic load profiles. Our solution exploits and adapts several existing techniques: planning for deterministic mixed discrete-continuous problems and Monte Carlo sampling for policy learning. The paper describes the development of planning techniques to allow solution of the non-linear continuous dynamic models capturing the battery behaviours. This approach depends on carefully handled discretisation of the temporal dimension. The construction of policies is performed using a classification approach and this idea offers opportunities for wider exploitation in other problems. The approach and its generality are described in the paper. Application of the approach leads to construction of policies that, in simulation, significantly outperform those that are currently in use and the best published solutions to the battery management problem. We achieve solutions that achieve more than 99% efficiency in simulation compared with the theoretical limit and do so with far fewer battery switches than existing policies. Behaviour of physical batteries does not exactly match the simulated models for many reasons, so to confirm that our theoretical results can lead to real measured improvements in performance we also conduct and report experiments using a physical test system. These results demonstrate that we can obtain 5%-15% improvement in lifetimes in the case of a two battery system.
Elimination of Spurious Ambiguity in Transition-Based Dependency Parsing
Cohen, Shay B., Gómez-Rodríguez, Carlos, Satta, Giorgio
In parsing, spurious ambiguity refers to ambiguity in a grammar that occurs because several derivations exist for an identical syntactic analysis. When the grammar is enriched with probabilities, the existence of spurious ambiguity implies that the statistical model is defined over derivations, a more fine-grained version of the actual syntactic structures of interest. The probability of a syntactic structure then becomes the marginalized probability over all derivations that map to that syntactic structure. Spurious ambiguity can exist in various grammatical models such as combinatory categorial grammars [Steedman, 2001], tree adjoining grammars [Joshi et al., 1975], data-oriented parsing [Bod, 1992] and transition-based dependency parsing [Nivre, 2005].