 Uncertainty


GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models

arXiv.org Artificial Intelligence

Modeling and generating graphs is fundamental for studying networks in biology, engineering, and social sciences. However, modeling complex distributions over graphs and then efficiently sampling from these distributions is challenging due to the non-unique, high-dimensional nature of graphs and the complex, non-local dependencies that exist between edges in a given graph. Here we propose GraphRNN, a deep autoregressive model that addresses the above challenges and approximates any distribution of graphs with minimal assumptions about their structure. GraphRNN learns to generate graphs by training on a representative set of graphs and decomposes the graph generation process into a sequence of node and edge formations, conditioned on the graph structure generated so far. In order to quantitatively evaluate the performance of GraphRNN, we introduce a benchmark suite of datasets, baselines and novel evaluation metrics based on Maximum Mean Discrepancy, which measure distances between sets of graphs. Our experiments show that GraphRNN significantly outperforms all baselines, learning to generate diverse graphs that match the structural characteristics of a target set, while also scaling to graphs 50 times larger than previous deep models.
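The Maximum Mean Discrepancy idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes each graph has already been summarised as a fixed-length feature vector (for example, a normalised degree histogram) and it uses a Gaussian kernel. The abstract does not state the paper's kernel or descriptors, so both are assumptions here, as are the function names and the toy data.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs (x, y), x from xs and y from ys."""
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))

def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (
        mean_kernel(xs, xs, kernel)
        + mean_kernel(ys, ys, kernel)
        - 2.0 * mean_kernel(xs, ys, kernel)
    )

# Hypothetical per-graph descriptors (made-up degree histograms).
generated = [[0.5, 0.5, 0.0], [0.4, 0.6, 0.0]]
reference = [[0.1, 0.3, 0.6], [0.0, 0.4, 0.6]]

print(mmd_squared(generated, generated))  # a set compared with itself
print(mmd_squared(generated, reference))  # two visibly different sets
```

A set compared with itself yields zero, and the estimate grows as the two sets of descriptors move apart, which is what makes it usable as a distance between sets of graphs.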


Learning Traffic Flow Dynamics using Random Fields

arXiv.org Machine Learning

This paper presents a mesoscopic stochastic model for the reconstruction of vehicle trajectories from data made available by subsets of (probe) vehicles. Long-range vehicle interactions are applied in a totally asymmetric simple exclusion process to capture information made available to connected and autonomous vehicles. The dynamics are represented by a factor graph, which enables learning of traffic dynamics from historical data using Bayesian belief propagation. Adequate probe penetration levels for faithful reconstruction on single-lane roads are investigated. The estimation technique is tested using a vehicle trajectory dataset generated using an independent microscopic traffic simulator. Although the parameters of the traffic state estimation model are learned from (simulated) historical data, the proposed algorithm is found to be robust to unpredictable conditions. Moreover, by exposing the algorithm to varying traffic conditions with increasingly larger datasets, the probe penetration rates required to capture the traffic dynamics effectively can be substantially reduced. The results also highlight the need to take into account randomness in the spatio-temporal coverage associated with probe data for reliable state estimation algorithms.


Tensor Monte Carlo: particle methods for the GPU era

arXiv.org Machine Learning

Multi-sample objectives improve over single-sample estimates by giving tighter variational bounds and more accurate estimates of posterior uncertainty. However, these multi-sample techniques scale poorly, in the sense that the number of samples required to maintain the same quality of posterior approximation scales exponentially in the number of latent dimensions. One approach to addressing these issues is sequential Monte Carlo (SMC). However, for many problems SMC is prohibitively slow because the resampling step imposes an inherently sequential structure on the computation, which is difficult to parallelise effectively on GPU hardware. We developed tensor Monte Carlo to address these issues. In particular, whereas the usual multi-sample objective draws $K$ samples from a joint distribution over all latent variables, we draw $K$ samples for each of the $n$ individual latent variables, and form our bound by averaging over all $K^n$ combinations of samples from each individual latent. While this sum over exponentially many terms might seem to be intractable, in many cases it can be efficiently computed by exploiting conditional independence structure. In particular, we generalise and simplify classical algorithms such as message passing by noting that these sums can be written in an extremely simple, general form: a series of tensor inner-products which can be depicted graphically as reductions of a factor graph. As such, we can straightforwardly combine summation over discrete variables with importance sampling over continuous variables.
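To make the claim about the $K^n$-term average concrete, here is a small sketch for one assumed special case: a chain, where each latent depends only on its predecessor. The average over all combinations then factorises into a product of $K \times K$ tables. The tables are random placeholders standing in for importance weights, and the function names are hypothetical; this is not the authors' implementation.

```python
import itertools
import random

def brute_force_average(first, pairwise):
    """Average the product of factors over all K**n index combinations."""
    K = len(first)
    n = len(pairwise) + 1
    total = 0.0
    for ks in itertools.product(range(K), repeat=n):
        term = first[ks[0]]
        for i, factor in enumerate(pairwise):
            term *= factor[ks[i]][ks[i + 1]]
        total += term
    return total / K ** n

def contracted_average(first, pairwise):
    """Same average, computed by summing out one latent at a time."""
    K = len(first)
    message = [value / K for value in first]  # 1/K for the first latent
    for factor in pairwise:
        # Inner product over the previous latent's samples;
        # the extra 1/K accounts for the newly introduced latent.
        message = [
            sum(message[j] * factor[j][k] for j in range(K)) / K
            for k in range(K)
        ]
    return sum(message)

random.seed(0)
K, n = 3, 4  # assumed toy sizes: 3 samples for each of 4 latents
first = [random.random() for _ in range(K)]
pairwise = [
    [[random.random() for _ in range(K)] for _ in range(K)]
    for _ in range(n - 1)
]
slow = brute_force_average(first, pairwise)
fast = contracted_average(first, pairwise)
print(slow, fast)  # the two values agree up to floating-point rounding
```

The brute-force version visits $K^n$ terms, while the contraction needs on the order of $(n-1)K^2$ multiplications, which is the saving that conditional independence makes possible in this chain-structured example.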


A data-driven model order reduction approach for Stokes flow through random porous media

arXiv.org Machine Learning

Direct numerical simulation of Stokes flow through an impermeable, rigid body matrix by finite elements requires meshes fine enough to resolve the pore-size scale and is thus a computationally expensive task. The cost is significantly amplified when randomness in the pore microstructure is present and therefore multiple simulations need to be carried out. It is well known that in the limit of scale-separation, Stokes flow can be accurately approximated by Darcy's law with an effective diffusivity field depending on viscosity and the pore-matrix topology. We propose a fully probabilistic, Darcy-type, reduced-order model which, based on only a few tens of full-order Stokes model runs, is capable of learning a map from the fine-scale topology to the effective diffusivity and is maximally predictive of the fine-scale response. The reduced-order model learned can significantly accelerate uncertainty quantification tasks as well as provide quantitative confidence metrics of the predictive estimates produced.


Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop

arXiv.org Artificial Intelligence

Active inference is an ambitious theory that treats perception, inference and action selection of autonomous agents under the heading of a single principle. It suggests biologically plausible explanations for many cognitive phenomena, including consciousness. In active inference, action selection is driven by an objective function that evaluates possible future actions with respect to current, inferred beliefs about the world. Active inference at its core is independent from extrinsic rewards, resulting in a high level of robustness across, e.g., different environments or agent morphologies. In the literature, paradigms that share this independence have been summarised under the notion of intrinsic motivations. In general and in contrast to active inference, these models of motivation come without a commitment to particular inference and action selection mechanisms. In this article, we study whether the inference and action selection machinery of active inference can also be used by alternatives to the originally included intrinsic motivation. The perception-action loop explicitly relates inference and action selection to the environment and agent memory, and is consequently used as the foundation for our analysis. We reconstruct the active inference approach, locate the original formulation within, and show how alternative intrinsic motivations can be used while keeping many of the original features intact. Furthermore, we illustrate the connection to universal reinforcement learning by means of our formalism. Active inference research may profit from comparisons of the dynamics induced by alternative intrinsic motivations. Research on intrinsic motivations may profit from an additional way to implement intrinsically motivated agents that also share the biological plausibility of active inference.


Probabilistic PARAFAC2

arXiv.org Machine Learning

The PARAFAC2 is a multimodal factor analysis model suitable for analyzing multi-way data when one of the modes has incomparable observation units, for example because of differences in signal sampling or batch sizes. A fully probabilistic treatment of the PARAFAC2 is desirable in order to improve robustness to noise and provide a well-founded principle for determining the number of factors, but challenging because the factor loadings are constrained to be orthogonal. We develop two probabilistic formulations of the PARAFAC2 along with variational procedures for inference: in one approach, the mean values of the factor loadings are orthogonal, leading to closed-form variational updates, and in the other, the factor loadings themselves are orthogonal using a matrix von Mises-Fisher distribution. We contrast our probabilistic formulation to the conventional direct fitting algorithm based on maximum likelihood. On simulated data and real fluorescence spectroscopy and gas chromatography-mass spectrometry data, we compare our approach to the conventional PARAFAC2 model estimation and find that the probabilistic formulation is more robust to noise and model order misspecification. The probabilistic PARAFAC2 thus forms a promising framework for modeling multi-way data accounting for uncertainty.


Neural-net-induced Gaussian process regression for function approximation and PDE solution

arXiv.org Machine Learning

Neural-net-induced Gaussian process (NNGP) regression inherits both the high expressivity of deep neural networks (deep NNs) and the uncertainty quantification property of Gaussian processes (GPs). We generalize the current NNGP to first include a larger number of hyperparameters and subsequently train the model by maximum likelihood estimation. Unlike previous works on NNGP that targeted classification, here we apply the generalized NNGP to function approximation and to solving partial differential equations (PDEs). Specifically, we develop an analytical iteration formula to compute the covariance function of the GP induced by a deep NN with an error-function nonlinearity. We compare the performance of the generalized NNGP for function approximations and PDE solutions with those of GPs and fully-connected NNs. We observe that for smooth functions the generalized NNGP can yield the same order of accuracy as GP, while both NNGP and GP outperform deep NN. For non-smooth functions, the generalized NNGP is superior to GP and comparable or superior to deep NN.
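As a rough illustration of what an analytical covariance iteration for an error-function nonlinearity can look like, the sketch below uses the well-known closed form for the expectation of a product of two erf units under a bivariate Gaussian (the arcsine kernel). The input-layer scaling, the shared per-layer variances `sigma_w2` and `sigma_b2`, and the function names are assumptions for illustration. The paper's generalized formula, with its larger set of hyperparameters, may differ.

```python
import math

def erf_layer(k_xy, k_xx, k_yy, sigma_w2, sigma_b2):
    """Covariance after one erf layer, given the previous layer's covariances."""
    ratio = 2.0 * k_xy / math.sqrt((1.0 + 2.0 * k_xx) * (1.0 + 2.0 * k_yy))
    return sigma_b2 + sigma_w2 * (2.0 / math.pi) * math.asin(ratio)

def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def nngp_erf_kernel(x, y, depth, sigma_w2=1.0, sigma_b2=0.1):
    """Iterate the covariance through `depth` erf layers (assumed setup)."""
    d = len(x)
    # Assumed input-layer covariance: scaled inner product plus bias variance.
    k_xy = sigma_b2 + sigma_w2 * dot(x, y) / d
    k_xx = sigma_b2 + sigma_w2 * dot(x, x) / d
    k_yy = sigma_b2 + sigma_w2 * dot(y, y) / d
    for _ in range(depth):
        # All three entries are updated from the previous layer's values.
        k_xy, k_xx, k_yy = (
            erf_layer(k_xy, k_xx, k_yy, sigma_w2, sigma_b2),
            erf_layer(k_xx, k_xx, k_xx, sigma_w2, sigma_b2),
            erf_layer(k_yy, k_yy, k_yy, sigma_w2, sigma_b2),
        )
    return k_xy

x = [1.0, 0.0]
y = [0.5, 0.5]
print(nngp_erf_kernel(x, y, depth=3))
```

Because every step is a closed-form expression, the covariance between any two inputs is obtained without sampling network weights, which is what allows the hyperparameters to be fitted by maximum likelihood as in ordinary GP regression.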


Companies involved in AI or ML

#artificialintelligence

AppZen – uses artificial intelligence to automate expense report audits.
ArgyleData – a software maker that uses big data and machine learning to detect and stop fraud for telecom companies. Also see FraudTechWire.com.
Attrasoft – provider of a number of neural-network-based products for image and sound recognition/retrieval, trend prediction and data mining.
Acquired Intelligence Inc – creators of the ACQUIRE line of administration, operations and customer support products in stand-alone or web-based applications. Includes profile, demo downloads, and job openings.


Compiling Probabilistic Model Checking into Probabilistic Planning

AAAI Conferences

It has previously been observed that the verification of safety properties in deterministic model-checking frameworks can be compiled into classical planning. A similar connection exists between goal probability analysis on either side, yet that connection has not been explored. We fill that gap with a translation from Jani, an input language for quantitative model checkers including the Modest toolset and PRISM, into PPDDL. Our experiments motivate further cross-fertilization between both research areas, specifically the exchange of algorithms. Our study also initiates the creation of new benchmarks for goal probability analysis.


Random Feature Stein Discrepancies

arXiv.org Machine Learning

Computable Stein discrepancies have been deployed for a variety of applications, including sampler selection in posterior inference, approximate Bayesian inference, and goodness-of-fit testing. Existing convergence-determining Stein discrepancies admit strong theoretical guarantees but suffer from a computational cost that grows quadratically in the sample size. While linear-time Stein discrepancies have been proposed for goodness-of-fit testing, they exhibit avoidable degradations in testing power---even when power is explicitly optimized. To address these shortcomings, we introduce feature Stein discrepancies ($\Phi$SDs), a new family of quality measures that can be cheaply approximated using importance sampling. We show how to construct $\Phi$SDs that provably determine the convergence of a sample to its target and develop high-accuracy approximations---random $\Phi$SDs (R$\Phi$SDs)---which are computable in near-linear time. In our experiments with sampler selection for approximate posterior inference and goodness-of-fit testing, R$\Phi$SDs typically perform as well as or better than quadratic-time KSDs while being orders of magnitude faster to compute.