
Stochastic Normalizing Flows for Inverse Problems: a Markov Chains Viewpoint

arXiv.org Artificial Intelligence

Deep generative models for approximating complicated and often high-dimensional probability distributions have become a rapidly developing research field. Normalizing flows are a popular subclass of these generative models. They can be used to model a target distribution by a simpler latent distribution, which is usually the standard normal distribution. In this paper, we are interested in finite normalizing flows, which are basically concatenations of learned diffeomorphisms. The parameters of the diffeomorphisms are adapted to the target distribution by minimizing a loss function. To this end, the diffeomorphisms must have a tractable Jacobian determinant. For the continuous counterpart of normalizing flows, we refer to the overview paper [43] and the references therein. Suitable architectures of finite normalizing flows include invertible residual neural networks (ResNets) [7, 11, 22], (coupling-based) invertible neural networks (INNs) [4, 14, 29, 34, 40] and autoregressive flows [13, 15, 26, 38].
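To illustrate why a tractable Jacobian determinant matters, here is a minimal sketch of one affine coupling layer, a common building block of coupling-based INNs. It is not taken from the paper: `scale_fn` and `shift_fn` are hypothetical fixed functions standing in for learned subnetworks, and the input dimension is assumed to be even. Because the Jacobian is triangular, its log-determinant is just the sum of the log-scales.

```python
import math


def scale_fn(h):
    # Hypothetical stand-in for a learned subnetwork producing bounded log-scales.
    return [0.5 * math.tanh(v) for v in h]


def shift_fn(h):
    # Hypothetical stand-in for a learned subnetwork producing shifts.
    return [2.0 * v for v in h]


def coupling_forward(x):
    """Map x to y; the first half passes through, the second half is rescaled and shifted."""
    d = len(x) // 2
    x1, x2 = x[:d], x[d:]
    s, t = scale_fn(x1), shift_fn(x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    log_det = sum(s)  # log |det J| of a triangular Jacobian with diagonal exp(s)
    return x1 + y2, log_det


def coupling_inverse(y):
    """Exact inverse of coupling_forward, using the untouched first half."""
    d = len(y) // 2
    y1, y2 = y[:d], y[d:]
    s, t = scale_fn(y1), shift_fn(y1)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2


def log_density(x):
    """Change of variables: log p_X(x) = log N(T(x); 0, I) + log |det J_T(x)|."""
    z, log_det = coupling_forward(x)
    log_latent = -0.5 * sum(v * v for v in z) - 0.5 * len(z) * math.log(2.0 * math.pi)
    return log_latent + log_det
```

Here the layer is read as the map from data to latent space, so `log_density` evaluates the modelled density under a standard normal latent distribution. Stacking several such layers, with the roles of the two halves swapped between layers, gives a finite flow whose total log-determinant is the sum over layers.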


Conversational Agents: Theory and Applications

arXiv.org Artificial Intelligence

In this chapter, we provide a review of conversational agents (CAs), discussing chatbots, intended for casual conversation with a user, as well as task-oriented agents that generally engage in discussions intended to reach one or several specific goals, often (but not always) within a specific domain. We also consider the concept of embodied conversational agents, briefly reviewing aspects such as character animation and speech processing. The many different approaches for representing dialogue in CAs are discussed in some detail, along with methods for evaluating such agents, emphasizing the important topics of accountability and interpretability. A brief historical overview is given, followed by an extensive overview of various applications, especially in the fields of health and education. We end the chapter by discussing benefits and potential risks regarding the societal impact of current and future CA technology.


Trusted Approximate Policy Iteration with Bisimulation Metrics

arXiv.org Artificial Intelligence

Bisimulation metrics define a distance measure between states of a Markov decision process (MDP) based on a comparison of reward sequences. Due to this property, they provide theoretical guarantees in value function approximation. In this work, we first prove that bisimulation metrics can be defined via any $p$-Wasserstein metric for $p\geq 1$. Then we describe an approximate policy iteration (API) procedure that uses $\epsilon$-aggregation with $\pi$-bisimulation and prove performance bounds for continuous state spaces. We bound the difference between $\pi$-bisimulation metrics in terms of the change in the policies themselves. Based on these theoretical results, we design an API($\alpha$) procedure that employs conservative policy updates and enjoys better performance bounds than the naive API approach. In addition, we propose a novel trust region approach which circumvents the requirement to explicitly solve a constrained optimization problem. Finally, we provide experimental evidence of improved stability compared to non-conservative alternatives in simulated continuous control.
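For readers unfamiliar with the $p$-Wasserstein metric mentioned above, the following sketch computes it in the simplest setting: two empirical distributions on the real line with the same number of equally weighted samples, where the optimal coupling matches sorted samples. This only illustrates the metric itself; it is not the bisimulation construction from the paper, and the function name is an assumption made for this example.

```python
def wasserstein_p(xs, ys, p=1.0):
    """p-Wasserstein distance between two equal-size 1-D empirical distributions."""
    if not xs or len(xs) != len(ys):
        raise ValueError("need two non-empty samples of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    # In one dimension the optimal transport plan pairs the sorted samples.
    cost = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys))) / len(xs)
    return cost ** (1.0 / p)
```

For example, shifting every sample by the same amount gives a distance equal to that shift for every $p \geq 1$, whereas moving only part of the mass makes the distance depend on $p$.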


Free Book: Foundations of Data Science (from Microsoft Research Lab) - DataScienceCentral.com

#artificialintelligence

Computer science as an academic discipline began in the 1960s. Emphasis was on programming languages, compilers, operating systems, and the mathematical theory that supported these areas. Courses in theoretical computer science covered finite automata, regular expressions, context-free languages, and computability. In the 1970s, the study of algorithms was added as an important component of theory. The emphasis was on making computers useful.


De-Sequentialized Monte Carlo: a parallel-in-time particle smoother

arXiv.org Machine Learning

Particle smoothers are SMC (Sequential Monte Carlo) algorithms designed to approximate the joint distribution of the states given observations from a state-space model. We propose dSMC (de-Sequentialized Monte Carlo), a new particle smoother that is able to process $T$ observations in $\mathcal{O}(\log T)$ time on parallel architecture. This compares favourably with standard particle smoothers, the complexity of which is linear in $T$. We derive $\mathcal{L}_p$ convergence results for dSMC, with an explicit upper bound, polynomial in $T$. We then discuss how to reduce the variance of the smoothing estimates computed by dSMC by (i) designing good proposal distributions for sampling the particles at the initialization of the algorithm, as well as by (ii) using lazy resampling to increase the number of particles used in dSMC. Finally, we design a particle Gibbs sampler based on dSMC, which is able to perform parameter inference in a state-space model at a $\mathcal{O}(\log(T))$ cost on parallel hardware.


Just Another Method to Compute MTTF from Continuous Time Markov Chain

arXiv.org Artificial Intelligence

The Mean Time To Failure (MTTF) is a statistic used for system analysis in several knowledge areas. This value represents the average time until the system enters one of its possible fault states, without considering system repairs. Although the MTTF is usually considered when analyzing systems with fault states, it can also be used to analyze processes: since processes can be represented by state machine models, the MTTF can represent the mean time until a process finishes. This work presents a method to compute the MTTF from Continuous Time Markov Chain (CTMC) models. There are no arguments demonstrating that this method performs better than other methods, but it has a simpler implementation and is intuitive. The method also allows computing the absorption probabilities and the average holding time of each state without additional steps.
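As a point of comparison, the sketch below shows the standard textbook way to obtain the MTTF from a CTMC, which is not the method proposed in the paper: restrict the generator matrix to the transient (non-failed) states and solve the linear system $Q_{TT}\, m = -\mathbf{1}$, where $m_i$ is the expected time to absorption from state $i$. The helper names and the two-state example are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def mean_time_to_failure(q_transient):
    """Expected time to absorption from each transient state.

    q_transient is the generator restricted to transient states; each row sum
    is minus the rate of jumping from that state into a failure state.
    """
    return solve_linear(q_transient, [-1.0] * len(q_transient))


# Hypothetical example: state 0 degrades to state 1 at rate 2,
# and state 1 fails (is absorbed) at rate 4.
q_tt = [[-2.0, 2.0],
        [0.0, -4.0]]
mttf = mean_time_to_failure(q_tt)
```

In this example the chain must pass through both states, so the MTTF from state 0 is the sum of the two mean holding times, 1/2 + 1/4.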


Generative Flow Networks for Discrete Probabilistic Modeling

arXiv.org Machine Learning

We present energy-based generative flow networks (EB-GFN), a novel probabilistic modeling algorithm for high-dimensional discrete data. Building upon the theory of generative flow networks (GFlowNets), we model the generation process by a stochastic data construction policy and thus amortize expensive MCMC exploration into a fixed number of actions sampled from a GFlowNet. We show how GFlowNets can approximately perform large-block Gibbs sampling to mix between modes. We propose a framework to jointly train a GFlowNet with an energy function, so that the GFlowNet learns to sample from the energy distribution, while the energy learns with an approximate MLE objective with negative samples from the GFlowNet. We demonstrate EB-GFN's effectiveness on various probabilistic modeling tasks.


Fenrir: Physics-Enhanced Regression for Initial Value Problems

arXiv.org Machine Learning

We show how probabilistic numerics can be used to convert an initial value problem into a Gauss--Markov process parametrised by the dynamics of the initial value problem. Consequently, the often difficult problem of parameter estimation in ordinary differential equations is reduced to hyperparameter estimation in Gauss--Markov regression, which tends to be considerably easier. The method's relation to, and benefits in comparison with, classical numerical integration and gradient matching approaches are elucidated. In particular, the method can, in contrast to gradient matching, handle partial observations, and has certain routes for escaping local optima not available to classical numerical integration. Experimental results demonstrate that the method is on par with or moderately better than competing approaches.


AdaAnn: Adaptive Annealing Scheduler for Probability Density Approximation

arXiv.org Machine Learning

Approximating probability distributions can be a challenging task, particularly when they are supported over regions of high geometrical complexity or exhibit multiple modes. Annealing, which is often combined with constant, a priori selected increments in inverse temperature, can be used to facilitate this task. However, using constant increments limits the computational efficiency because the schedule cannot adapt to situations where smooth changes in the annealed density could be handled equally well with larger increments. We introduce AdaAnn, an adaptive annealing scheduler that automatically adjusts the temperature increments based on the expected change in the Kullback-Leibler divergence between two distributions with a sufficiently close annealing temperature. AdaAnn is easy to implement and can be integrated into existing sampling approaches such as normalizing flows for variational inference and Markov chain Monte Carlo. We demonstrate the computational efficiency of the AdaAnn scheduler for variational inference with normalizing flows on a number of examples, including density approximation and parameter estimation for dynamical systems.
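The contrast between constant and adaptive increments can be sketched as follows. For a tempered family $p_t \propto p^t$, a second-order expansion gives $\mathrm{KL}(p_t \,\|\, p_{t+\epsilon}) \approx \tfrac{1}{2}\epsilon^2 \,\mathrm{Var}_{p_t}[\log p]$, so a step can be chosen to keep this approximation at a fixed tolerance. The code below is an illustration under that assumption; the function names and the tolerance parameter are hypothetical, and the abstract does not confirm that this is the exact AdaAnn update rule.

```python
import math
import statistics


def constant_schedule(t0, n_steps):
    """Inverse temperatures from t0 to 1 using equal, a priori fixed increments."""
    step = (1.0 - t0) / n_steps
    return [t0 + k * step for k in range(n_steps + 1)]


def adaptive_increment(log_p_samples, kl_tolerance):
    """Increment eps such that 0.5 * eps**2 * Var[log p] equals kl_tolerance.

    log_p_samples: values of log p(x) at samples from the current annealed density.
    """
    variance = statistics.pvariance(log_p_samples)
    if variance == 0.0:
        return math.inf  # density does not change with temperature; any step is fine
    return math.sqrt(2.0 * kl_tolerance / variance)
```

When the log-density varies little across the current samples, the adaptive step becomes large, and when it varies strongly, the step shrinks. A constant schedule cannot make that distinction.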


Efficient Algorithms for Learning to Control Bandits with Unobserved Contexts

arXiv.org Machine Learning

Contextual bandits are commonly used for sequential decision-making with finitely many control actions. In this setting, available context observations can be utilized in a tractable way, thanks to the linearity of the relationship between the reward and the context vectors. The arms provide rewards depending on the contexts that represent their individual characteristics. The range of real-world applications is notably extensive, including personalized recommendations for Mobile Context-Aware Recommender Systems and mobile-health interventions [1, 2, 3]. To achieve satisfactory performance in bandits, the exploration-exploitation trade-off must be addressed. The theoretical analysis of efficient policies for multi-armed bandits goes back to algorithms that decide based on Upper Confidence Bounds (UCB) [4]. In fact, UCB employs an optimistic approximation of the unknown reward based on the history of observations, to allow an appropriate degree of exploration. Further theoretical results for UCB in contextual bandits, as well as in other settings, are available in the literature [5, 6, 7, 8, 9]. Posterior sampling is another ubiquitous reinforcement learning algorithm that effectively balances exploitation versus exploration.
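The optimism principle behind UCB can be sketched with the classic UCB1 rule for a plain multi-armed bandit without contexts. This is a simpler setting than the one studied in the paper. The reward callback `pull` and the Bernoulli arm means are hypothetical, and rewards are assumed to lie in [0, 1].

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds and return how often each arm was played."""
    counts = [0] * n_arms
    totals = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once to initialise the estimates
        else:
            # Optimistic index: empirical mean plus an exploration bonus that
            # shrinks as the arm is played more often.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        totals[arm] += pull(arm)
        counts[arm] += 1
    return counts


# Hypothetical Bernoulli arms with success probabilities 0.2 and 0.8.
rng = random.Random(0)
means = [0.2, 0.8]
plays = ucb1(lambda a: 1.0 if rng.random() < means[a] else 0.0, len(means), 2000)
```

Arms that have been played rarely keep a large bonus and are therefore revisited occasionally, while most rounds go to the arm with the highest estimated reward.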