# Markov Models

### Unsupervised learning and its role in the knowledge discovery process

Unlike supervised learning, unsupervised learning does not work with labeled data: the machine is never shown the correct answer. Instead, algorithms let the machine form connections by studying and observing the data. Learning and improving by trial and error is the key to unsupervised learning. Knowledge discovery, in turn, is the field of data mining concerned with developing methods, techniques, and algorithms that make sense of the available data. It is useful for finding trends, patterns, correlations, and anomalies in databases, which helps in making accurate decisions for the future.

### Analysis of high-dimensional Continuous Time Markov Chains using the Local Bouncy Particle Sampler

Sampling the parameters of high-dimensional Continuous Time Markov Chains (CTMC) is a challenging problem with important applications in many fields of applied statistics. In this work a recently proposed type of non-reversible rejection-free Markov Chain Monte Carlo (MCMC) sampler, the Bouncy Particle Sampler (BPS), is brought to bear on this problem. BPS has demonstrated favorable computational efficiency compared with state-of-the-art MCMC algorithms; however, to date, applications to real-data scenarios have been scarce. An important aspect of the practical implementation of BPS is the simulation of event times. Default implementations use conservative thinning bounds. Such bounds can slow down the algorithm and limit the computational performance. Our paper develops an algorithm with an exact analytical solution to the random event times in the context of CTMCs. Our local version of the BPS algorithm takes advantage of the sparse structure in the target factor graph, and we also provide a framework for assessing the computational complexity of local BPS algorithms.
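
To make the cost of conservative thinning bounds concrete, the following minimal sketch simulates the first event time of an inhomogeneous Poisson process by thinning under a constant bound. It is not the algorithm from the paper: the intensity `example_rate`, the bound values, and all function names are assumptions chosen for illustration. A looser bound gives the same event-time distribution but wastes more proposals.

```python
import random

def first_event_by_thinning(rate, bound, rng):
    """Sample the first event time of a Poisson process with intensity rate(t).

    Assumes rate(t) <= bound for all t >= 0.
    Returns (event_time, number_of_proposals).
    """
    t = 0.0
    proposals = 0
    while True:
        # Propose the next point of a homogeneous process with rate `bound`.
        t += rng.expovariate(bound)
        proposals += 1
        # Accept the proposal with probability rate(t) / bound.
        if rng.random() * bound <= rate(t):
            return t, proposals

def example_rate(t):
    """Hypothetical intensity: grows linearly, then saturates at 1."""
    return min(t, 1.0)

rng = random.Random(0)
for bound in (1.0, 10.0):
    counts = [first_event_by_thinning(example_rate, bound, rng)[1]
              for _ in range(2000)]
    print(f"bound={bound}: mean proposals per event = {sum(counts) / len(counts):.2f}")
```

An exact analytical solution for the event time, where one is available, removes the rejected proposals altogether.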

### Bayesian Tensor Factorisation for Bottom-up Hidden Tree Markov Models

The Bottom-Up Hidden Tree Markov Model is a highly expressive model for tree-structured data. Unfortunately, it cannot be used in practice due to the intractable size of its state-transition matrix. We propose a new approximation which relies on the Tucker factorisation of tensors. The probabilistic interpretation of this approximation allows us to define a new probabilistic model for tree-structured data. Hence, we define the new approximated model and derive its learning algorithm. Then, we empirically assess the effective power of the new model by evaluating it on two different tasks. In both cases, our model outperforms the other approximated model known in the literature.
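
The size problem and the effect of a Tucker-style factorisation can be sketched as follows. The exact parameterisation used in the paper is not given in the abstract, so the form below is an assumption: a core holding a distribution over the parent state for each tuple of reduced child states, plus one factor per child that maps each child state to a distribution over `rank` reduced states. All names are hypothetical.

```python
import itertools
import random

def full_param_count(n_states, n_children):
    """Entries in the full tensor P(parent | child_1, ..., child_L)."""
    return n_states ** (n_children + 1)

def tucker_param_count(n_states, n_children, rank):
    """Entries in the assumed core plus one factor matrix per child."""
    return n_states * rank ** n_children + n_children * rank * n_states

def random_distribution(size, rng):
    """Random probability vector of the given size."""
    weights = [rng.random() + 1e-9 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def tucker_transition(core, factors, child_states, n_states, rank):
    """P(parent = i | children) = sum_r core[r][i] * prod_l factors[l][q_l][r_l]."""
    probs = [0.0] * n_states
    for r in itertools.product(range(rank), repeat=len(child_states)):
        weight = 1.0
        for l, q in enumerate(child_states):
            weight *= factors[l][q][r[l]]
        for i in range(n_states):
            probs[i] += weight * core[r][i]
    return probs

rng = random.Random(0)
n_states, n_children, rank = 4, 2, 2
core = {r: random_distribution(n_states, rng)
        for r in itertools.product(range(rank), repeat=n_children)}
factors = [[random_distribution(rank, rng) for _ in range(n_states)]
           for _ in range(n_children)]

probs = tucker_transition(core, factors, (0, 3), n_states, rank)
print(sum(probs))  # approximately 1.0: the result is a valid distribution
print(full_param_count(10, 3), tucker_param_count(10, 3, 3))  # 10000 360
```

Because the core and the factors are themselves conditional distributions, the reconstructed transition is a valid distribution by construction, which is one way to read the probabilistic interpretation mentioned above.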

### Neural Markov Logic Networks

We introduce Neural Markov Logic Networks (NMLNs), a statistical relational learning system that borrows ideas from Markov logic. Like Markov Logic Networks (MLNs), NMLNs are an exponential-family model for modelling distributions over possible worlds, but unlike MLNs, they do not rely on explicitly specified first-order logic rules. Instead, NMLNs learn an implicit representation of such rules as a neural network that acts as a potential function on fragments of the relational structure. Interestingly, any MLN can be represented as an NMLN. Similarly to the recently proposed Neural Theorem Provers (NTPs) [Rocktäschel and Riedel, 2017], NMLNs can exploit embeddings of constants but, unlike NTPs, NMLNs also work well in their absence. This is extremely important for prediction in settings other than the transductive one.

### AAIEA2019

The Workshop on Accelerating Artificial Intelligence for Embedded Autonomy aims at gathering researchers and practitioners in the fields of autonomy, automated reasoning, planning algorithms, and embedded systems to discuss the development of novel hardware architectures that can accelerate the wide variety of AI algorithms demanded by advanced autonomous and intelligent systems. Topics of interest include hardware architectures and design methodologies to accelerate: applications based on deep learning, skill-level and instinctive autonomy based on deep reinforcement learning, storage and retrieval of facts in knowledge bases, logical reasoning methods such as deduction, search for classical planning algorithms and Hierarchical Task Networks (HTN), inference in probabilistic models such as Bayesian networks and probabilistic logic, planning algorithms for Markov Decision Processes (MDP), and planning algorithms for Partially Observable Markov Decision Processes (POMDP).

### A Block Diagonal Markov Model for Indoor Software-Defined Power Line Communication

A Semi-Hidden Markov Model (SHMM) for bursty error channels is defined by a state transition probability matrix $A$, a prior probability vector $\Pi$, and a state-dependent output symbol error probability matrix $B$. Several procedures are used for estimating $A$, $\Pi$ and $B$ from an empirically obtained or simulated error sequence. However, even with some restrictions placed on the underlying Markov model structure, estimation remains computationally intensive, especially for a long error sequence containing long bursts of identical symbols. Thus, in this paper we use, under some moderate assumptions, a Markov model with random state transition matrix $A$ that is equivalent to a unique Block Diagonal Markov model with state transition matrix $\Lambda$ to model an indoor software-defined power line communication (PLC) system. A computationally efficient modified Baum-Welch algorithm is used to estimate $\Lambda$ from an experimentally obtained error sequence from the indoor PLC channel. The resulting equivalent Block Diagonal Markov models help designers accelerate and simplify the design and evaluation of novel PLC systems.
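
As a rough illustration of how $A$, $\Pi$ and $B$ jointly score an error sequence, the sketch below runs the forward pass of a plain two-state hidden Markov model, the building block that Baum-Welch estimation repeats at every iteration. It is not the semi-hidden or block diagonal variant described above, and the numeric values (a "good" state with rare errors and a sticky "bad" state with frequent errors) are assumptions for illustration only.

```python
def forward_likelihood(A, Pi, B, observations):
    """Likelihood of a non-empty observation sequence under an HMM.

    A[i][j]: probability of moving from state i to state j.
    Pi[i]:   prior probability of starting in state i.
    B[i][o]: probability that state i emits symbol o (0 = no error, 1 = error).
    """
    n = len(Pi)
    # alpha[i] = P(observations so far, current state = i)
    alpha = [Pi[i] * B[i][observations[0]] for i in range(n)]
    for obs in observations[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs]
                 for j in range(n)]
    return sum(alpha)

# Hypothetical parameters: state 0 is "good", state 1 is "bad" (bursty).
A = [[0.95, 0.05],
     [0.20, 0.80]]
Pi = [0.9, 0.1]
B = [[0.99, 0.01],
     [0.50, 0.50]]

bursty = [1, 1, 1, 0, 0, 0]       # errors clustered in one burst
alternating = [1, 0, 1, 0, 1, 0]  # same error count, spread out
print(forward_likelihood(A, Pi, B, bursty))
print(forward_likelihood(A, Pi, B, alternating))
```

For long sequences the unscaled recursion underflows, so practical implementations rescale `alpha` at each step or work in log space.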

### Privacy Amplification by Mixing and Diffusion Mechanisms

A fundamental result in differential privacy states that the privacy guarantees of a mechanism are preserved by any post-processing of its output. In this paper we investigate under what conditions stochastic post-processing can amplify the privacy of a mechanism. By interpreting post-processing as the application of a Markov operator, we first give a series of amplification results in terms of uniform mixing properties of the Markov process defined by said operator. Next we provide amplification bounds in terms of coupling arguments which can be applied in cases where uniform mixing is not available. Finally, we introduce a new family of mechanisms based on diffusion processes which are closed under post-processing, and analyze their privacy via a novel heat flow argument. As applications, we show that the rate of "privacy amplification by iteration" in Noisy SGD introduced by Feldman et al. [FOCS'18] admits an exponential improvement in the strongly convex case, and propose a simple mechanism based on the Ornstein-Uhlenbeck process which has better mean squared error than the Gaussian mechanism when releasing a bounded function of the data.

### Regret Bounds for Thompson Sampling in Restless Bandit Problems

Restless bandit problems are instances of non-stationary multi-armed bandits. These problems have been studied well from the optimization perspective, where we aim to efficiently find a near-optimal policy when system parameters are known. However, very few papers adopt a learning perspective, where the parameters are unknown. In this paper, we analyze the performance of Thompson sampling in restless bandits with unknown parameters. We consider a general policy map to define our competitor and prove an $\tilde{O}(\sqrt{T})$ Bayesian regret bound. Our competitor is flexible enough to represent various benchmarks including the best fixed action policy, the optimal policy, the Whittle index policy, or the myopic policy. We also present empirical results that support our theoretical findings.
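
The core loop of Thompson sampling can be sketched on a much simpler problem than the one analyzed above: a stationary Bernoulli bandit with Beta(1, 1) priors. This is a deliberate simplification, since in a restless bandit the arm states evolve even when an arm is not played and the sampled parameters describe those dynamics. The arm means and function names are assumptions for illustration.

```python
import random

def thompson_sampling(true_means, horizon, rng):
    """Beta-Bernoulli Thompson sampling; returns the pull count per arm."""
    n_arms = len(true_means)
    successes = [1] * n_arms  # Beta(1, 1) prior on each arm's mean
    failures = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(horizon):
        # Sample one plausible mean per arm from its current posterior.
        samples = [rng.betavariate(successes[a], failures[a])
                   for a in range(n_arms)]
        # Act greedily with respect to the sampled parameters.
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < true_means[arm] else 0
        # Posterior update for the arm that was played.
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=2000, rng=random.Random(0))
print(pulls)  # most pulls should go to the last arm
```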

### Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies

State-of-the-art efficient model-based Reinforcement Learning (RL) algorithms typically act by iteratively solving empirical models, i.e., by performing *full-planning* on Markov Decision Processes (MDPs) built from the gathered experience. In this paper, we focus on model-based RL in the finite-state finite-horizon MDP setting and establish that exploring with *greedy policies* (acting by *1-step planning*) can achieve tight minimax performance in terms of regret, $\tilde{\mathcal{O}}(\sqrt{HSAT})$. Thus, full-planning in model-based RL can be avoided altogether without any performance degradation, and, by doing so, the computational complexity decreases by a factor of $S$. The results are based on a novel analysis of real-time dynamic programming, then extended to model-based RL. Specifically, we generalize existing algorithms that perform full-planning to ones that act by 1-step planning. For these generalizations, we prove regret bounds with the same rate as their full-planning counterparts.
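
The difference between full-planning and 1-step planning can be sketched with real-time dynamic programming on a known model. This is a simplification: the algorithms in the paper plan on optimistic empirical models learned from experience, whereas the sketch assumes the true transitions `P` and rewards `R` are given. The tiny MDP and all names are hypothetical. At each step only the current state's value is backed up, instead of sweeping over all $S$ states.

```python
import random

def greedy_episode(P, R, V, start, horizon, rng):
    """Run one episode acting by 1-step planning; updates V in place.

    P[s][a][s2]: transition probability, R[s][a]: reward in [0, 1],
    V[h][s]: optimistic value estimate at step h (V[horizon] is all zeros).
    """
    state, total = start, 0.0
    for h in range(horizon):
        n_actions = len(R[state])
        # 1-step lookahead from the current state only.
        q = [R[state][a] + sum(p * V[h + 1][s2]
                               for s2, p in enumerate(P[state][a]))
             for a in range(n_actions)]
        action = max(range(n_actions), key=q.__getitem__)
        # Values start optimistic and only ever decrease.
        V[h][state] = min(V[h][state], q[action])
        total += R[state][action]
        next_states = range(len(P[state][action]))
        state = rng.choices(next_states, weights=P[state][action])[0]
    return total

# Hypothetical deterministic MDP: only "stay in state 1" pays reward 1.
P = [[[1.0, 0.0], [0.0, 1.0]],   # state 0: action 0 stays, action 1 moves to 1
     [[0.0, 1.0], [1.0, 0.0]]]   # state 1: action 0 stays, action 1 moves to 0
R = [[0.0, 0.0],
     [1.0, 0.0]]
H = 3
V = [[float(H - h)] * 2 for h in range(H + 1)]  # optimistic initialization

rng = random.Random(0)
returns = [greedy_episode(P, R, V, 0, H, rng) for _ in range(5)]
print(returns)  # [0.0, 2.0, 2.0, 2.0, 2.0]
```

In this example the first episode is spent correcting the optimistic values along the "stay in state 0" path, after which the greedy policy collects the optimal return of 2.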

### Strategy Synthesis in POMDPs via Game-Based Abstractions

We study synthesis problems with constraints in partially observable Markov decision processes (POMDPs), where the objective is to compute a strategy for an agent that is guaranteed to satisfy certain safety and performance specifications. Verification and strategy synthesis for POMDPs are, however, computationally intractable in general. We alleviate this difficulty by focusing on planning applications and exploiting typical structural properties of such scenarios; for instance, we assume that the agent has the ability to observe its own position inside an environment. We propose an abstraction refinement framework which turns such a POMDP model into a (fully observable) probabilistic two-player game (PG). For the obtained PGs, efficient verification and synthesis tools allow us to determine strategies with optimal safety and performance measures, which approximate optimal schedulers on the POMDP. If the approximation is too coarse to satisfy the given specifications, a refinement scheme improves the computed strategies. As a running example, we use planning problems where an agent moves inside an environment with randomly moving obstacles and restricted observability. We demonstrate that the proposed method advances the state of the art by solving problems several orders of magnitude larger than those that can be handled by existing POMDP solvers. Furthermore, this method gives guarantees on safety constraints, which the majority of existing solvers do not support.