Hoy, Darrell (Northwestern University) | Nikolova, Evdokia (University of Texas at Austin)

Mitigating risk in decision-making has been a long-standing problem. Due to the mathematical challenge of its nonlinear nature, especially in adaptive decision-making problems, finding optimal policies is typically intractable. With a focus on efficient algorithms, we ask how well we can approximate the optimal policies for the difficult case of general utility models of risk. Little is known about efficient algorithms beyond the very special cases of linear (risk-neutral) and exponential utilities since general utilities are not separable and preclude the use of traditional dynamic programming techniques. In this paper, we consider general utility functions and investigate efficient computation of approximately optimal routing policies, where the goal is to maximize the expected utility of arriving at a destination around a given deadline. We present an adaptive discretization variant of successive approximation which gives an $\error$-optimal policy in polynomial time. The main insight is to perform discretization at the utility level space, which results in a nonuniform discretization of the domain, and applies for any monotone utility function.

Many machine learning techniques can only be applied to data sets composed entireiy of nominal variabies but a very large proportion of real data sets include continuous variables. One solution to this problem is to partition numeric variables into a number of sub-ranges and treat each such sub-range as a category.

Chen, Yi-Chun, Wheeler, Tim A., Kochenderfer, Mykel J.

Learning Bayesian networks from raw data can help provide insights into the relationships between variables. While real data often contains a mixture of discrete and continuous-valued variables, many Bayesian network structure learning algorithms assume all random variables are discrete. Thus, continuous variables are often discretized when learning a Bayesian network. However, the choice of discretization policy has significant impact on the accuracy, speed, and interpretability of the resulting models. This paper introduces a principled Bayesian discretization method for continuous variables in Bayesian networks with quadratic complexity instead of the cubic complexity of other standard techniques. Empirical demonstrations show that the proposed method is superior to the established minimum description length algorithm. In addition, this paper shows how to incorporate existing methods into the structure learning process to discretize all continuous variables and simultaneously learn Bayesian network structures.

Chen, Yi-Chun, Wheeler, Tim Allan, Kochenderfer, Mykel John

Baker, Jack, Fearnhead, Paul, Fox, Emily B, Nemeth, Christopher

Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, such as the simplex, the time-discretisation error can dominate when we are near the boundary of the space. We demonstrate that while current SGMCMC methods for the simplex perform well in certain cases, they struggle with sparse simplex spaces; when many of the components are close to zero. However, most popular large-scale applications of Bayesian inference on simplex spaces, such as network or topic models, are sparse. We argue that this poor performance is due to the biases of SGMCMC caused by the discretization error. To get around this, we propose the stochastic CIR process, which removes all discretization error and we prove that samples from the stochastic CIR process are asymptotically unbiased. Use of the stochastic CIR process within a SGMCMC algorithm is shown to give substantially better performance for a topic model and a Dirichlet process mixture model than existing SGMCMC approaches.