Uncertainty
Infinite Shift-invariant Grouped Multi-task Learning for Gaussian Processes
Wang, Yuyang, Khardon, Roni, Protopapas, Pavlos
Multi-task learning leverages shared information among data sets to improve the learning performance of individual tasks. The paper applies this framework for data where each task is a phase-shifted periodic time series. In particular, we develop a novel Bayesian nonparametric model capturing a mixture of Gaussian processes where each task is a sum of a group-specific function and a component capturing individual variation, in addition to each task being phase shifted. We develop an efficient \textsc{em} algorithm to learn the parameters of the model. As a special case we obtain the Gaussian mixture model and \textsc{em} algorithm for phased-shifted periodic time series. Furthermore, we extend the proposed model by using a Dirichlet Process prior and thereby leading to an infinite mixture model that is capable of doing automatic model selection. A Variational Bayesian approach is developed for inference in this model. Experiments in regression, classification and class discovery demonstrate the performance of the proposed models using both synthetic data and real-world time series data from astrophysics. Our methods are particularly useful when the time series are sparsely and non-synchronously sampled.
A Transfer Learning Approach for Learning Temporal Nodes Bayesian Networks
Cameras, Lindsey Jennifer Fiedler (Instituto Nacional de Astrofรญsica, รptica y Electrรณnica) | Sucar, Luis Enrique (Instituto Nacional de Astrofรญsica, รptica y Electrรณnica) | Morales, Eduardo F. (Instituto Nacional de Astrofรญsica, รptica y Electrรณnica)
Situations where there is insufficient information to learn from often arise, and the process to recollect data can be expensive or in some cases take too long resulting in outdated models. Transfer learning strategies have proven to be a powerful technique to learn models from several sources when a single source does not provide enough information. In this work we present a methodology to learn a Temporal Nodes Bayesian Network by transferring knowledge from several different but related domains. Experiments based on a reference network show promising results, supporting our claim that transfer learning is a viable strategy to learn these models when scarce data is available.
Decision Tables Aggregation in Rough Sets Approximation
Chakhar, Salem (University Laval) | Pusceddu, Clara (University of Sassari)
The Dominance-based Rough Set Approach (DRSA) is an extension of Rough Sets Theory to handle multicriteria classification problems by authorizing preference-ordered attributes. The DRSA assumes the existence of a single decision table while real-world decision problems imply generally several experts with different decision tables. The objective of this paper is to propose an algorithm for the aggregation of a set of decision tables, as a first step for approximating these tables. The algorithm is illustrated using real-world data.
An Empirical Comparison of Bayesian Network Parameter Learning Algorithms for Continuous Data Streams
Ratnapinda, Parot (University of Pittsburgh) | Druzdzel, Marek J. ( University of Pittsburgh Biaลystok University of Technology Biaลystok )
We compare three approaches to learning numerical parameters of Bayesian networks from continuous data streams: (1) the EM algorithm applied to all data, (2) the EM algorithm applied to data increments, and (3) the online EM algorithm. Our results show that learning from all data at each step, whenever feasible, leads to the highest parameter accuracy and model classification accuracy. When facing computational limitations, incremental learning approaches are a reasonable alternative. Of these, online EM is reasonably fast, and similar to the incremental EM algorithm in terms of accuracy. For small data sets, incremental EM seems to lead to better accuracy. When the data size gets large, online EM tends to be more accurate.
A Temporal Logic for Planning under Uncertainty
Biscaia, Manuel (IST, Technical University of Lisbon) | Baltazar, Pedro (IST,ย Technical University of Lisbon) | Mateus, Paulo (IST,ย Technical University of Lisbon) | Nagarajan, Rajagopal (Middlesex University, London)
Dealing with uncertainty in the context of planning has been an active research subject in AI. Addressing the case when uncertainty evolves over time can be difficult. In this work, we provide a solution to this problem by proposing a temporal logic to reason about quantities and probability. For this logic, we provide a PSPACE SAT algorithm together with a complete calculus. The algorithm enables us to perform planning under uncertainty via SAT, extending a technique used for classic planning. We can show that any obtained plan will have certain properties (desired or undesired). The calculus can also be used to derive the impossibility of a plan, given a set of specifications.
A Fuzzy Logic Computational Model for Emotion Regulation Based on Gross
Soleimani, Ahmad (University of Windsor, ON) | Kobti, Ziad (University of Windsor, ON)
Emotion regulation looks into methods and strategiesthat humans use in order to control and balance theirpossible extreme levels of emotions. One importantchallenge in building a computational model of emotionsis the mainly non-quantitative nature of this problem.In this paper, we investigate a Fuzzy logic approachas a possible framework for providing the requiredqualitative and quantitative description of suchmodels. In our proposed fuzzy computational modelwhich was constructed based on Gross theory for emotionregulation, beside the fuzzy structure, it includesa learning module that enhances the model adaptivityto environmental changes through learning some relevantaspects such as patterns of eventsโ sequences. Theresults of the simulation experiments were comparedagainst a formerly presented non-fuzzy implementation.We observed that the agents in our proposed modelmanaged to cope better with changes in the environmentand exhibited smoother regulation behavior. Moreover,our model showed further consistency with the inferentialrules of Gross theory.
Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families
Bartlett, Peter, Grunwald, Peter, Harremoes, Peter, Hedayati, Fares, Kotlowski, Wojciech
We study online learning under logarithmic loss with regular parametric models. Hedayati and Bartlett (2012b) showed that a Bayesian prediction strategy with Jeffreys prior and sequential normalized maximum likelihood (SNML) coincide and are optimal if and only if the latter is exchangeable, and if and only if the optimal strategy can be calculated without knowing the time horizon in advance. They put forward the question what families have exchangeable SNML strategies. This paper fully answers this open problem for one-dimensional exponential families. The exchangeability can happen only for three classes of natural exponential family distributions, namely the Gaussian, Gamma, and the Tweedie exponential family of order 3/2. Keywords: SNML Exchangeability, Exponential Family, Online Learning, Logarithmic Loss, Bayesian Strategy, Jeffreys Prior, Fisher Information1
Efficient Density Estimation via Piecewise Polynomial Approximation
Chan, Siu-On, Diakonikolas, Ilias, Servedio, Rocco A., Sun, Xiaorui
We give a highly efficient "semi-agnostic" algorithm for learning univariate probability distributions that are well approximated by piecewise polynomial density functions. Let $p$ be an arbitrary distribution over an interval $I$ which is $\tau$-close (in total variation distance) to an unknown probability distribution $q$ that is defined by an unknown partition of $I$ into $t$ intervals and $t$ unknown degree-$d$ polynomials specifying $q$ over each of the intervals. We give an algorithm that draws $\tilde{O}(t\new{(d+1)}/\eps^2)$ samples from $p$, runs in time $\poly(t,d,1/\eps)$, and with high probability outputs a piecewise polynomial hypothesis distribution $h$ that is $(O(\tau)+\eps)$-close (in total variation distance) to $p$. This sample complexity is essentially optimal; we show that even for $\tau=0$, any algorithm that learns an unknown $t$-piecewise degree-$d$ probability distribution over $I$ to accuracy $\eps$ must use $\Omega({\frac {t(d+1)} {\poly(1 + \log(d+1))}} \cdot {\frac 1 {\eps^2}})$ samples from the distribution, regardless of its running time. Our algorithm combines tools from approximation theory, uniform convergence, linear programming, and dynamic programming. We apply this general algorithm to obtain a wide range of results for many natural problems in density estimation over both continuous and discrete domains. These include state-of-the-art results for learning mixtures of log-concave distributions; mixtures of $t$-modal distributions; mixtures of Monotone Hazard Rate distributions; mixtures of Poisson Binomial Distributions; mixtures of Gaussians; and mixtures of $k$-monotone densities. Our general technique yields computationally efficient algorithms for all these problems, in many cases with provably optimal sample complexities (up to logarithmic factors) in all parameters.
Structure Discovery in Nonparametric Regression through Compositional Kernel Search
Duvenaud, David, Lloyd, James Robert, Grosse, Roger, Tenenbaum, Joshua B., Ghahramani, Zoubin
Despite its importance, choosing the structural form of the kernel in nonparametric regression remains a black art. We define a space of kernel structures which are built compositionally by adding and multiplying a small number of base kernels. We present a method for searching over this space of structures which mirrors the scientific discovery process. The learned structures can often decompose functions into interpretable components and enable long-range extrapolation on time-series datasets. Our structure search method outperforms many widely used kernels and kernel combination methods on a variety of prediction tasks.