Genre
RES - a Relative Method for Evidential Reasoning
An, Zhi, Bell, David A., Hughes, John G.
In this paper we describe a novel method for evidential reasoning [1]. It involves modelling the process of evidential reasoning in three steps, namely, evidence structure construction, evidence accumulation, and decision making. The proposed method, called RES, is novel in that evidence strength is associated with an evidential support relationship (an argument) between a pair of statements and such strength is carried by comparison between arguments. This is in contrast to the onventional approaches, where evidence strength is represented numerically and is associated with a statement.
Mean Field Theory of Dynamical Systems Driven by External Signals
Our understanding of non linear dynamical systems and networks has made tremendous progress during the past decades. In most cases the autonomous dynamics is studied. The situation where the network is strongly driven by an external signal has so far been less investigated even though it arises in many different contexts in the natural and artificial world. Examples include networks of interacting chemicals (proteins, RNA) in a cell driven by unpredictable external chemical signals; networks of neurons driven by an external sensory input; artificial neural networks and their applications in machine learning; the response of population dynamics and ecological networks to changes in external conditions such as the weather; the responses of stock prices to economically significant news such as a company earnings, or unemployment numbers. In all these cases taking into account the external input is essential if one wants to understand correctly the dynamics, both because the external input is often large (it cannot be treated as a small perturbation), and because in some cases the systems itself has been selected according to its response to the fluctuating and unpredictable external variables. The aim of the present work is to show, through the study of a specific but important example, how mean field techniques can provide a detailed understanding of dynamical networks strongly driven by an external signal. In the mean field approach the average feedback of the variables on themselves is taken into account through a self consistent equation, while the correlations between individual variables are neglected. The apparently extremely complicated dynamics of the network is thus reduced to much simpler evolution equations for a few collective variables. Previous applications of the mean field approach to dynamical systems (but without including an external input), and in particular neural networks, include e.g.
Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions
Abbasi-Yadkori, Yasin, Bartlett, Peter L., Szepesvari, Csaba
We study the problem of learning Markov decision processes with finite state and action spaces when the transition probability distributions and loss functions are chosen adversarially and are allowed to change with time. We introduce an algorithm whose regret with respect to any policy in a comparison class grows as the square root of the number of rounds of the game, provided the transition probabilities satisfy a uniform mixing condition. Our approach is efficient as long as the comparison class is polynomial and we can compute expectations over sample paths for each policy. Designing an efficient algorithm with small regret for the general case remains an open problem.
Linear system identification using stable spline kernels and PLQ penalties
Aravkin, Aleksandr Y., Burke, James V., Pillonetto, Gianluigi
The classical approach to linear system identification is given by parametric Prediction Error Methods (PEM). In this context, model complexity is often unknown so that a model order selection step is needed to suitably trade-off bias and variance. Recently, a different approach to linear system identification has been introduced, where model order determination is avoided by using a regularized least squares framework. In particular, the penalty term on the impulse response is defined by so called stable spline kernels. They embed information on regularity and BIBO stability, and depend on a small number of parameters which can be estimated from data. In this paper, we provide new nonsmooth formulations of the stable spline estimator. In particular, we consider linear system identification problems in a very broad context, where regularization functionals and data misfits can come from a rich set of piecewise linear quadratic functions. Moreover, our anal- ysis includes polyhedral inequality constraints on the unknown impulse response. For any formulation in this class, we show that interior point methods can be used to solve the system identification problem, with complexity O(n3)+O(mn2) in each iteration, where n and m are the number of impulse response coefficients and measurements, respectively. The usefulness of the framework is illustrated via a numerical experiment where output measurements are contaminated by outliers.
Toward Optimal Stratification for Stratified Monte-Carlo Integration
Carpentier, Alexandra, Munos, Remi
We consider the problem of adaptive stratified sampling for Monte Carlo integration of a noisy function, given a finite budget n of noisy evaluations to the function. We tackle in this paper the problem of adapting to the function at the same time the number of samples into each stratum and the partition itself. More precisely, it is interesting to refine the partition of the domain in area where the noise to the function, or where the variations of the function, are very heterogeneous. On the other hand, having a (too) refined stratification is not optimal. Indeed, the more refined the stratification, the more difficult it is to adjust the allocation of the samples to the stratification, i.e. sample more points where the noise or variations of the function are larger. We provide in this paper an algorithm that selects online, among a large class of partitions, the partition that provides the optimal trade-off, and allocates the samples almost optimally on this partition.
Variational Inference in Nonconjugate Models
Mean-field variational methods are widely used for approximate posterior inference in many probabilistic models. In a typical application, mean-field methods approximately compute the posterior with a coordinate-ascent optimization algorithm. When the model is conditionally conjugate, the coordinate updates are easily derived and in closed form. However, many models of interest---like the correlated topic model and Bayesian logistic regression---are nonconjuate. In these models, mean-field methods cannot be directly applied and practitioners have had to develop variational algorithms on a case-by-case basis. In this paper, we develop two generic methods for nonconjugate models, Laplace variational inference and delta method variational inference. Our methods have several advantages: they allow for easily derived variational algorithms with a wide class of nonconjugate models; they extend and unify some of the existing algorithms that have been derived for specific models; and they work well on real-world datasets. We studied our methods on the correlated topic model, Bayesian logistic regression, and hierarchical Bayesian logistic regression.
Fairness in Academic Course Timetabling
Mühlenthaler, Moritz, Wanka, Rolf
We consider the problem of creating fair course timetables in the setting of a university. Our motivation is to improve the overall satisfaction of individuals concerned (students, teachers, etc.) by providing a fair timetable to them. The central idea is that undesirable arrangements in the course timetable, i.e., violations of soft constraints, should be distributed in a fair way among the individuals. We propose two formulations for the fair course timetabling problem that are based on max-min fairness and Jain's fairness index, respectively. Furthermore, we present and experimentally evaluate an optimization algorithm based on simulated annealing for solving max-min fair course timetabling problems. The new contribution is concerned with measuring the energy difference between two timetables, i.e., how much worse a timetable is compared to another timetable with respect to max-min fairness. We introduce three different energy difference measures and evaluate their impact on the overall algorithm performance. The second proposed problem formulation focuses on the tradeoff between fairness and the total amount of soft constraint violations. Our experimental evaluation shows that the known best solutions to the ITC2007 curriculum-based course timetabling instances are quite fair with respect to Jain's fairness index. However, the experiments also show that the fairness can be improved further for only a rather small increase in the total amount of soft constraint violations.
Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities
Vatanen, Tommi, Raiko, Tapani, Valpola, Harri, LeCun, Yann
Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. We continue the work by firstly introducing a third transformation to normalize the scale of the outputs of each hidden neuron, and secondly by analyzing the connections to second order optimization methods. We show that the transformations make a simple stochastic gradient behave closer to second-order optimization methods and thus speed up learning. This is shown both in theory and with experiments. The experiments on the third transformation show that while it further increases the speed of learning, it can also hurt performance by converging to a worse local optimum, where both the inputs and outputs of many hidden neurons are close to zero.
Refinement revisited with connections to Bayes error, conditional entropy and calibrated classifiers
The concept of refinement from probability elicitation is considered for proper scoring rules. Taking directions from the axioms of probability, refinement is further clarified using a Hilbert space interpretation and reformulated into the underlying data distribution setting where connections to maximal marginal diversity and conditional entropy are considered and used to derive measures that provide arbitrarily tight bounds on the Bayes error. Refinement is also reformulated into the classifier output setting and its connections to calibrated classifiers and proper margin losses are established.
Visualizing and Interacting with Concept Hierarchies
Crampes, Michel, Plantié, Michel
Concept Hierarchies and Formal Concept Analysis are theoretically well grounded and largely experimented methods. They rely on line diagrams called Galois lattices for visualizing and analysing object-attribute sets. Galois lattices are visually seducing and conceptually rich for experts. However they present important drawbacks due to their concept oriented overall structure: analysing what they show is difficult for non experts, navigation is cumbersome, interaction is poor, and scalability is a deep bottleneck for visual interpretation even for experts. In this paper we introduce semantic probes as a means to overcome many of these problems and extend usability and application possibilities of traditional FCA visualization methods. Semantic probes are visual user centred objects which extract and organize reduced Galois sub-hierarchies. They are simpler, clearer, and they provide a better navigation support through a rich set of interaction possibilities. Since probe driven sub-hierarchies are limited to users focus, scalability is under control and interpretation is facilitated. After some successful experiments, several applications are being developed with the remaining problem of finding a compromise between simplicity and conceptual expressivity.