 Optimization


APRICODD: Approximate Policy Construction Using Decision Diagrams

Neural Information Processing Systems

We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and policies with much lower time and space requirements than exact dynamic programming. Our method reduces the sizes of the intermediate value functions generated during value iteration by replacing the values at the terminals of the ADD with ranges of values. Our method is demonstrated on a class of large MDPs (with up to 34 billion states), and we compare the results with the optimal value functions.
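The core approximation step, replacing exact terminal values with ranges, can be illustrated without a decision-diagram library. The sketch below is only an illustration of that idea, not the authors' implementation: it uses a plain dictionary in place of an ADD, and the names `merge_values`, `approximate`, and `tolerance` are assumptions introduced here.

```python
def merge_values(values, tolerance):
    """Group sorted distinct values into ranges whose span is <= tolerance.

    Returns a dict mapping each original value to its (low, high) range.
    """
    mapping = {}
    group = []
    for v in sorted(set(values)):
        # Start a new range once v is too far from the lowest value in the group.
        if group and v - group[0] > tolerance:
            for g in group:
                mapping[g] = (group[0], group[-1])
            group = []
        group.append(v)
    for g in group:
        mapping[g] = (group[0], group[-1])
    return mapping


def approximate(value_fn, tolerance):
    """Replace every state's value by the range that contains it."""
    ranges = merge_values(value_fn.values(), tolerance)
    return {state: ranges[v] for state, v in value_fn.items()}


# Hypothetical value function over five states (stand-in for ADD terminals).
value_fn = {"s0": 0.0, "s1": 0.05, "s2": 0.98, "s3": 1.0, "s4": 5.0}
approx = approximate(value_fn, tolerance=0.1)

distinct_before = len(set(value_fn.values()))
distinct_after = len(set(approx.values()))
print(distinct_before, distinct_after)
```

In a real ADD, terminals that collapse to the same range let the subgraphs above them be shared, which is where the reduction in size would come from; the dictionary here only shows the drop in the number of distinct terminal values.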


Constrained Independent Component Analysis

Neural Information Processing Systems

The paper presents a novel technique of constrained independent component analysis (CICA) to introduce constraints into the classical ICA and solve the constrained optimization problem by using Lagrange multiplier methods. This paper shows that CICA can be used to order the resulting independent components in a specific manner and normalize the demixing matrix in the signal separation procedure. It can systematically eliminate the ICA's indeterminacy on permutation and dilation. The experiments demonstrate the use of CICA in ordering of independent components while providing normalized demixing processes. Keywords: Independent component analysis, constrained independent component analysis, constrained optimization, Lagrange multiplier methods 1 Introduction Independent component analysis (ICA) is a technique to transform a multivariate random signal into a signal with components that are mutually independent in the complete statistical sense [1]. There has been a growing interest in research for efficient realization of ICA neural networks (ICNNs).


A Linear Programming Approach to Novelty Detection

Neural Information Processing Systems

Novelty detection involves modeling the normal behaviour of a system, hence enabling detection of any divergence from normality. It has potential applications in many areas such as detection of machine damage or highlighting abnormal features in medical data. One approach is to build a hypothesis estimating the support of the normal data, i.e. constructing a function which is positive in the region where the data is located and negative elsewhere. Recently kernel methods have been proposed for estimating the support of a distribution and they have performed well in practice - training involves solution of a quadratic programming problem. In this paper we propose a simpler kernel method for estimating the support based on linear programming. The method is easy to implement and can learn large datasets rapidly. We demonstrate the method on medical and fault detection datasets.


Finding the Key to a Synapse

Neural Information Processing Systems

Experimental data have shown that synapses are heterogeneous: different synapses respond with different sequences of amplitudes of postsynaptic responses to the same spike train. Neither the role of synaptic dynamics itself nor the role of the heterogeneity of synaptic dynamics for computations in neural circuits is well understood. We present in this article methods that make it feasible to compute for a given synapse with known synaptic parameters the spike train that is optimally fitted to the synapse, for example in the sense that it produces the largest sum of postsynaptic responses. To our surprise we find that most of these optimally fitted spike trains match common firing patterns of specific types of neurons that are discussed in the literature. 1 Introduction A large number of experimental studies have shown that biological synapses have an inherent dynamics, which controls how the pattern of amplitudes of postsynaptic responses depends on the temporal pattern of the incoming spike train. Various quantitative models have been proposed involving a small number of characteristic parameters that allow us to predict the response of a given synapse to a given spike train once proper values for these characteristic synaptic parameters have been found. The analysis of this article is based on the model of [1], where three parameters U, F, D control the dynamics of a synapse and a fourth parameter A - which corresponds to the synaptic "weight" in static synapse models - scales the absolute sizes of the postsynaptic responses. The resulting model predicts the amplitude A_k for the kth spike in a spike train with interspike intervals (ISIs) ...
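The excerpt breaks off before the model equations, so the following is only a sketch under a stated assumption: that the model takes the commonly cited recursive form with a facilitation variable u_k and a recovery variable R_k, where A_k = A * u_k * R_k, u_1 = U and R_1 = 1. The function name `amplitudes` and the parameter values are hypothetical and chosen only for illustration.

```python
import math


def amplitudes(isis, U, F, D, A):
    """Return the amplitudes A_1..A_n for a spike train with len(isis) + 1 spikes.

    Assumed recursion (not quoted from the excerpt):
        u_k = U + u_{k-1} * (1 - U) * exp(-delta_{k-1} / F)
        R_k = 1 + (R_{k-1} - u_{k-1} * R_{k-1} - 1) * exp(-delta_{k-1} / D)
        A_k = A * u_k * R_k
    """
    u, R = U, 1.0
    out = [A * u * R]
    for delta in isis:
        # Both updates use the previous u and R, so compute them before assigning.
        u_next = U + u * (1.0 - U) * math.exp(-delta / F)
        R_next = 1.0 + (R - u * R - 1.0) * math.exp(-delta / D)
        u, R = u_next, R_next
        out.append(A * u * R)
    return out


# Hypothetical depressing synapse: slow recovery (large D), fast-decaying facilitation (small F).
depressing = amplitudes([0.01, 0.01, 0.01], U=0.5, F=0.001, D=10.0, A=2.0)

# After a very long pause the synapse returns to its resting response A * U.
rested = amplitudes([1e6], U=0.5, F=1.0, D=1.0, A=2.0)
print(depressing, rested)
```

Under this assumed form, F sets how quickly facilitation decays and D how quickly the synapse recovers, so the same spike train yields different amplitude sequences for different (U, F, D) triples, which is the heterogeneity the abstract refers to.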


Tokenplan: A Planner for Both Satisfaction and Optimization Problem

AI Magazine

Tokenplan is a planner based on the use of Petri nets. Its main feature is the flexibility it offers in the way it builds the planning graph. The next step is to demonstrate the benefits we expect from our planner in planning problems involving optimization and uncertainty handling.


Planning by Rewriting

Journal of Artificial Intelligence Research

Domain-independent planning is a hard combinatorial problem. Taking into account plan quality makes the task even more difficult. This article introduces Planning by Rewriting (PbR), a new paradigm for efficient high-quality domain-independent planning. PbR exploits declarative plan-rewriting rules and efficient local search techniques to transform an easy-to-generate, but possibly suboptimal, initial plan into a high-quality plan. In addition to addressing the issues of planning efficiency and plan quality, this framework offers a new anytime planning algorithm. We have implemented this planner and applied it to several existing domains. The experimental results show that the PbR approach provides significant savings in planning effort while generating high-quality plans.
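The rewriting idea can be sketched as a toy local search over plans. Everything in the example is an assumption made for illustration: the action names, the costs, the three rewrite rules, and the greedy best-improvement strategy are hypothetical and are not taken from the PbR system, which works on richer plan representations.

```python
# Hypothetical action costs for a toy transport domain.
COST = {"drive": 5, "walk": 8, "fly": 3, "load": 1, "unload": 1}

# Each hypothetical rule maps a sub-sequence of actions to an equivalent one.
RULES = [
    (("unload", "load"), ()),        # unloading and immediately reloading is redundant
    (("walk",), ("drive",)),         # a walked leg can be driven instead
    (("drive", "drive"), ("fly",)),  # two driven legs can be replaced by one flight
]


def plan_cost(plan):
    return sum(COST[a] for a in plan)


def neighbors(plan):
    """Yield every plan reachable by applying one rule at one position."""
    for lhs, rhs in RULES:
        n = len(lhs)
        for i in range(len(plan) - n + 1):
            if tuple(plan[i:i + n]) == lhs:
                yield plan[:i] + list(rhs) + plan[i + n:]


def rewrite(plan):
    """Greedy local search: move to the cheapest neighbor until none is cheaper."""
    current = list(plan)
    while True:
        best = min(neighbors(current), key=plan_cost, default=None)
        if best is None or plan_cost(best) >= plan_cost(current):
            return current
        current = best


# An easy-to-generate but suboptimal initial plan.
initial = ["load", "walk", "walk", "unload", "load", "drive", "unload"]
improved = rewrite(initial)
print(plan_cost(initial), plan_cost(improved), improved)
```

Because every intermediate `current` is itself a complete plan, the loop can be stopped at any point and still return a usable result, which is the sense in which such a search behaves as an anytime algorithm.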


An Improved Decomposition Algorithm for Regression Support Vector Machines

Neural Information Processing Systems

The Karush-Kuhn-Tucker Theorem is used to derive conditions for determining whether or not a given working set is optimal. These conditions become the algorithm's termination criteria, as an alternative to Osuna's criteria (also used by Joachims without modification), which used conditions for individual points. The advantage of the new conditions is that knowledge of the hyperplane's constant factor b, which in some cases is difficult to compute, is not required. Further investigation of the new termination conditions makes it possible to form the strategy for selecting an optimal working set. The new algorithm is applicable to the pattern recognition SVM, and is provably equivalent to Joachims' algorithm. One can also interpret the new algorithm in the sense of the method of feasible directions. Experimental results presented in the last section demonstrate superior performance of the new method in comparison with traditional training of regression SVM. 2 General Principles of Regression SVM Decomposition The original decomposition algorithm proposed for the pattern recognition SVM in [2] has been extended to the regression SVM in [4]. For the sake of completeness I will repeat the main steps of this extension with the aim of providing terse and streamlined notation to lay the ground for working set selection.


Semiparametric Support Vector and Linear Programming Machines

Neural Information Processing Systems

In fact, for many of the kernels used (not the polynomial kernels) like Gaussian rbf-kernels it can be shown [6] that SV machines are universal approximators. While this is advantageous in general, parametric models are useful techniques in their own right. Especially if one happens to have additional knowledge about the problem, it would be unwise not to take advantage of it. For instance it might be the case that the major properties of the data are described by a combination of a small set of linearly independent basis functions $\{\phi_1(\cdot), \ldots, \phi_n(\cdot)\}$. Or one may want to correct the data for some (e.g.