Exact Phase Transitions in Random Constraint Satisfaction Problems

Journal of Artificial Intelligence Research

In this paper we propose a new type of random CSP model, called Model RB, which is a revision of the standard Model B. It is proved that phase transitions from a region where almost all problems are satisfiable to a region where almost all problems are unsatisfiable do exist for Model RB as the number of variables approaches infinity. Moreover, the critical values at which the phase transitions occur can be determined exactly. By relating the hardness of Model RB to that of Model B, it is shown that Model RB contains many hard instances.
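
For intuition, the sketch below (Python) runs a small empirical experiment: it generates random binary CSP instances at increasing constraint tightness p and reports the fraction found satisfiable, which typically drops sharply around a critical value. The generator used here (fixed domain size, a forbidden fraction p of value pairs per constraint) is an illustrative assumption, not the exact Model RB construction analysed in the paper.

    # Illustrative phase-transition experiment on random binary CSPs.
    # NOTE: this is a generic toy generator, not the Model RB construction.
    import itertools
    import random

    def random_instance(n_vars, domain, n_constraints, p):
        """Random variable pairs, each forbidding a fraction p of value pairs."""
        pairs = list(itertools.product(domain, repeat=2))
        constraints = []
        for _ in range(n_constraints):
            x, y = random.sample(range(n_vars), 2)
            forbidden = set(random.sample(pairs, int(p * len(pairs))))
            constraints.append((x, y, forbidden))
        return constraints

    def satisfiable(n_vars, domain, constraints, assignment=()):
        """Plain backtracking search; adequate for small toy instances."""
        if len(assignment) == n_vars:
            return True
        for v in domain:
            cand = assignment + (v,)
            ok = all((cand[x], cand[y]) not in forb
                     for x, y, forb in constraints
                     if x < len(cand) and y < len(cand))
            if ok and satisfiable(n_vars, domain, constraints, cand):
                return True
        return False

    if __name__ == "__main__":
        n_vars, domain, trials = 10, range(4), 30
        for p in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6):
            sat = sum(satisfiable(n_vars, domain,
                                  random_instance(n_vars, domain, 25, p))
                      for _ in range(trials))
            print(f"tightness p={p:.1f}: {sat/trials:.0%} satisfiable")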


Reasoning on Interval and Point-based Disjunctive Metric Constraints in Temporal Contexts

Journal of Artificial Intelligence Research

We introduce a temporal model for reasoning on disjunctive metric constraints on intervals and time points in temporal contexts. This temporal model is composed of a labeled temporal algebra and its reasoning algorithms. The labeled temporal algebra defines labeled disjunctive metric point-based constraints, where each disjunct of each input disjunctive constraint is uniquely associated with a label. The reasoning algorithms manage labeled constraints, associated label lists, and sets of mutually inconsistent disjuncts. These algorithms guarantee consistency and obtain a minimal network. Additionally, constraints can be organized in a hierarchy of alternative temporal contexts. Therefore, we can reason on context-dependent disjunctive metric constraints on intervals and points. Moreover, the model is able to represent non-binary constraints, so that logical dependencies among disjuncts in constraints can be handled. The computational cost of the reasoning algorithms is exponential, in line with the complexity of the underlying problem, although some improvements are proposed.
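
Not the paper's labelled algebra, but a minimal Python sketch of the underlying idea: disjunctive metric constraints between time points where each disjunct carries a label, with the usual TCSP-style composition and intersection. The label bookkeeping here is a simplified illustration of how the origin of each surviving disjunct can be tracked.

    # Disjunctive metric point constraints with labelled disjuncts (sketch).
    # Composition and intersection follow the standard TCSP operations; the
    # label handling is illustrative, not the paper's reasoning algorithm.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Disjunct:
        lo: float        # lower bound on (y - x)
        hi: float        # upper bound on (y - x)
        label: str       # identifies which input disjunct(s) it came from

    def compose(c1, c2):
        """Constraint x->y composed with y->z yields a constraint x->z."""
        return [Disjunct(a.lo + b.lo, a.hi + b.hi, a.label + "*" + b.label)
                for a in c1 for b in c2]

    def intersect(c1, c2):
        """Pairwise intersection of disjuncts; empty intersections are dropped."""
        out = []
        for a in c1:
            for b in c2:
                lo, hi = max(a.lo, b.lo), min(a.hi, b.hi)
                if lo <= hi:
                    out.append(Disjunct(lo, hi, a.label + "&" + b.label))
        return out

    # Example: x->y in [10,20] (label d1) or [40,50] (d2); y->z in [0,5] (d3);
    # a direct constraint x->z in [0,30] (d4). Only the d1*d3 path survives.
    xy = [Disjunct(10, 20, "d1"), Disjunct(40, 50, "d2")]
    yz = [Disjunct(0, 5, "d3")]
    xz_direct = [Disjunct(0, 30, "d4")]
    print(intersect(xz_direct, compose(xy, yz)))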


Planning Graph as a (Dynamic) CSP: Exploiting EBL, DDB and other CSP Search Techniques in Graphplan

Journal of Artificial Intelligence Research

This paper reviews the connections between Graphplan's planning graph and the dynamic constraint satisfaction problem, and motivates the need for adapting CSP search techniques to the Graphplan algorithm. It then describes how explanation-based learning (EBL), dependency-directed backtracking (DDB), dynamic variable ordering, forward checking, sticky values, and random-restart search strategies can be adapted to Graphplan. Empirical results are provided to demonstrate that these augmentations improve Graphplan's performance significantly (up to 1000x speedups) on several benchmark problems. Special attention is paid to EBL and DDB, as they are empirically found to be the most useful in improving the performance of Graphplan.
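
The sketch below shows textbook conflict-directed (dependency-directed) backjumping on a generic CSP in Python, the kind of technique the paper adapts to Graphplan's backward search; it is not the Graphplan-specific implementation, and the 4-queens conflicts function is only a toy example.

    # Conflict-directed backjumping (a form of dependency-directed backtracking)
    # on a generic CSP. Textbook scheme, not the paper's Graphplan adaptation.
    def cbj(variables, domains, conflicts):
        """
        variables: ordered list of variable names
        domains:   dict var -> list of candidate values
        conflicts: function(var, value, assignment) -> set of earlier variables
                   that rule this value out (empty set means consistent)
        Returns (solution dict, empty set) or (None, conflict set).
        """
        def solve(i, assignment):
            if i == len(variables):
                return assignment, set()
            var = variables[i]
            conflict_set = set()
            for value in domains[var]:
                culprits = conflicts(var, value, assignment)
                if culprits:
                    conflict_set |= culprits          # remember why value failed
                    continue
                result, child_conf = solve(i + 1, {**assignment, var: value})
                if result is not None:
                    return result, set()
                if var not in child_conf:             # this var is not to blame:
                    return None, child_conf           # jump straight past it
                conflict_set |= child_conf - {var}
            return None, conflict_set                 # dead end: blame earlier vars
        return solve(0, {})

    # Tiny usage: 4-queens, where conflicts() names the earlier queens to blame.
    n = 4
    variables = [f"q{r}" for r in range(n)]
    domains = {v: list(range(n)) for v in variables}
    def queens_conflicts(var, value, assignment):
        r = int(var[1:])
        return {u for u, c in assignment.items()
                if c == value or abs(c - value) == r - int(u[1:])}
    print(cbj(variables, domains, queens_conflicts)[0])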


Dynamically Adapting Kernels in Support Vector Machines

Neural Information Processing Systems

The kernel parameter is one of the few tunable parameters in Support Vector Machines, controlling the complexity of the resulting hypothesis. Its choice amounts to model selection, and its value is usually found by means of a validation set. We present an algorithm which can automatically perform model selection with little additional computational cost and with no need of a validation set. In this procedure model selection and learning are not separate: kernels are dynamically adjusted during the learning process to find the kernel parameter which provides the best possible upper bound on the generalisation error. Theoretical results motivating the approach and experimental results confirming its validity are presented.
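
As a loose illustration of model selection without a validation set, the Python sketch below picks an RBF kernel width by minimising the simple support-vector bound on leave-one-out error, #SV/n. This is not the paper's algorithm, which adapts the kernel parameter during training against a different generalisation bound; the dataset and parameter grid are arbitrary.

    # Bound-driven kernel selection without a validation set (illustration only).
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=300, n_features=5, random_state=0)

    best = None
    for gamma in [0.01, 0.03, 0.1, 0.3, 1.0, 3.0]:
        clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y)
        bound = clf.n_support_.sum() / len(X)      # LOO error <= #SV / n
        if best is None or bound < best[1]:
            best = (gamma, bound)
        print(f"gamma={gamma:<5} support-vector bound={bound:.2f}")
    print("selected gamma:", best[0])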


Almost Linear VC Dimension Bounds for Piecewise Polynomial Networks

Neural Information Processing Systems

We compute upper and lower bounds on the VC dimension of feedforward networks of units with piecewise polynomial activation functions. We show that if the number of layers is fixed, then the VC dimension grows as W log W, where W is the number of parameters in the network. The VC dimension is an important measure of the complexity of a class of binary-valued functions, since it characterizes the amount of data required for learning in the PAC setting (see [BEHW89, Vap82]). In this paper, we establish upper and lower bounds on the VC dimension of a specific class of multi-layered feedforward neural networks. Let F be the class of binary-valued functions computed by a feedforward neural network with W weights and k computational (non-input) units, each with a piecewise polynomial activation function. The best previously known upper bound for this class was O(W^2); the upper and lower bounds of order W log W obtained here for a fixed number of layers would lead one to conclude that the bounds are in fact tight up to a constant.
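
Restated in symbols, the scaling claim from the abstract reads as follows; the constants c_1 and c_2 are generic and introduced here only to express "grows as W log W".

    % Scaling of the VC dimension for a fixed number of layers;
    % c_1, c_2 are unspecified positive constants (illustrative notation).
    \[
      c_1\, W \log W \;\le\; \mathrm{VCdim}(\mathcal{F}) \;\le\; c_2\, W \log W ,
    \]

where F is the class of binary-valued functions computed by a feedforward network with W parameters, piecewise polynomial activations, and a fixed number of layers.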


Dynamics of Supervised Learning with Restricted Training Sets

Neural Information Processing Systems

We study the dynamics of supervised learning in layered neural networks, in the regime where the size p of the training set is proportional to the number N of inputs. Here the local fields are no longer described by Gaussian distributions.


Blind Separation of Filtered Sources Using State-Space Approach

Neural Information Processing Systems

In this paper we present a novel approach to multichannel blind separation/generalized deconvolution, assuming that both the mixing and demixing models are described by stable linear state-space systems. Based on the minimization of the Kullback-Leibler divergence, we develop a novel learning algorithm to train the matrices in the output equation. To estimate the state of the demixing model, we introduce a new concept, called hidden innovation, to numerically implement the Kalman filter. We refer to the review papers [1] and [5] for the current state of theory and methods in the field.
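
For context, a generic stable discrete-time linear state-space demixing model can be written as below; the symbols (state x, observed mixture u, output y, matrices A, B, C, D) are illustrative and not necessarily the paper's notation.

    % Generic linear state-space demixing model (illustrative notation).
    \begin{aligned}
      \mathbf{x}(k+1) &= \mathbf{A}\,\mathbf{x}(k) + \mathbf{B}\,\mathbf{u}(k),\\
      \mathbf{y}(k)   &= \mathbf{C}\,\mathbf{x}(k) + \mathbf{D}\,\mathbf{u}(k),
    \end{aligned}

Here u(k) is the observed mixture, x(k) the internal state, and y(k) the recovered output; "training the matrices in the output equation" then corresponds to adapting C and D.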


Stationarity and Stability of Autoregressive Neural Network Processes

Neural Information Processing Systems

AR-NNs are a natural generalization of the classic linear autoregressive AR(p) process; see, e.g., Brockwell & Davis (1987) for a comprehensive introduction to AR and ARMA (autoregressive moving average) models. One of the most central questions in linear time series theory is the stationarity of the model, i.e., whether the probabilistic structure of the series is constant over time or at least asymptotically constant (when not started in equilibrium). Surprisingly, this question has not gained much interest in the NN literature; in particular, there are, to our knowledge, no results giving conditions for the stationarity of AR-NN models. There are results on the stationarity of Hopfield nets (Wang & Sheng, 1996), but these nets cannot be used to estimate conditional expectations for time series prediction. The rest of this paper is organized as follows: In Section 2 we recall some results from time series analysis and Markov chain theory defining the relationship between a time series and its associated Markov chain. In Section 3 we use these results to establish that standard AR-NN models without shortcut connections are stationary. We also give conditions for AR-NN models with shortcut connections to be stationary. Section 4 examines the NN modeling of an important class of non-stationary time series; proofs are deferred to the appendix.
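
For concreteness, the linear AR(p) process and an AR-NN generalization can be written as follows; g denotes a feedforward network and the notation is a generic illustration rather than the paper's exact parametrization.

    % Linear AR(p) process and a neural-network generalization (illustrative).
    \begin{aligned}
      \text{AR}(p):\quad    & x_t = \phi_1 x_{t-1} + \dots + \phi_p x_{t-p} + \varepsilon_t,\\
      \text{AR-NN}(p):\quad & x_t = g(x_{t-1}, \dots, x_{t-p};\, \theta) + \varepsilon_t,
    \end{aligned}

with i.i.d. noise ε_t. A shortcut connection adds a direct linear term in (x_{t-1}, ..., x_{t-p}) to the network output, and the Markov chain associated with the series (Section 2) is obtained by stacking the last p values into the state X_t = (x_t, ..., x_{t-p+1}).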


Maximum Conditional Likelihood via Bound Maximization and the CEM Algorithm

Neural Information Processing Systems

Advantages in feature selection, robustness and limited resource allocation have been studied. Ultimately, tasks such as regression and classification reduce to the evaluation of a conditional density. However, the popularity of maximum joint likelihood and EM techniques remains strong, in part due to their elegance and convergence properties. Thus, many conditional problems are solved by first estimating joint models and then conditioning them.
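
The contrast drawn here, between conditioning an estimated joint model and maximizing conditional likelihood directly, can be written out as follows (generic notation introduced for illustration, with continuous y for concreteness).

    % Joint-then-condition versus direct conditional maximum likelihood
    % (generic notation, not the paper's).
    \begin{aligned}
      \hat{\theta}_{\mathrm{joint}} &= \arg\max_{\theta} \sum_i \log p_{\theta}(x_i, y_i),
        \qquad \hat{p}(y \mid x) = \frac{p_{\hat{\theta}_{\mathrm{joint}}}(x, y)}
                                        {\int p_{\hat{\theta}_{\mathrm{joint}}}(x, y')\, dy'},\\
      \hat{\theta}_{\mathrm{cond}} &= \arg\max_{\theta} \sum_i \log p_{\theta}(y_i \mid x_i)
        = \arg\max_{\theta} \sum_i \Big( \log p_{\theta}(x_i, y_i)
          - \log \int p_{\theta}(x_i, y')\, dy' \Big).
    \end{aligned}

Under model misspecification the two estimates generally differ, which is the motivation for optimizing the conditional objective directly rather than conditioning a joint fit.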


A Polygonal Line Algorithm for Constructing Principal Curves

Neural Information Processing Systems

Principal curves have been defined as "self-consistent" smooth curves which pass through the "middle" of a d-dimensional probability distribution or data cloud. Recently, we [1] have offered a new approach by defining principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution. The new definition made it possible to carry out a theoretical analysis of learning principal curves from training data. In this paper we propose a practical construction based on the new definition. Simulation results demonstrate that the new algorithm compares favorably with previous methods both in terms of performance and computational complexity.
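
The length-constrained definition referred to above can be stated as a minimization problem (notation introduced here for illustration).

    % Principal curve of length at most L for a random vector X in R^d
    % (illustrative notation): the continuous curve minimizing expected
    % squared distance subject to the length constraint.
    \[
      f^{*} \;=\; \operatorname*{arg\,min}_{f:\ \mathrm{length}(f)\,\le\, L}\;
        \mathbb{E}\Big[\, \min_{t}\, \lVert X - f(t) \rVert^{2} \,\Big].
    \]

The polygonal line algorithm then approximates f* by a piecewise linear curve whose vertices are fitted to the training data, with segments added as the fit proceeds.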