Improved Use of Continuous Attributes in C4.5

Journal of Artificial Intelligence Research

A reported weakness of C4.5 in domains with continuous attributes is addressed by modifying the formation and evaluation of tests on continuous attributes. An MDL-inspired penalty is applied to such tests, eliminating some of them from consideration and altering the relative desirability of all tests. Empirical trials show that the modifications lead to smaller decision trees with higher predictive accuracies. Results also confirm that a new version of C4.5 incorporating these changes is superior to recent approaches that use global discretization and that construct small trees with multi-interval splits.
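
To make the shape of the penalty concrete, here is a minimal sketch (my own illustration, not the paper's code): an MDL-style coding cost for choosing one of the candidate thresholds is subtracted from the information gain of a continuous test, amortized over the cases at the node. The exact penalty and the numbers below are illustrative only.

    import numpy as np

    def penalized_gain(info_gain, n_candidate_thresholds, n_cases):
        """Subtract an MDL-style charge for picking one of the candidate
        thresholds, spread over the training cases at the node.
        Illustrative form only; the exact penalty follows the paper."""
        return info_gain - np.log2(n_candidate_thresholds) / n_cases

    # A test whose gain does not cover the cost of choosing its threshold
    # is effectively eliminated from consideration:
    print(penalized_gain(0.02, n_candidate_thresholds=199, n_cases=200) > 0)  # False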


Mean Field Theory for Sigmoid Belief Networks

Journal of Artificial Intelligence Research

We develop a mean field theory for sigmoid belief networks based on ideas from statistical mechanics. Our mean field theory provides a tractable approximation to the true probability distribution in these networks; it also yields a lower bound on the likelihood of evidence. We demonstrate the utility of this framework on a benchmark problem in statistical pattern recognition---the classification of handwritten digits.
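
In generic variational terms, the lower bound referred to above is an instance of Jensen's inequality applied with a fully factorized approximating distribution over the hidden units H given evidence V; the statement below is that generic bound, not the paper's detailed derivation for sigmoid networks.

    \log P(V) \;=\; \log \sum_{H} P(H, V)
              \;\ge\; \sum_{H} Q(H) \, \log \frac{P(H, V)}{Q(H)},
    \qquad \text{with the mean field choice} \qquad
    Q(H) \;=\; \prod_{i} q_i(h_i).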


Logarithmic-Time Updates and Queries in Probabilistic Networks

Journal of Artificial Intelligence Research

Traditional databases commonly support efficient query and update procedures that operate in time which is sublinear in the size of the database. Our goal in this paper is to take a first step toward dynamic reasoning in probabilistic databases with comparable efficiency. We propose a dynamic data structure that supports efficient algorithms for updating and querying singly connected Bayesian networks. In the conventional algorithm, new evidence is absorbed in O(1) time and queries are processed in time O(N), where N is the size of the network. We propose an algorithm which, after a preprocessing phase, allows us to answer queries in time O(log N) at the expense of O(log N) time per evidence absorption. The usefulness of sub-linear processing time manifests itself in applications requiring (near) real-time response over large probabilistic databases. We briefly discuss a potential application of dynamic probabilistic reasoning in computational biology.
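
As a loose analogy for the update/query tradeoff being targeted (and emphatically not the paper's data structure, which is specific to singly connected Bayesian networks), a generic segment tree shows how linear preprocessing can buy O(log N) point updates and O(log N) aggregate queries in place of an O(1)/O(N) split.

    import math

    class SegmentTree:
        """Generic sum-aggregating segment tree; analogy only."""

        def __init__(self, values):
            self.n = len(values)
            self.size = 1 << max(1, math.ceil(math.log2(self.n)))
            self.tree = [0.0] * (2 * self.size)
            self.tree[self.size:self.size + self.n] = values
            for i in range(self.size - 1, 0, -1):
                self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

        def update(self, i, value):            # absorb one new value: O(log N)
            i += self.size
            self.tree[i] = value
            while i > 1:
                i //= 2
                self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

        def query(self, lo, hi):               # aggregate over [lo, hi): O(log N)
            res, lo, hi = 0.0, lo + self.size, hi + self.size
            while lo < hi:
                if lo & 1:
                    res += self.tree[lo]
                    lo += 1
                if hi & 1:
                    hi -= 1
                    res += self.tree[hi]
                lo //= 2
                hi //= 2
            return res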


Well-Founded Semantics for Extended Logic Programs with Dynamic Preferences

Journal of Artificial Intelligence Research

The paper describes an extension of well-founded semantics for logic programs with two types of negation. In this extension information about preferences between rules can be expressed in the logical language and derived dynamically. This is achieved by using a reserved predicate symbol and a naming technique. Conflicts among rules are resolved whenever possible on the basis of derived preference information. The well-founded conclusions of prioritized logic programs can be computed in polynomial time. A legal reasoning example illustrates the usefulness of the approach.
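
A toy illustration of the naming-plus-reserved-predicate idea (hypothetical rule names and facts, far simpler than well-founded semantics itself): each rule carries a name, and preference between rules is stated with a reserved predicate so it can be derived like any other conclusion and then used to resolve conflicts.

    rules = {
        "r1": {"head": "not liable", "body": ["contract_void"]},
        "r2": {"head": "liable",     "body": ["damage_caused"]},
    }
    # derived facts, including a preference expressed with a reserved predicate
    facts = {"contract_void", "damage_caused", ("prefer", "r1", "r2")}

    def applicable(name):
        return all(b in facts for b in rules[name]["body"])

    def winner(a, b):
        """Resolve a conflict between two applicable rules using derived
        preference facts; with no preference, neither conclusion is drawn."""
        if ("prefer", a, b) in facts:
            return a
        if ("prefer", b, a) in facts:
            return b
        return None

    if applicable("r1") and applicable("r2"):
        print(rules[winner("r1", "r2")]["head"])   # -> "not liable"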


The Design and Experimental Analysis of Algorithms for Temporal Reasoning

Journal of Artificial Intelligence Research

Many applications -- from planning and scheduling to problems in molecular biology -- rely heavily on a temporal reasoning component. In this paper, we discuss the design and empirical analysis of algorithms for a temporal reasoning system based on Allen's influential interval-based framework for representing temporal information. At the core of the system are algorithms for determining whether the temporal information is consistent, and, if so, finding one or more scenarios that are consistent with the temporal information. Two important algorithms for these tasks are a path consistency algorithm and a backtracking algorithm. For the path consistency algorithm, we develop techniques that can result in up to a ten-fold speedup over an already highly optimized implementation. For the backtracking algorithm, we develop variable and value ordering heuristics that are shown empirically to dramatically improve the performance of the algorithm. As well, we show that a previously suggested reformulation of the backtracking search problem can reduce the time and space requirements of the backtracking search. Taken together, the techniques we develop allow a temporal reasoning component to solve problems that are of practical size.
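
The sketch below shows path consistency in miniature, using the three point relations {<, =, >} instead of Allen's thirteen interval relations; it illustrates the constraint-tightening loop only, not the highly optimized implementation studied in the paper.

    from itertools import product

    # Composition table for the point algebra: relation of x to z given
    # the relation of x to y and of y to z.
    COMP = {
        ("<", "<"): {"<"}, ("<", "="): {"<"}, ("<", ">"): {"<", "=", ">"},
        ("=", "<"): {"<"}, ("=", "="): {"="}, ("=", ">"): {">"},
        (">", "<"): {"<", "=", ">"}, (">", "="): {">"}, (">", ">"): {">"},
    }
    ALL = {"<", "=", ">"}
    INV = {"<": ">", "=": "=", ">": "<"}

    def compose(r1, r2):
        out = set()
        for a, b in product(r1, r2):
            out |= COMP[(a, b)]
        return out

    def path_consistency(n, constraints):
        """constraints maps a pair (i, j) to the allowed point relations.
        Repeatedly tighten C[i][k] with C[i][j] composed with C[j][k];
        an empty relation signals inconsistency."""
        C = [[set(ALL) for _ in range(n)] for _ in range(n)]
        for i in range(n):
            C[i][i] = {"="}
        for (i, j), rel in constraints.items():
            C[i][j] = set(rel)
            C[j][i] = {INV[r] for r in rel}
        changed = True
        while changed:
            changed = False
            for i, j, k in product(range(n), repeat=3):
                tightened = C[i][k] & compose(C[i][j], C[j][k])
                if not tightened:
                    return False
                if tightened != C[i][k]:
                    C[i][k] = tightened
                    changed = True
        return True

    # x < y, y < z and z < x cannot all hold:
    print(path_consistency(3, {(0, 1): {"<"}, (1, 2): {"<"}, (2, 0): {"<"}}))  # False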


Effects of Noise on Convergence and Generalization in Recurrent Networks

Neural Information Processing Systems

We introduce and study methods of inserting synaptic noise into dynamically-driven recurrent neural networks and show that applying a controlled amount of noise during training may improve convergence and generalization. In addition, we analyze the effects of each noise parameter (additive vs. multiplicative, cumulative vs. non-cumulative, per time step vs. per string) and predict that best overall performance can be achieved by injecting additive noise at each time step. Extensive simulations on learning the dual parity grammar from temporal strings substantiate these predictions.
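
A minimal sketch of the favoured scheme, additive synaptic (weight) noise drawn anew at every time step; the Gaussian form and the noise level are illustrative assumptions, not taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def rnn_step(h, x, W, U, b, noise_std=0.05, training=True):
        """One recurrent step.  During training, additive noise is added to
        the recurrent and input weights, independently at each time step."""
        if training and noise_std > 0.0:
            W = W + rng.normal(0.0, noise_std, size=W.shape)
            U = U + rng.normal(0.0, noise_std, size=U.shape)
        return np.tanh(W @ h + U @ x + b)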


Learning Many Related Tasks at the Same Time with Backpropagation

Neural Information Processing Systems

Hinton [6] proposed that generalization in artificial neural nets should improve if nets learn to represent the domain's underlying regularities. Abu-Mostafa's hints work [1] shows that the outputs of a backprop net can be used as inputs through which domain-specific information can be given to the net. We extend these ideas by showing that a backprop net learning many related tasks at the same time can use these tasks as inductive bias for each other and thus learn better. We identify five mechanisms by which multitask backprop improves generalization and give empirical evidence that multitask backprop generalizes better in real domains.
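
One standard realization of "learning many related tasks at the same time" is a hidden layer shared by several output heads, so that every task's error gradient updates the same shared weights. The numpy sketch below (hypothetical sizes, squared-error loss) shows that mechanism; it is not the paper's experimental setup.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical sizes: 10 inputs, 20 shared hidden units, 3 related tasks.
    n_in, n_hid, n_tasks = 10, 20, 3
    W_shared = rng.normal(0.0, 0.1, (n_hid, n_in))     # weights every task trains
    W_heads  = rng.normal(0.0, 0.1, (n_tasks, n_hid))  # one output row per task

    def backprop_step(x, y, lr=0.01):
        """y holds one target per task; error signals from all tasks flow
        back into the same shared weights -- the source of the inductive bias."""
        global W_shared, W_heads
        h = np.tanh(W_shared @ x)              # shared hidden representation
        out = W_heads @ h                      # one prediction per task
        err = out - y                          # gradient of 0.5 * squared error
        grad_heads  = err[:, None] * h[None, :]
        grad_shared = ((1.0 - h**2) * (W_heads.T @ err))[:, None] * x[None, :]
        W_heads  -= lr * grad_heads
        W_shared -= lr * grad_shared

    backprop_step(rng.normal(size=n_in), rng.normal(size=n_tasks))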


A Rapid Graph-based Method for Arbitrary Transformation-Invariant Pattern Classification

Neural Information Processing Systems

We present a graph-based method for rapid, accurate search through prototypes for transformation-invariant pattern classification. Our method has in theory the same recognition accuracy as other recent methods based on "tangent distance" [Simard et al., 1994], since it uses the same categorization rule. Nevertheless ours is significantly faster during classification because far fewer tangent distances need be computed. Crucial to the success of our system are 1) a novel graph architecture in which transformation constraints and geometric relationships among prototypes are encoded during learning, and 2) an improved graph search criterion, used during classification. These architectural insights are applicable to a wide range of problem domains.
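
For reference, a one-sided tangent distance (the quantity the shared categorization rule is built on) can be sketched as a small least-squares problem; the graph architecture and search criterion are the paper's contribution and are not shown here.

    import numpy as np

    def one_sided_tangent_distance(x, p, T):
        """Distance from pattern x to prototype p, minimized over shifts within
        the tangent plane spanned by the columns of T (derivatives of p with
        respect to small transformations).  A sketch of the distance itself,
        not of the graph-based search."""
        a, *_ = np.linalg.lstsq(T, x - p, rcond=None)  # best tangent coefficients
        return np.linalg.norm(x - (p + T @ a))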



An Experimental Comparison of Recurrent Neural Networks

Neural Information Processing Systems

Many different discrete-time recurrent neural network architectures have been proposed. However, there has been virtually no effort to compare these architectures experimentally. In this paper we review and categorize many of these architectures and compare how they perform on various classes of simple problems including grammatical inference and nonlinear system identification.