Robust Feature Selection by Mutual Information Distributions
Zaffalon, Marco, Hutter, Marcus
Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported. Asymptotic approximations of the distribution are proposed. The results are applied to the problem of selecting features for incremental learning and classification of the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows one to efficiently extend the above methods to incomplete samples.
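As a rough illustration of the gap between the empirical value and its distribution, the sketch below computes empirical mutual information from a contingency table of counts and then draws Monte Carlo samples of mutual information under a Dirichlet posterior. This is not the paper's method: the abstract reports analytical expressions for the mean and variance, which are not reproduced here, and sampling is swapped in as a stand-in. The function names, the uniform prior value of 1.0, and the example counts are all assumptions made for illustration.

```python
import math
import random


def mutual_information(joint):
    """Mutual information (in nats) of a joint probability table given as rows."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    mi = 0.0
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            if p > 0:
                mi += p * math.log(p / (row[i] * col[j]))
    return mi


def empirical_mi(counts):
    """Plug-in estimate: normalize the counts and compute mutual information."""
    n = sum(sum(r) for r in counts)
    return mutual_information([[c / n for c in r] for r in counts])


def posterior_mi_samples(counts, prior=1.0, draws=2000, seed=0):
    """Monte Carlo samples of MI under a Dirichlet(counts + prior) posterior.

    Assumption: a symmetric Dirichlet prior with the same value in every cell.
    A Dirichlet draw is obtained by normalizing independent gamma variates.
    """
    rng = random.Random(seed)
    shape = [[c + prior for c in r] for r in counts]
    samples = []
    for _ in range(draws):
        g = [[rng.gammavariate(a, 1.0) for a in r] for r in shape]
        total = sum(sum(r) for r in g)
        samples.append(mutual_information([[x / total for x in r] for r in g]))
    return samples


if __name__ == "__main__":
    counts = [[12, 3], [4, 11]]  # hypothetical feature-by-class counts
    samples = posterior_mi_samples(counts)
    mean = sum(samples) / len(samples)
    var = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
    print("empirical MI:", empirical_mi(counts))
    print("posterior mean:", mean, "posterior std:", math.sqrt(var))
```

A feature filter built on such samples could, for example, keep a feature only when most of the posterior mass lies above a threshold, rather than trusting the single empirical value.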
SMOTE: Synthetic Minority Over-sampling Technique
Chawla, N. V., Bowyer, K. W., Hall, L. O., Kegelmeyer, W. P.
An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominantly composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples. It is also the case that the cost of misclassifying an abnormal (interesting) example as a normal example is often much higher than the cost of the reverse error. Under-sampling of the majority (normal) class has been proposed as a good means of increasing the sensitivity of a classifier to the minority class. This paper shows that a combination of our method of over-sampling the minority (abnormal) class and under-sampling the majority (normal) class can achieve better classifier performance (in ROC space) than only under-sampling the majority class. This paper also shows that a combination of our method of over-sampling the minority class and under-sampling the majority class can achieve better classifier performance (in ROC space) than varying the loss ratios in Ripper or class priors in Naive Bayes. Our method of over-sampling the minority class involves creating synthetic minority class examples. Experiments are performed using C4.5, Ripper and a Naive Bayes classifier. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
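The abstract says only that synthetic minority class examples are created; it does not describe how. The sketch below assumes the scheme usually associated with SMOTE: pick a minority example, pick one of its k nearest minority neighbors, and place a new point at a random position on the segment between them. The function name, parameters, and sample points are hypothetical, and under-sampling of the majority class is not shown.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbors.

    minority: list of numeric feature vectors (at least two points).
    Assumption: Euclidean distance on continuous features.
    """
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbors of the chosen point, excluding itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: dist2(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


if __name__ == "__main__":
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote(minority, 3, k=2):
        print(point)
```

Because each synthetic point lies between two existing minority examples, the new points stay inside the region the minority class already occupies instead of duplicating existing examples exactly.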
Extensions of Simple Conceptual Graphs: the Complexity of Rules and Constraints
Simple conceptual graphs are considered as the kernel of most knowledge representation formalisms built upon Sowa's model. Reasoning in this model can be expressed by a graph homomorphism called projection, whose semantics is usually given in terms of positive, conjunctive, existential FOL. We present here a family of extensions of this model, based on rules and constraints, keeping graph homomorphism as the basic operation. We focus on the formal definitions of the different models obtained, including their operational semantics and relationships with FOL, and we analyze the decidability and complexity of the associated problems (consistency and deduction). As soon as rules are involved in reasoning, these problems are not decidable, but we exhibit a condition under which they fall within the polynomial hierarchy. These results extend and complete the ones already published by the authors. Moreover, we systematically study the complexity of some particular cases obtained by restricting the form of constraints and/or rules.
Efficient Reinforcement Learning Using Recursive Least-Squares Methods
The recursive least-squares (RLS) algorithm is one of the best-known algorithms used in adaptive filtering, system identification and adaptive control. Its popularity is mainly due to its fast convergence speed, which is considered to be optimal in practice. In this paper, RLS methods are used to solve reinforcement learning problems, where two new reinforcement learning algorithms using linear value function approximators are proposed and analyzed. The two algorithms are called RLS-TD(lambda) and Fast-AHC (Fast Adaptive Heuristic Critic), respectively. RLS-TD(lambda) can be viewed as the extension of RLS-TD(0) from lambda=0 to general lambda within the interval [0,1], so it is a multi-step temporal-difference (TD) learning algorithm using RLS methods. The convergence with probability one and the limit of convergence of RLS-TD(lambda) are proved for ergodic Markov chains. Compared to the existing LS-TD(lambda) algorithm, RLS-TD(lambda) has advantages in computation and is more suitable for online learning. The effectiveness of RLS-TD(lambda) is analyzed and verified by learning prediction experiments of Markov chains with a wide range of parameter settings. The Fast-AHC algorithm is derived by applying the proposed RLS-TD(lambda) algorithm in the critic network of the adaptive heuristic critic method. Unlike the conventional AHC algorithm, Fast-AHC makes use of RLS methods to improve the learning-prediction efficiency in the critic. Learning control experiments of the cart-pole balancing and the acrobot swing-up problems are conducted to compare the data efficiency of Fast-AHC with conventional AHC. From the experimental results, it is shown that the data efficiency of learning control can also be improved by using RLS methods in the learning-prediction process of the critic. The performance of Fast-AHC is also compared with that of the AHC method using LS-TD(lambda). Furthermore, it is demonstrated in the experiments that different initial values of the variance matrix in RLS-TD(lambda) are required to get better performance not only in learning prediction but also in learning control. The experimental results are analyzed based on the existing theoretical work on the transient phase of forgetting factor RLS methods.
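To make the idea of a recursive least-squares TD update concrete, here is a minimal sketch with linear features and eligibility traces. It is not taken from the paper: it assumes the textbook construction in which the least-squares TD solution is maintained incrementally through a Sherman-Morrison rank-one update, with no forgetting factor. The function name, the `delta` parameter (initial scale of the variance matrix), and the two-state chain are hypothetical.

```python
def rls_td_lambda(transitions, n_features, gamma, lam, delta=1000.0):
    """Recursive least-squares TD(lambda) sketch for linear value estimation.

    transitions: iterable of (phi, reward, phi_next) with list-valued features.
    delta: initial scale of the variance matrix P = delta * I (an assumption;
           the abstract notes that this initial value affects performance).
    """
    n = n_features
    theta = [0.0] * n
    P = [[delta if i == j else 0.0 for j in range(n)] for i in range(n)]
    z = [0.0] * n  # eligibility trace
    for phi, reward, phi_next in transitions:
        z = [gamma * lam * zi + p for zi, p in zip(z, phi)]
        d = [p - gamma * q for p, q in zip(phi, phi_next)]
        Pz = [sum(P[i][j] * z[j] for j in range(n)) for i in range(n)]
        dP = [sum(d[i] * P[i][j] for i in range(n)) for j in range(n)]
        denom = 1.0 + sum(d[i] * Pz[i] for i in range(n))
        K = [x / denom for x in Pz]  # gain vector
        err = reward - sum(di * ti for di, ti in zip(d, theta))
        theta = [t + k * err for t, k in zip(theta, K)]
        # Sherman-Morrison update of P after adding the rank-one term z d^T
        P = [[P[i][j] - K[i] * dP[j] for j in range(n)] for i in range(n)]
    return theta


if __name__ == "__main__":
    # Hypothetical two-state chain 0 -> 1 -> 0 -> ..., reward 1 when leaving state 0.
    e = [[1.0, 0.0], [0.0, 1.0]]
    steps = [(e[t % 2], 1.0 if t % 2 == 0 else 0.0, e[(t + 1) % 2])
             for t in range(2000)]
    print(rls_td_lambda(steps, 2, gamma=0.9, lam=0.5))
```

Each step costs on the order of n squared operations in the number of features, compared with the matrix inversion a batch least-squares solve would need, which is the kind of computational advantage for online learning that the abstract refers to.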
Structured Knowledge Representation for Image Retrieval
Di Sciascio, E., Donini, F. M., Mongiello, M.
We propose a structured approach to the problem of retrieval of images by content and present a description logic that has been devised for the semantic indexing and retrieval of images containing complex objects. As other approaches do, we start from low-level features extracted with image analysis to detect and characterize regions in an image. However, in contrast with feature-based approaches, we provide a syntax to describe segmented regions as basic objects and complex objects as compositions of basic ones. Then we introduce a companion extensional semantics for defining reasoning services, such as retrieval, classification, and subsumption. These services can be used for both exact and approximate matching, using similarity measures. Using our logical approach as a formal specification, we implemented a complete client-server image retrieval system, which allows a user to pose both queries by sketch and queries by example. A set of experiments has been carried out on a testbed of images to assess the retrieval capabilities of the system in comparison with expert users' rankings. Results are presented adopting a well-established measure of quality borrowed from textual information retrieval.
Entropy estimation of symbol sequences
Schürmann, Thomas, Grassberger, Peter
We discuss algorithms for estimating the Shannon entropy h of finite symbol sequences with long-range correlations. In particular, we consider algorithms which estimate h from the code lengths produced by some compression algorithm. Our interest is in describing their convergence with sequence length, assuming no limits for the space and time complexities of the compression algorithms. A scaling law is proposed for extrapolation from finite sample lengths. This is applied to sequences of dynamical systems in non-trivial chaotic regimes, a 1-D cellular automaton, and to written English texts.
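The idea of estimating h from code lengths can be sketched with any off-the-shelf compressor: compress a prefix of the sequence and divide the compressed size in bits by the number of symbols. The example below uses zlib purely as a convenient stand-in; it is a resource-limited compressor, whereas the abstract assumes no limits on space and time, and the proposed scaling law for extrapolation is not reproduced. The function names and the coin-flip test sequence are assumptions.

```python
import random
import zlib


def compression_entropy_estimate(symbols: bytes, level: int = 9) -> float:
    """Upper-bound style estimate of entropy in bits per symbol.

    Assumption: one byte per symbol; zlib stands in for the compressor.
    """
    if not symbols:
        raise ValueError("sequence must be non-empty")
    return 8.0 * len(zlib.compress(symbols, level)) / len(symbols)


def estimates_by_length(symbols: bytes, lengths):
    """Estimate on growing prefixes to see how the value converges with length."""
    return [(n, compression_entropy_estimate(symbols[:n])) for n in lengths]


if __name__ == "__main__":
    rng = random.Random(0)
    # Fair coin over two symbols: the true entropy is 1 bit per symbol.
    coin = bytes(rng.choice(b"ab") for _ in range(100000))
    for n, h in estimates_by_length(coin, [1000, 10000, 100000]):
        print(n, round(h, 3))
```

The estimates for short prefixes are inflated by compressor overhead and by the limited statistics available, which is why convergence with sequence length, and extrapolation from finite lengths, matter in practice.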
AI in the News
Recent advances in artificial intelligence, cognitive science, neuroscience and robotics have stimulated the birth and growth of a new research field, known as computational autonomous mental development. Although human mental development is a well-known subject of study, e.g., in developmental psychology, computational studies of [...] A brief discussion of the subject is available in an article that appeared in Science, available electronically at: http://www.cse.msu.edu/dl/SciencePaper.pdf.
More than 11.2 million trucks entered the United [...] Service. But help is at hand, thanks to a maturing computer technology with the capability of finding [...] Known as machine learning or neural networks, such technology uses the power of computer processing [...]
Man and Machine Take the Field, David Olson, The Seattle Times (August [...] "The soccer games may be just for fun, but the technology used to create and operate the players could also help build robots to rescue victims [...] president of the RoboCup Federation."
Robots Scour WTC Wreckage, Leander Kahney, Wired News (September [...]
[...] red eye, behind which lurked an unfathomable intelligence. [...] the computer has become more alien still. Kubrick had no inkling of the networked computer, with its potential for massively distributed intelligence. [...] of the grander design, just as a single ant has no comprehension of the intelligence of the colony.
AAAI News
Spring 2002. [...] first time that AAAI's National Conference has been held in Canada [...] In addition, the program chairs are experimenting with a new format for AAAI.