On Combining Machine Learning with Decision Making

arXiv.org Machine Learning

Abstract: We present a new application and covering number bound for the framework of "Machine Learning with Operational Costs (MLOC)," which is an exploratory form of decision theory. The MLOC framework incorporates knowledge about how a predictive model will be used for a subsequent task, thus combining machine learning with the decision that is made afterwards. In this work, we use the MLOC framework to study a problem that has implications for power grid reliability and maintenance, called the Machine Learning and Traveling Repairman Problem (ML&TRP). The goal of the ML&TRP is to determine a route for a "repair crew," which repairs nodes on a graph. The repair crew aims to minimize the cost of failures at the nodes, but as in many real situations, the failure probabilities are not known and must be estimated. The MLOC framework allows us to understand how this uncertainty influences the repair route.

Keywords: decision theory · generalization bound · constrained linear function classes · covering numbers · traveling repairman · mixed-integer programming

Funding for Theja Tulabandhula was provided by a Fulbright Fellowship and Xerox Fellowship. Cynthia Rudin's work on this project was funded in part by Con Edison, by the MIT Energy Initiative Seed Fund, and NSF grant IIS-1053407.

1 Introduction

In many domains, it is essential to understand how uncertainty in predictions influences decision-making. The new framework of Machine Learning with Operational Costs (MLOC) (Tulabandhula and Rudin, 2013) provides a mechanism to do this, and is a type of exploratory decision theory. Where usual decision theories provide a single policy that minimizes expected costs, the MLOC framework is able to produce a range of reasonable policies that span the full set of reasonable costs. To do this, the operational cost becomes a regularization term within the machine learning model, and adjusting the regularization constant allows us to explore solutions for all reasonable costs. This gives decision makers a way to understand the uncertainty in their predictive model in terms of something they can grasp: uncertainty in the cost to solve the problem. The MLOC framework can also be used in another way, namely to incorporate prior knowledge about the cost to produce a better predictive model.
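The idea of treating the operational cost as a regularization term can be written schematically as follows. This is only a sketch of the general shape of such an objective, not the paper's exact formulation: the symbols $\ell$ (training loss), $R$ (a standard regularizer), $\mathrm{OpCost}$ (cost of the best decision $\pi$ under model $f$), and the constants $C_1$, $C_2$ are notation assumed here for illustration.

```latex
% Schematic regularized objective (assumed notation, for illustration only):
%   f       : predictive model chosen from a class F
%   \ell    : training loss on labeled pairs (x_i, y_i)
%   R(f)    : a standard regularizer, weighted by C_2 >= 0
%   OpCost  : operational cost of decision \pi (e.g. a repair route) given f
%   C_1     : regularization constant on the operational cost
\[
  f^{*} \in \arg\min_{f \in \mathcal{F}}
    \sum_{i=1}^{n} \ell\bigl(f(x_i), y_i\bigr)
    \; + \; C_2 \, R(f)
    \; + \; C_1 \, \min_{\pi \in \Pi} \mathrm{OpCost}(\pi, f)
\]
```

Under this reading, sweeping $C_1$ over a range of values traces out models whose predictions remain plausible on the training data while implying lower or higher operational costs, which is one way to interpret "exploring solutions for all reasonable costs."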


Logics of formal inconsistency arising from systems of fuzzy logic

arXiv.org Artificial Intelligence

This paper proposes the meeting of fuzzy logic with paraconsistency in a very precise and foundational way. Specifically, in this paper we introduce expansions of the fuzzy logic MTL by means of primitive operators for consistency and inconsistency in the style of the so-called Logics of Formal Inconsistency (LFIs). The main novelty of the present approach is the definition of postulates for this type of operator over MTL-algebras, leading to the definition and axiomatization of a family of logics, expansions of MTL, whose degree-preserving counterparts are paraconsistent and moreover LFIs.


Support Vector Machine Model for Currency Crisis Discrimination

arXiv.org Machine Learning

The Support Vector Machine (SVM) is a powerful classification technique based on the idea of structural risk minimization. The use of a kernel function enables the curse of dimensionality to be addressed. However, the proper kernel function for a given problem depends on the specific dataset, and as such there is no good method for choosing a kernel function. In this paper, SVM is used to build empirical models of currency crises in Argentina. An estimation technique is developed by training the model on a real-life data set; it provides reasonably accurate model outputs and helps policy makers identify situations in which a currency crisis may happen. Third- and fourth-order polynomial kernels are generally the best choice for achieving high generalization performance of the classifier. SVM has a high level of maturity, with algorithms that are simple, easy to implement, tolerant of the curse of dimensionality, and good in empirical performance. The satisfactory results show that the currency crisis situation is properly emulated using only a small fraction of the database, and that the model could be used as an evaluation tool as well as an early warning system. To the best of our knowledge, this is the first work on an SVM approach to currency crisis evaluation for Argentina.
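For readers unfamiliar with the polynomial kernels mentioned above, the following is a minimal sketch of the standard form $K(x, z) = (\gamma\, x \cdot z + c_0)^d$. The function name and the parameter names `degree`, `gamma`, and `coef0` are assumptions made here for illustration; the abstract does not state which parameterization or software the authors used.

```python
def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """Standard polynomial kernel K(x, z) = (gamma * <x, z> + coef0) ** degree.

    Parameter names are illustrative assumptions, not taken from the paper.
    """
    if len(x) != len(z):
        raise ValueError("x and z must have the same length")
    # Inner product of the two feature vectors.
    dot = sum(xi * zi for xi, zi in zip(x, z))
    return (gamma * dot + coef0) ** degree


# Third- and fourth-order kernels evaluated on the same pair of points.
k3 = polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=3)
k4 = polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=4)
```

A higher degree lets the decision boundary bend more, at the price of a greater risk of overfitting, which is presumably why the choice of order matters for generalization.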


Multimodal Distributional Semantics

Journal of Artificial Intelligence Research

Distributional semantic models derive computational representations of word meaning from the patterns of co-occurrence of words in text. Such models have been a success story of computational linguistics, being able to provide reliable estimates of semantic relatedness for the many semantic tasks requiring them. However, distributional models extract meaning information exclusively from text, which is an extremely impoverished basis compared to the rich perceptual sources that ground human semantic knowledge. We address the lack of perceptual grounding of distributional models by exploiting computer vision techniques that automatically identify discrete visual words in images, so that the distributional representation of a word can be extended to also encompass its co-occurrence with the visual words of images it is associated with. We propose a flexible architecture to integrate text- and image-based distributional information, and we show in a set of empirical tests that our integrated model is superior to the purely text-based approach, and it provides somewhat complementary semantic information with respect to the latter.


Multiclass Data Segmentation using Diffuse Interface Methods on Graphs

arXiv.org Machine Learning

We present two graph-based algorithms for multiclass segmentation of high-dimensional data. The algorithms use a diffuse interface model based on the Ginzburg-Landau functional, related to total variation compressed sensing and image processing. A multiclass extension is introduced using the Gibbs simplex, with the functional's double-well potential modified to handle the multiclass case. The first algorithm minimizes the functional using a convex splitting numerical scheme. The second algorithm uses a graph adaptation of the classical numerical Merriman-Bence-Osher (MBO) scheme, which alternates between diffusion and thresholding. We demonstrate the performance of both algorithms experimentally on synthetic data, grayscale and color images, and several benchmark data sets such as MNIST, COIL and WebKB. We also make use of fast numerical solvers for finding the eigenvectors and eigenvalues of the graph Laplacian, and take advantage of the sparsity of the matrix. Experiments indicate that the results are competitive with or better than the current state-of-the-art multiclass segmentation algorithms.
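The alternation between diffusion and thresholding can be illustrated with a small sketch. This is not the authors' implementation: it swaps in explicit Euler steps of the unnormalized graph Laplacian where the paper relies on fast eigensolvers, and the function name, parameters, and toy graph are all assumptions made for illustration.

```python
def graph_mbo(weights, known, n_classes, dt=0.1, n_diffuse=3, n_iter=5):
    """Toy multiclass MBO-style scheme on a graph (illustrative sketch only).

    weights : symmetric n x n list of lists of non-negative edge weights
    known   : dict {node index: class label} for the labeled nodes
    """
    n = len(weights)

    def one_hot(c):
        # Vertex of the Gibbs simplex corresponding to class c.
        return [1.0 if k == c else 0.0 for k in range(n_classes)]

    # Start on the simplex: uniform for unlabeled nodes, a vertex for labeled ones.
    u = [[1.0 / n_classes] * n_classes for _ in range(n)]
    for i, c in known.items():
        u[i] = one_hot(c)

    for _ in range(n_iter):
        # Diffusion: explicit Euler steps of du/dt = -L u with L = D - W.
        for _ in range(n_diffuse):
            u = [
                [
                    u[i][k]
                    + dt * sum(weights[i][j] * (u[j][k] - u[i][k]) for j in range(n))
                    for k in range(n_classes)
                ]
                for i in range(n)
            ]
        # Thresholding: project each node to the nearest simplex vertex.
        u = [one_hot(row.index(max(row))) for row in u]
        # Fidelity: re-impose the known labels.
        for i, c in known.items():
            u[i] = one_hot(c)

    return [row.index(max(row)) for row in u]


# Two triangles joined by one weak edge (2-3); one labeled node in each.
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = graph_mbo(W, known={0: 0, 5: 1}, n_classes=2)
```

The explicit step is only stable when `dt` times the largest weighted degree stays well below 1, which is one reason a spectral treatment of the Laplacian is attractive on larger graphs.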


Report on the 21st International Conference on Case-Based Reasoning

AI Magazine

... Springs, NY. ICCBR is the annual meeting of the CBR community, and it also featured a workshop program consisting of three workshops. The main conference track featured 16 research paper presentations, nine posters, and two invited speakers. The papers and posters reflected the state of the art of case-based reasoning, dealing both with open problems at the core of CBR (especially in similarity assessment, case adaptation, and case-based maintenance) and with trending applications of CBR (especially recommender systems and computer games), as well as the intersections of CBR with other areas such as multiagent systems. The first invited speaker, Igor Jurisica from the Ontario Cancer Institute and the University of Toronto, spoke about how to scale up case-based reasoning for "big data" applications. The Case-Based Reasoning in Health Sciences workshop, organized by Isabelle Bichindaritz, Cindy Marling, and Stefania Montani, and the EXPPORT workshop (Experience Reuse: Provenance, Process-Orientation and Traces), organized by David Leake, Béatrice Fuchs, Juan A. Recio Garcia, and Stefania Montani, were held jointly and dealt with how to deal with data represented ... CDPHP, was the local chair; William E. ... University, and Jonathan Rubin, from the Palo Alto Research Center, were the ...


DynaLearn – An Intelligent Learning Environment for Learning Conceptual Knowledge

AI Magazine

Articulating thought in computer-based media is a powerful means for humans to develop their understanding of phenomena. We have created DynaLearn, an Intelligent Learning Environment that allows learners to acquire conceptual knowledge by constructing and simulating qualitative models of how systems behave. DynaLearn uses diagrammatic representations for learners to express their ideas. The environment is equipped with semantic technology components capable of generating knowledge-based feedback, and virtual characters enhancing the interaction with learners. Teachers have created course material, and successful evaluation studies have been performed. This article presents an overview of the DynaLearn system.


Distinguishing noise from chaos: objective versus subjective criteria using Horizontal Visibility Graph

arXiv.org Machine Learning

A recently proposed methodology called the Horizontal Visibility Graph (HVG) [Luque et al., Phys. Rev. E, 80, 046103 (2009)], which constitutes a geometrical simplification of the well-known Visibility Graph algorithm [Lacasa et al., Proc. Natl. Acad. Sci. U.S.A. 105, 4972 (2008)], has been used to study the distinction between deterministic and stochastic components in time series [L. Lacasa and R. Toral, Phys. Rev. E, 82, 036120 (2010)]. Specifically, the authors propose that the node degree distribution of these processes follows an exponential functional of the form $P(\kappa)\sim \exp(-\lambda\,\kappa)$, in which $\kappa$ is the node degree and $\lambda$ is a positive parameter able to distinguish between deterministic (chaotic) and stochastic (uncorrelated and correlated) dynamics. In this work, we investigate the characteristics of the node degree distributions constructed by using HVG, for time series corresponding to 28 chaotic maps and 3 different stochastic processes. We thoroughly study the methodology proposed by Lacasa and Toral, finding several cases for which their hypothesis is not valid. We propose a methodology that uses the HVG together with Information Theory quantifiers. An extensive and careful analysis of the node degree distributions obtained by applying HVG allows us to conclude that the Fisher-Shannon information plane is a remarkable tool able to graphically represent the different nature, deterministic or stochastic, of the systems under study.
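To make the construction concrete, here is a minimal sketch of how a horizontal visibility graph and its node degree distribution $P(\kappa)$ can be computed from a time series. It assumes the usual HVG criterion, in which two points are linked when every value strictly between them is lower than both; the function names are hypothetical and the quadratic-time loop is chosen for clarity, not speed.

```python
from collections import Counter


def horizontal_visibility_edges(series):
    """Return the edges (i, j), i < j, of the horizontal visibility graph.

    Points i and j are linked when every value strictly between them is
    lower than both series[i] and series[j].
    """
    edges = []
    n = len(series)
    for i in range(n - 1):
        # Highest value seen strictly between i and the current j.
        blocker = float("-inf")
        for j in range(i + 1, n):
            if blocker < min(series[i], series[j]):
                edges.append((i, j))
            blocker = max(blocker, series[j])
            # Nothing beyond a value at least as high as series[i] is visible.
            if blocker >= series[i]:
                break
    return edges


def degree_distribution(series):
    """Empirical node degree distribution P(kappa) of the HVG."""
    degree = Counter()
    for i, j in horizontal_visibility_edges(series):
        degree[i] += 1
        degree[j] += 1
    n = len(series)
    counts = Counter(degree[i] for i in range(n))
    return {kappa: count / n for kappa, count in sorted(counts.items())}


edges = horizontal_visibility_edges([3, 1, 2, 4])
p_kappa = degree_distribution([3, 1, 2, 4])
```

Given such a distribution for a long series, the parameter $\lambda$ could then be estimated from the slope of $\log P(\kappa)$ against $\kappa$; that fitting step is left out of this sketch.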


Belief Revision in Structured Probabilistic Argumentation

arXiv.org Artificial Intelligence

In real-world applications, knowledge bases consisting of all the information at hand for a specific domain, along with the current state of affairs, are bound to contain contradictory data coming from different sources, as well as data with varying degrees of uncertainty attached. Likewise, an important aspect of the effort associated with maintaining knowledge bases is deciding what information is no longer useful; pieces of information (such as intelligence reports) may be outdated, may come from sources that have recently been discovered to be of low quality, or abundant evidence may be available that contradicts them. In this paper, we propose a probabilistic structured argumentation framework that arises from the extension of Presumptive Defeasible Logic Programming (PreDeLP) with probabilistic models, and argue that this formalism is capable of addressing the basic issues of handling contradictory and uncertain data. Then, to address the last issue, we focus on the study of non-prioritized belief revision operations over probabilistic PreDeLP programs. We propose a set of rationality postulates -- based on well-known ones developed for classical knowledge bases -- that characterize how such operations should behave, and study a class of operators along with theoretical relationships with the proposed postulates, including a representation theorem stating the equivalence between this class and the class of operators characterized by the postulates.


Nonparametric Multi-group Membership Model for Dynamic Networks

Neural Information Processing Systems

Statistical analysis of social networks and other relational data is becoming an increasingly important problem as the scope and availability of network data increase. Network data--such as the friendships in a social network--is often dynamic in the sense that relations between entities rise and decay over time. A fundamental problem in the analysis of such dynamic network data is to extract a summary of the common structure and the dynamics of the underlying relations between entities. Accurate models of structure and dynamics of network data have many applications. They allow us to predict missing relationships [20, 21, 23], recommend potential new relations [2], identify clusters and groups of nodes [1, 29], forecast future links [4, 9, 11, 24], and even predict group growth and longevity [15]. Here we present a new approach to modeling network dynamics by considering time-evolving interactions between groups of nodes as well as the arrival and departure dynamics of individual nodes to these groups. We develop a dynamic network model, the Dynamic Multi-group Membership Graph Model, that identifies the birth and death of individual groups as well as the dynamics of nodes joining and leaving groups in order to explain changes in the underlying network linking structure. Our nonparametric model considers an infinite number of latent groups, where each node can belong to multiple groups simultaneously. We capture the evolution of individual node group memberships via a Factorial Hidden Markov model.