Inductive Learning


Towards The Inductive Acquisition of Temporal Knowledge

arXiv.org Artificial Intelligence

The ability to predict the future in a given domain can be acquired by discovering empirically, from experience, certain temporal patterns that tend to repeat unerringly. Previous work in time series analysis allows one to make quantitative predictions of the likely values of certain linear variables. Since certain types of knowledge are better expressed in symbolic form, making qualitative predictions based on symbolic representations requires a different approach. A domain-independent methodology called TIM (Time based Inductive Machine) for discovering potentially uncertain temporal patterns from real-time observations using the technique of inductive inference is described here.


Induction, of and by Probability

arXiv.org Artificial Intelligence

This paper examines some methods and ideas underlying the author's successful probabilistic learning systems (PLS), which have proven uniquely effective and efficient in generalization learning or induction. While the emerging principles are generally applicable, this paper illustrates them in heuristic search, which demands noise management and incremental learning. In our approach, both task performance and learning are guided by probability. Probabilities are incrementally normalized and revised, and their errors are located and corrected.


Link prediction for partially observed networks

arXiv.org Machine Learning

Link prediction is one of the fundamental problems in network analysis. In many applications, notably in genetics, a partially observed network may not contain any negative examples of absent edges, which creates a difficulty for many existing supervised learning approaches. We develop a new method which treats the observed network as a sample of the true network with different sampling rates for positive and negative examples. We obtain a relative ranking of potential links by their probabilities, utilizing information on node covariates as well as on network topology. Empirically, the method performs well under many settings, including when the observed network is sparse. We apply the method to a protein-protein interaction network and a school friendship network.


Evaluation of a Supervised Learning Approach for Stock Market Operations

arXiv.org Machine Learning

Stock markets play a fundamental role in national economies, since they allow companies to raise funds for their investments in technology, expansion or infrastructure by selling stocks to the public. At the same time, stocks are, for the stockholders, important assets that can help to maintain or increase the investor's wealth for future use, such as retirement or education. On the other hand, stock prices are volatile and depend on several factors, such as company performance and economic activity. Hence, investors and fund managers usually must constantly monitor the behavior of stock prices in order to make correct trading decisions and to avoid excessive exposure to risky stocks. Data mining techniques have been widely proposed for stock market analysis in order to identify patterns in price time series.


Combining Feature and Prototype Pruning by Uncertainty Minimization

arXiv.org Machine Learning

We focus in this paper on dataset reduction techniques for use in k-nearest neighbor classification. In such a context, feature and prototype selection have always been treated independently by the standard storage reduction algorithms. While this separation is theoretically justified by the fact that each subproblem is NP-hard, we assume in this paper that a joint storage reduction is in fact more intuitive and can in practice provide better results than two independent processes. Moreover, it avoids many distance calculations by progressively removing useless instances during the feature pruning. While standard selection algorithms often optimize accuracy to discriminate among the set of solutions, we use in this paper a criterion based on an uncertainty measure within a nearest-neighbor graph. This choice comes from recent results showing that accuracy is not always the most suitable criterion to optimize. In our approach, a feature or an instance is removed if its deletion improves the information of the graph. Numerous experiments are presented in this paper, and a statistical analysis shows the relevance of our approach and its tolerance in the presence of noise.


Learning from Distributions via Support Measure Machines

arXiv.org Machine Learning

This paper presents a kernel-based discriminative learning framework on probability measures. Rather than relying on large collections of vectorial training examples, our framework learns using a collection of probability distributions that have been constructed to meaningfully represent training data. By representing these probability distributions as mean embeddings in the reproducing kernel Hilbert space (RKHS), we are able to apply many standard kernel-based learning techniques in a straightforward fashion. To accomplish this, we construct a generalization of the support vector machine (SVM) called a support measure machine (SMM). Our analyses of SMMs provide several insights into their relationship to traditional SVMs. Based on such insights, we propose a flexible SVM (Flex-SVM) that places different kernel functions on each training example. Experimental results on both synthetic and real-world data demonstrate the effectiveness of our proposed framework.
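As a rough illustration of the mean-embedding idea mentioned in this abstract, the sketch below computes the inner product between the empirical mean embeddings of two samples, which reduces to the average of a base kernel over all cross-sample pairs. The Gaussian RBF base kernel, the `gamma` parameter, and the function names are assumptions chosen for illustration; this is not the SMM or Flex-SVM implementation from the paper.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian RBF kernel between two points given as equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mean_embedding_kernel(sample_p, sample_q, gamma=1.0):
    """Inner product of the empirical mean embeddings of two samples.

    This equals the average of rbf(x, y) over all pairs with
    x in sample_p and y in sample_q.
    """
    total = sum(rbf(x, y, gamma) for x in sample_p for y in sample_q)
    return total / (len(sample_p) * len(sample_q))

# Two small samples standing in for two "training distributions" (made-up data).
p = [(0.0, 0.0), (1.0, 0.0)]
q = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]

k_pq = mean_embedding_kernel(p, q)
k_qp = mean_embedding_kernel(q, p)
```

A kernel of this form can, in principle, be plugged into any kernel method that accepts a precomputed Gram matrix over distributions.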


Determinantal point processes for machine learning

arXiv.org Machine Learning

Determinantal point processes (DPPs) are elegant probabilistic models of repulsion that arise in quantum physics and random matrix theory. In contrast to traditional structured models like Markov random fields, which become intractable and hard to approximate in the presence of negative correlations, DPPs offer efficient and exact algorithms for sampling, marginalization, conditioning, and other inference tasks. We provide a gentle introduction to DPPs, focusing on the intuitions, algorithms, and extensions that are most relevant to the machine learning community, and show how DPPs can be applied to real-world applications like finding diverse sets of high-quality search results, building informative summaries by selecting diverse sentences from documents, modeling non-overlapping human poses in images or video, and automatically building timelines of important news stories.
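To make the notion of repulsion concrete, the sketch below evaluates subset probabilities under the standard L-ensemble form of a DPP, where a subset Y has probability det(L_Y) / det(L + I). The 3-item kernel matrix `L` is made-up illustrative data, and the helper names are hypothetical; a real application would use a linear algebra library rather than this small hand-written determinant.

```python
from itertools import combinations

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0  # the determinant of the empty matrix is 1
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def dpp_probability(L, subset):
    """P(Y = subset) = det(L_subset) / det(L + I) for an L-ensemble DPP."""
    n = len(L)
    sub = [[L[i][j] for j in subset] for i in subset]
    l_plus_i = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    return det(sub) / det(l_plus_i)

# Items 0 and 1 are highly similar; item 2 is unrelated to both (made-up kernel).
L = [[1.0, 0.9, 0.0],
     [0.9, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

p_similar = dpp_probability(L, (0, 1))   # similar pair: penalized
p_diverse = dpp_probability(L, (0, 2))   # dissimilar pair: favored
total = sum(dpp_probability(L, s)
            for k in range(len(L) + 1)
            for s in combinations(range(len(L)), k))
```

Here the diverse pair receives more probability than the similar pair, and the probabilities over all subsets sum to one.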


A Conditional Multinomial Mixture Model for Superset Label Learning

Neural Information Processing Systems

In the superset label learning problem (SLL), each training instance provides a set of candidate labels of which one is the true label of the instance. As in ordinary regression, the candidate label set is a noisy version of the true label. In this work, we solve the problem by maximizing the likelihood of the candidate label sets of training instances. We propose a probabilistic model, the Logistic Stick-Breaking Conditional Multinomial Model (LSB-CMM), to do the job. The LSB-CMM is derived from the logistic stick-breaking process. It first maps data points to mixture components and then assigns to each mixture component a label drawn from a component-specific multinomial distribution.
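The likelihood objective described in this abstract can be illustrated with a deliberately simplified sketch: assume a classifier already outputs a probability for each label, and score a candidate set by the total probability mass on its members. This is a generic simplification, not the LSB-CMM; the function name, the label names, and the probabilities are all hypothetical.

```python
import math

def superset_log_likelihood(class_probs, candidate_sets):
    """Sum over instances of log(probability mass on the candidate labels).

    class_probs: one dict per instance, mapping label -> predicted probability.
    candidate_sets: one set of candidate labels per instance.
    """
    total = 0.0
    for probs, candidates in zip(class_probs, candidate_sets):
        mass = sum(probs[label] for label in candidates)
        total += math.log(mass)
    return total

# Made-up predictions for two instances over labels "a", "b", "c".
class_probs = [
    {"a": 0.7, "b": 0.2, "c": 0.1},
    {"a": 0.1, "b": 0.3, "c": 0.6},
]
# The first instance is ambiguous between "a" and "b"; the second is fully labeled.
candidate_sets = [{"a", "b"}, {"c"}]

ll = superset_log_likelihood(class_probs, candidate_sets)
```

Maximizing this quantity rewards models that place mass inside each candidate set without requiring the true label to be known.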


Perceptron Learning of SAT

Neural Information Processing Systems

Boolean satisfiability (SAT), as a canonical NP-complete decision problem, is one of the most important problems in computer science. In practice, real-world SAT sentences are drawn from a distribution that may result in efficient algorithms for their solution. Such SAT instances are likely to have shared characteristics and substructures. This work approaches the exploration of a family of SAT solvers as a learning problem. In particular, we relate polynomial time solvability of a SAT subset to a notion of margin between sentences mapped by a feature function into a Hilbert space. Provided this mapping is based on polynomial time computable statistics of a sentence, we show that the existence of a margin between these data points implies the existence of a polynomial time solver for that SAT subset based on the Davis-Putnam-Logemann-Loveland algorithm. Furthermore, we show that a simple perceptron-style learning rule will find an optimal SAT solver with a bounded number of training updates. We derive a linear time computable set of features and show analytically that margins exist for important polynomial special cases of SAT. Empirical results show an order of magnitude improvement over a state-of-the-art SAT solver on a hardware verification task.