The Information Bottleneck EM Algorithm

arXiv.org Machine Learning

Learning with hidden variables is a central challenge in probabilistic graphical models that has important implications for many real-life problems. The classical approach uses the Expectation Maximization (EM) algorithm. This algorithm, however, can get trapped in local maxima. In this paper we explore a new approach that is based on the Information Bottleneck principle. In this approach, we view the learning problem as a tradeoff between two information theoretic objectives. The first is to make the hidden variables uninformative about the identity of specific instances. The second is to make the hidden variables informative about the observed attributes. By exploring different tradeoffs between these two objectives, we can gradually converge on a high-scoring solution. As we show, the resulting Information Bottleneck Expectation Maximization (IB-EM) algorithm finds solutions that are superior to those of standard EM methods.


Wikipedia Vandalism Detection Through Machine Learning: Feature Review and New Proposals: Lab Report for PAN at CLEF 2010

arXiv.org Artificial Intelligence

Wikipedia is an online encyclopedia built upon the collaborations of thousands of editors. Its collaboration model is simple: anyone can edit any article at any time. This has made possible the great success of Wikipedia, but it comes with its own problems, one of them being destructive edits. There are many ways in which an edit can be destructive for Wikipedia, such as lobbying, spam, vandalism, tests, etc. In PAN 2010 Lab's Task 2 we are focused on automatic detection of vandalism. The English Wikipedia defines vandalism as: [...] any addition, removal, or change of content made in a deliberate attempt to compromise the integrity of Wikipedia.


Exploiting Locality in Searching the Web

arXiv.org Artificial Intelligence

Published experiments on spidering the Web suggest that, given training data in the form of a (relatively small) subgraph of the Web containing a subset of a selected class of target pages, it is possible to conduct a directed search and find additional target pages significantly faster (with fewer page retrievals) than by performing a blind or uninformed random or systematic search, e.g., breadth-first search. If true, this claim motivates a number of practical applications. Unfortunately, these experiments were carried out in specialized domains or under conditions that are difficult to replicate. We present and apply an experimental framework designed to reexamine and resolve the basic claims of the earlier work, so that the supporting experiments can be replicated and built upon. We provide high-performance tools for building experimental spiders, make use of the ground truth and static nature of the WT10g TREC Web corpus, and rely on simple, well-understood machine learning techniques to conduct our experiments. In this paper, we describe the basic framework, motivate the experimental design, and report on our findings supporting and qualifying the conclusions of the earlier research.


Optimal Limited Contingency Planning

arXiv.org Artificial Intelligence

For a given problem, the optimal Markov policy can be considered as a conditional or contingent plan containing a (potentially large) number of branches. Unfortunately, there are applications where it is desirable to strictly limit the number of decision points and branches in a plan. For example, it may be that plans must later undergo more detailed simulation to verify correctness and safety, or that they must be simple enough to be understood and analyzed by humans. As a result, it may be necessary to limit consideration to plans with only a small number of branches. This raises the question of how one goes about finding optimal plans containing only a limited number of branches. In this paper, we present an any-time algorithm for optimal k-contingency planning (OKP). It is the first optimal algorithm for limited contingency planning that is not an explicit enumeration of possible contingent plans. By modelling the problem as a Partially Observable Markov Decision Process, it implements the Bellman optimality principle and prunes the solution space. We present experimental results of applying this algorithm to some simple test cases.


Dealing with uncertainty in fuzzy inductive reasoning methodology

arXiv.org Artificial Intelligence

The aim of this research is to develop a strategy for reasoning under uncertainty in the context of the Fuzzy Inductive Reasoning (FIR) methodology. FIR emerged from the General Systems Problem Solving developed by G. Klir. It is a data-driven methodology based on systems behavior rather than on structural knowledge. It is a very useful tool for both the modeling and the prediction of those systems for which no previous structural knowledge is available. FIR reasoning is based on pattern rules synthesized from the available data. The size of the pattern rule base can be very large, making the prediction process quite difficult. In order to reduce the size of the pattern rule base, it is possible to automatically extract classical Sugeno fuzzy rules starting from the set of pattern rules. The Sugeno rule base preserves pattern rules knowledge as much as possible. In this process some information is lost but robustness is considerably increased. In the forecasting process either the pattern rule base or the Sugeno fuzzy rule base can be used. The first option is desirable when the computational resources make it possible to deal with the overall pattern rule base or when the extracted fuzzy rules are not accurate enough due to uncertainty associated with the original data. In the second option, the prediction process is done by means of the classical Sugeno inference system. If the amount of uncertainty associated with the data is small, the predictions obtained using the Sugeno fuzzy rule base will be very accurate. In this paper a mixed pattern/fuzzy rules strategy is proposed to deal with uncertainty in such a way that the best of both perspectives is used. Areas in the data space with a higher level of uncertainty are identified by means of the so-called error models. The prediction process in these areas makes use of a mixed pattern/fuzzy rules scheme, whereas areas identified with a lower level of uncertainty only use the Sugeno fuzzy rule base. The proposed strategy is applied to a real biomedical system, i.e., the central nervous system control of the cardiovascular system.


On Local Optima in Learning Bayesian Networks

arXiv.org Artificial Intelligence

This paper proposes and evaluates the k-greedy equivalence search algorithm (KES) for learning Bayesian networks (BNs) from complete data. The main characteristic of KES is that it allows a trade-off between greediness and randomness, thus exploring different good local optima. When greediness is set at maximum, KES corresponds to the greedy equivalence search algorithm (GES). When greediness is kept at minimum, we prove that under mild assumptions KES asymptotically returns any inclusion optimal BN with nonzero probability. Experimental results for both synthetic and real data are reported showing that KES often finds better local optima than GES. Moreover, we use KES to experimentally confirm that the number of different local optima is often huge.
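
The trade-off between greediness and randomness can be illustrated with a minimal sketch. It assumes a toy one-dimensional search space rather than equivalence classes of Bayesian networks, and the names `k_greedy_search`, `SCORES`, `score`, and `neighbors` are hypothetical, not taken from the paper. The selection rule (draw a random subset of the improving neighbors whose size is governed by k, then take the best member) is an assumption about how such a trade-off might be realized, not a statement of the authors' exact procedure.

```python
import math
import random


def k_greedy_search(start, neighbors, score, k, rng):
    """Hill climbing with a greediness knob k in [0, 1] (illustrative only)."""
    current = start
    while True:
        # Neighbors that strictly improve on the current state.
        better = [n for n in neighbors(current) if score(n) > score(current)]
        if not better:
            return current  # no improving neighbor: a local optimum
        # k = 1 considers every improving neighbor (pure greedy);
        # k = 0 considers a single randomly chosen one (maximally random).
        size = max(1, math.ceil(k * len(better)))
        candidates = rng.sample(better, size)
        current = max(candidates, key=score)


# Toy landscape (assumed for illustration): local optimum at index 1,
# global optimum at index 5.
SCORES = [0, 5, 1, 2, 3, 9, 0]


def score(i):
    return SCORES[i]


def neighbors(i):
    return [j for j in (i - 1, i + 1) if 0 <= j < len(SCORES)]


# Maximum greediness: from index 2 the steepest step leads to index 1.
greedy_result = k_greedy_search(2, neighbors, score, 1.0, random.Random(0))

# Minimum greediness: different seeds can end in different local optima.
random_results = {
    k_greedy_search(2, neighbors, score, 0.0, random.Random(seed))
    for seed in range(50)
}
```

On this toy landscape the fully greedy run always stops at the lower peak, while the randomized runs reach both peaks across seeds, which mirrors the abstract's point that randomness lets the search explore different local optima.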


Marginalizing Out Future Passengers in Group Elevator Control

arXiv.org Artificial Intelligence

Group elevator scheduling is an NP-hard sequential decision-making problem with unbounded state spaces and substantial uncertainty. Decision-theoretic reasoning plays a surprisingly limited role in fielded systems. A new opportunity for probabilistic methods has opened with the recent discovery of a tractable solution for the expected waiting times of all passengers in the building, marginalized over all possible passenger itineraries [Nikovski and Brand, 2003]. Though commercially competitive, this solution does not contemplate future passengers. Yet in up-peak traffic, the effects of future passengers arriving at the lobby and entering elevator cars can dominate all waiting times. We develop a probabilistic model of how these arrivals affect the behavior of elevator cars at the lobby, and demonstrate how this model can be used to very significantly reduce the average waiting time of all passengers.


A Linear Belief Function Approach to Portfolio Evaluation

arXiv.org Artificial Intelligence

By elaborating on the notion of linear belief functions (Dempster 1990; Liu 1996), we propose an elementary approach to knowledge representation for expert systems using linear belief functions. We show how to use basic matrices to represent market information and financial knowledge, including complete ignorance, statistical observations, subjective speculations, distributional assumptions, linear relations, and empirical asset pricing models. We then appeal to Dempster's rule of combination to integrate the knowledge for assessing an overall belief of portfolio performance, and updating the belief by incorporating additional information. We use an example of three gold stocks to illustrate the approach.
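
For readers unfamiliar with Dempster's rule of combination, the following is a minimal sketch of the rule on a small finite frame of discernment. It is not the matrix-based linear belief function form used in the paper; the function name `dempster_combine`, the frame `{"up", "down"}`, and the mass values are all hypothetical and chosen only for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions given as {frozenset: mass} dictionaries."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the focal sets.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass on disjoint focal sets counts as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Hypothetical evidence about whether an asset price goes up or down.
FRAME = frozenset({"up", "down"})
m_analyst = {frozenset({"up"}): 0.6, FRAME: 0.4}
m_history = {frozenset({"down"}): 0.5, FRAME: 0.5}

belief = dempster_combine(m_analyst, m_history)
```

Here the two sources conflict on a mass of 0.3, so the remaining mass of 0.7 is renormalized: the combined mass is 3/7 on "up", 2/7 on "down", and 2/7 left uncommitted on the whole frame.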


Loopy Belief Propagation as a Basis for Communication in Sensor Networks

arXiv.org Artificial Intelligence

Sensor networks are an exciting new kind of computer system. Consisting of a large number of tiny, cheap computational devices physically distributed in an environment, they gather and process data about the environment in real time. One of the central questions in sensor networks is what to do with the data, i.e., how to reason with it and how to communicate it. This paper argues that the lessons of the UAI community, in particular that one should produce and communicate beliefs rather than raw sensor values, are highly relevant to sensor networks. We contend that loopy belief propagation is particularly well suited to communicating beliefs in sensor networks, due to its compact implementation and distributed nature. We investigate the ability of loopy belief propagation to function under the stressful conditions likely to prevail in sensor networks. Our experiments show that it performs well and degrades gracefully. It converges to appropriate beliefs even in highly asynchronous settings where some nodes communicate far less frequently than others; it continues to function if some nodes fail to participate in the propagation process; and it can track changes in the environment that occur while beliefs are propagating. As a result, we believe that sensor networks present an important application opportunity for UAI.


New Advances in Inference by Recursive Conditioning

arXiv.org Artificial Intelligence

Recursive Conditioning (RC) was introduced recently as the first any-space algorithm for inference in Bayesian networks which can trade time for space by varying the size of its cache at the increment needed to store a floating point number. Under full caching, RC has an asymptotic time and space complexity which is comparable to mainstream algorithms based on variable elimination and clustering (exponential in the network treewidth and linear in its size). We show two main results about RC in this paper. First, we show that its actual space requirements under full caching are much more modest than those needed by mainstream methods and study the implications of this finding. Second, we show that RC can effectively deal with determinism in Bayesian networks by employing standard logical techniques, such as unit resolution, allowing a significant reduction in its time requirements in certain cases. We illustrate our results using a number of benchmark networks, including the very challenging ones that arise in genetic linkage analysis.