 Learning Graphical Models


A Hybrid Monte Carlo Architecture for Parameter Optimization

arXiv.org Machine Learning

Much recent research in Bayesian learning has addressed the optimization of hyper-parameters via Gaussian process regression. These methods rely chiefly on maximizing the expected improvement of a score function with respect to adjustments in the hyper-parameters. In this work, we present a novel algorithm that exploits notions of confidence intervals and uncertainties to enable the discovery of the best optimum within a targeted region of the parameter space. We demonstrate the efficacy of our algorithm on machine learning problems and show cases where it is competitive with the method of maximizing expected improvement.
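For orientation, the expected-improvement baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard closed-form expected improvement under a Gaussian posterior, not the paper's confidence-interval algorithm; the function name and the `xi` exploration margin are assumptions introduced here.

```python
import math


def expected_improvement(mu, sigma, best, xi=0.0):
    """Closed-form expected improvement for a maximization problem.

    mu, sigma: posterior mean and standard deviation at a candidate point
               (e.g. from a Gaussian process regression model).
    best:      best score observed so far.
    xi:        hypothetical exploration margin (assumption, defaults to 0).
    """
    gain = mu - best - xi
    if sigma <= 0.0:
        # No posterior uncertainty: the improvement is deterministic.
        return max(gain, 0.0)
    z = gain / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return gain * cdf + sigma * pdf
```

A candidate whose mean equals the current best still has positive expected improvement whenever `sigma > 0`, which is what drives exploration of uncertain regions.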


Kaggle LSHTC4 Winning Solution

arXiv.org Artificial Intelligence

Our winning submission to the 2014 Kaggle competition for Large Scale Hierarchical Text Classification (LSHTC) consists mostly of an ensemble of sparse generative models extending Multinomial Naive Bayes. The base-classifiers consist of hierarchically smoothed models combining document, label, and hierarchy level Multinomials, with feature pre-processing using variants of TF-IDF and BM25. Additional diversification is introduced by different types of folds and random search optimization for different measures. The ensemble algorithm optimizes macroFscore by predicting the documents for each label, instead of the usual prediction of labels per document. Scores for documents are predicted by weighted voting of base-classifier outputs with a variant of Feature-Weighted Linear Stacking. The number of documents per label is chosen using label priors and thresholding of vote scores. This document describes the models and software used to build our solution. Reproducing the results for our solution can be done by running the scripts included in the Kaggle package. A package omitting precomputed result files is also distributed. All code is open source, released under GNU GPL 2.0, and GPL 3.0 for Weka and Meka dependencies.


Topic-Based Dissimilarity and Sensitivity Models for Translation Rule Selection

Journal of Artificial Intelligence Research

Translation rule selection is the task of selecting appropriate translation rules for an ambiguous source-language segment. As translation ambiguities are pervasive in statistical machine translation, we introduce two topic-based models for translation rule selection that incorporate global topic information into translation disambiguation. We associate each synchronous translation rule with source- and target-side topic distributions. With these topic distributions, we propose a topic dissimilarity model to select desirable (less dissimilar) rules by imposing penalties on rules whose topic distributions are highly dissimilar to those of the given documents. In order to encourage the use of non-topic-specific translation rules, we also present a topic sensitivity model to balance translation rule selection between generic rules and topic-specific rules. Furthermore, we project target-side topic distributions onto the source-side topic model space so that we can benefit from topic information of both the source and target language. We integrate the proposed topic dissimilarity and sensitivity models into hierarchical phrase-based machine translation for synchronous translation rule selection. Experiments show that our topic-based translation rule selection model can substantially improve translation quality.
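The idea of preferring rules whose topic distribution is close to that of the document can be sketched as below. The abstract does not say which dissimilarity measure is used; Hellinger distance is assumed here purely for illustration, and the function names and example rule labels are hypothetical.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length.

    Assumption: chosen for illustration; the abstract does not name the measure.
    Returns a value in [0, 1], where 0 means identical distributions.
    """
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


def pick_rule(rule_topics, doc_topics):
    """Return the rule label whose topic distribution is least dissimilar
    to the document's topic distribution."""
    return min(rule_topics, key=lambda rule: hellinger(rule_topics[rule], doc_topics))


# Hypothetical two-topic example: the document leans heavily toward topic 0.
rule_topics = {"generic": [0.5, 0.5], "finance": [0.9, 0.1]}
chosen = pick_rule(rule_topics, [0.8, 0.2])
```

In a real decoder the dissimilarity would more likely enter as a penalty feature alongside other model scores than as a hard selection; the hard `min` here only keeps the sketch short.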


Reproducing kernel Hilbert space based estimation of systems of ordinary differential equations

arXiv.org Machine Learning

Nonlinear systems of differential equations have attracted interest in fields such as systems biology, ecology, and biochemistry, due to their flexibility and their ability to describe dynamical systems. Despite the importance of such models in many branches of science, they have not been the focus of systematic statistical analysis until recently. In this work we propose a general approach to estimating the parameters of systems of differential equations measured with noise. Our methodology is based on the maximization of the penalized likelihood, where the system of differential equations is used as a penalty. To do so, we use a Reproducing Kernel Hilbert Space approach that allows us to formulate the estimation problem as an unconstrained numeric maximization problem that is easy to solve. The proposed method is tested with synthetically simulated data, and it is used to estimate the unobserved transcription factor CdaR in Streptomyces coelicolor using gene expression data of the genes it regulates.

Keywords: System of ordinary differential equations, differential operator, reproducing kernel Hilbert space, gene regulatory network

1. Introduction

Despite the fact that differential equations are a common modelling tool within science and engineering, statistical methods for estimating such models have only received widespread attention during the last few years. The difficulty of solving differential equations in general has been a major stumbling block for efficient statistical procedures.


Integrating Vague Association Mining with Markov Model

arXiv.org Artificial Intelligence

The growing use of the World Wide Web raises the need to predict users' web page requests. The most widely used approach to predicting web pages is the pattern discovery process of Web usage mining. This process inevitably involves many techniques, such as Markov models, association rules, and clustering. Fuzzy theory has been combined with different techniques for better results. Our focus is on Markov models. This paper introduces vague rules with Markov models, using vague set theory, for greater accuracy.
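As background for the Markov-model component, a plain first-order Markov predictor over page sequences can be sketched as follows. It does not include the vague rules the paper proposes; the function names and the example sessions are hypothetical.

```python
from collections import Counter, defaultdict


def train_transitions(sessions):
    """Count page-to-page transitions observed in a list of browsing sessions.

    Each session is a list of page identifiers in visit order.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for current_page, next_page in zip(session, session[1:]):
            counts[current_page][next_page] += 1
    return counts


def predict_next(counts, page):
    """Return the most frequently observed successor of `page`,
    or None if the page was never seen with a successor."""
    if page not in counts:
        return None
    return counts[page].most_common(1)[0][0]


# Hypothetical access log with three short sessions.
sessions = [
    ["home", "news", "sports"],
    ["home", "news", "weather"],
    ["home", "news", "sports"],
]
counts = train_transitions(sessions)
prediction = predict_next(counts, "news")
```

Higher-order models condition on the last several pages instead of one, which enlarges the state space; pruning the resulting states is the role the abstract assigns to its rules.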


A Vague Improved Markov Model Approach for Web Page Prediction

arXiv.org Artificial Intelligence

Today, most information in all areas is available over the web. This increases web utilization and attracts the interest of researchers in improving the effectiveness of web access and web utilization. As the number of web clients increases, bandwidth is shared, which decreases web access efficiency. Web page prefetching improves the effectiveness of web access by making the next required web page available before the user requests it. It is a form of intelligent predictive mining that analyzes the user's web access history and predicts the next page. In this work, a vague improved Markov model is presented to perform the prediction, and vague rules are suggested to perform pruning at different levels of the Markov model. Once the prediction table is generated, association mining is applied to identify the most effective next page. The resulting integrated model is suggested to improve prediction accuracy and effectiveness.


Factored Performance Functions with Structural Representation in Continuous Time Bayesian Networks

AAAI Conferences

The continuous time Bayesian network (CTBN) is a probabilistic graphical model that enables reasoning about complex, interdependent, and continuous-time subsystems. The model uses nodes to denote subsystems and arcs to denote conditional dependence. This dependence manifests in how the dynamics of a subsystem change based on the current states of its parents in the network. While the original CTBN definition allows users to specify the dynamics of how the system evolves, users might also want to place value expressions over the dynamics of the model in the form of performance functions. We formalize these performance functions for the CTBN and show how they can be factored in the same way as the network, allowing what we argue is a more intuitive and explicit representation. For cases in which a performance function must involve multiple nodes, we show how to augment the structure of the CTBN to account for the performance interaction while maintaining the factorization of a single performance function for each node.


An Empirical Evaluation of Costs and Benefits of Simplifying Bayesian Networks by Removing Weak Arcs

AAAI Conferences

We report the results of an empirical evaluation of structural simplification of Bayesian networks by removing weak arcs. We conduct a series of experiments on six networks built from real data sets selected from the UC Irvine Machine Learning Repository. We systematically remove arcs from the weakest to the strongest, relying on four measures of arc strength, and measure the classification accuracy of the resulting simplified models. Our results show that removing up to roughly 20 percent of the weakest arcs in a network has minimal effect on its classification accuracy. At the same time, structural simplification of networks leads to significant reduction of both the amount of memory taken by the clique tree and the amount of computation needed to perform inference.


Hybrid Intelligence for Semantics-Enhanced Networking Operations

AAAI Conferences

Endowing the semantically-oblivious Internet with intelligence would advance the Internet's capability to learn traffic behavior and to predict future events. In this paper, we propose a hybrid intelligence memory system, or NetMem, for network-semantics reasoning and targeting Internet intelligence. NetMem provides a memory structure, mimicking human memory functionalities, via short-term memory (StM) and long-term memory (LtM). NetMem can build a runtime-accessible dynamic network-concept ontology (DNCO) at different levels of granularity. We integrate Latent Dirichlet Allocation (LDA) and Hidden Markov Models (HMM) to extract network semantics based on learning patterns and recognizing features with syntax and semantic dependencies. Due to the large scale and high dimensionality of Internet data, we utilize the Locality Sensitive Hashing (LSH) algorithm for data dimensionality reduction. Simulation results using real network traffic show that NetMem with hybrid intelligence learns traffic data semantics effectively and efficiently even with a significant reduction in the volume and dimensionality of data, thus enhancing Internet intelligence for self-/situation-awareness and event/behavior prediction.


Probabilistic Failure Isolation for Cognitive Robots

AAAI Conferences

Robots may encounter undesirable outcomes due to failures during the execution of their plans in the physical world. Failures should be detected, and their underlying causes should be found by the robot in order to handle these failure situations efficiently. Sometimes there may be more than one cause of a failure, and the causes are not necessarily related to the action in execution. In this paper, we propose a temporal and Hierarchical Hidden Markov Model (HHMM) based failure isolation method. These HHMMs run in parallel to determine the causes of unexpected deviations. Experiments on our Pioneer 3-AT robot show that our method successfully isolates failures and suggests possible causes.