Directed Networks
Bayesian Out-Trees
A Bayesian treatment of latent directed graph structure for non-iid data is provided where each child datum is sampled with a directed conditional dependence on a single unknown parent datum. The latent graph structure is assumed to lie in the family of directed out-tree graphs which leads to efficient Bayesian inference. The latent likelihood of the data and its gradients are computable in closed form via Tutte's directed matrix tree theorem using determinants and inverses of the out-Laplacian. This novel likelihood subsumes iid likelihood, is exchangeable and yields efficient unsupervised and semi-supervised learning algorithms. In addition to handling taxonomy and phylogenetic datasets the out-tree assumption performs surprisingly well as a semi-parametric density estimator on standard iid datasets. Experiments with unsupervised and semisupervised learning are shown on various UCI and taxonomy datasets.
Convex Point Estimation using Undirected Bayesian Transfer Hierarchies
Elidan, Gal, Packer, Ben, Heitz, Geremy, Koller, Daphne
When related learning tasks are naturally arranged in a hierarchy, an appealing approach for coping with scarcity of instances is that of transfer learning using a hierarchical Bayes framework. As fully Bayesian computations can be difficult and computationally demanding, it is often desirable to use posterior point estimates that facilitate (relatively) efficient prediction. However, the hierarchical Bayes framework does not always lend itself naturally to this maximum aposteriori goal. In this work we propose an undirected reformulation of hierarchical Bayes that relies on priors in the form of similarity measures. We introduce the notion of "degree of transfer" weights on components of these similarity measures, and show how they can be automatically learned within a joint probabilistic framework. Importantly, our reformulation results in a convex objective for many learning problems, thus facilitating optimal posterior point estimation using standard optimization techniques. In addition, we no longer require proper priors, allowing for flexible and straightforward specification of joint distributions over transfer hierarchies. We show that our framework is effective for learning models that are part of transfer hierarchies for two real-life tasks: object shape modeling using Gaussian density estimation and document classification.
Clique Matrices for Statistical Graph Decomposition and Parameterising Restricted Positive Definite Matrices
We introduce Clique Matrices as an alternative representation of undirected graphs, being a generalisation of the incidence matrix representation. Here we use clique matrices to decompose a graph into a set of possibly overlapping clusters, de ned as well-connected subsets of vertices. The decomposition is based on a statistical description which encourages clusters to be well connected and few in number. Inference is carried out using a variational approximation. Clique matrices also play a natural role in parameterising positive de nite matrices under zero constraints on elements of the matrix. We show that clique matrices can parameterise all positive de nite matrices restricted according to a decomposable graph and form a structured Factor Analysis approximation in the non-decomposable case.
Identifying reasoning patterns in games
Antos, Dimitrios, Pfeffer, Avi
We present an algorithm that identifies the reasoning patterns of agents in a game, by iteratively examining the graph structure of its Multi-Agent Influence Diagram (MAID) representation. If the decision of an agent participates in no reasoning patterns, then we can effectively ignore that decision for the purpose of calculating a Nash equilibrium for the game. In some cases, this can lead to exponential time savings in the process of equilibrium calculation. Moreover, our algorithm can be used to enumerate the reasoning patterns in a game, which can be useful for constructing more effective computerized agents interacting with humans.
Adaptive Inference on General Graphical Models
Acar, Umut A., Ihler, Alexander T., Mettu, Ramgopal, Sumer, Ozgur
Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example we may want to introduce new evidence to the model or perform updates to conditional dependencies. The goal of adaptive inference is to take advantage of what is preserved in the model and perform inference more rapidly than from scratch. In this paper, we describe techniques for adaptive inference on general graphs that support marginal computation and updates to the conditional probabilities and dependencies in logarithmic time. We give experimental results for an implementation of our algorithm, and demonstrate its potential performance benefit in the study of protein structure.
Soil Data Analysis Using Classification Techniques and Soil Attribute Prediction
Gholap, Jay, Ingole, Anurag, Gohil, Jayesh, Gargade, Shailesh, Attar, Vahida
Agricultural research has been profited by technical advances such as automation, data mining. Today, data mining is used in a vast areas and many off-the-shelf data mining system products and domain specific data mining application soft wares are available, but data mining in agricultural soil datasets is a relatively a young research field. The large amounts of data that are nowadays virtually harvested along with the crops have to be analyzed and should be used to their full extent. This research aims at analysis of soil dataset using data mining techniques. It focuses on classification of soil using various algorithms available. Another important purpose is to predict untested attributes using regression technique, and implementation of automated soil sample classification.
Oriented and Degree-generated Block Models: Generating and Inferring Communities with Inhomogeneous Degree Distributions
Zhu, Yaojia, Yan, Xiaoran, Moore, Cristopher
The stochastic block model is a powerful tool for inferring community structure from network topology. However, it predicts a Poisson degree distribution within each community, while most real-world networks have a heavy-tailed degree distribution. The degree-corrected block model can accommodate arbitrary degree distributions within communities. But since it takes the vertex degrees as parameters rather than generating them, it cannot use them to help it classify the vertices, and its natural generalization to directed graphs cannot even use the orientations of the edges. In this paper, we present variants of the block model with the best of both worlds: they can use vertex degrees and edge orientations in the classification process, while tolerating heavy-tailed degree distributions within communities. We show that for some networks, including synthetic networks and networks of word adjacencies in English text, these new block models achieve a higher accuracy than either standard or degree-corrected block models.
Finding Important Genes from High-Dimensional Data: An Appraisal of Statistical Tests and Machine-Learning Approaches
Wang, Chamont, Gevertz, Jana, Chen, Chaur-Chin, Auslender, Leonardo
Over the past decades, statisticians and machine-learning researchers have developed literally thousands of new tools for the reduction of high-dimensional data in order to identify the variables most responsible for a particular trait. These tools have applications in a plethora of settings, including data analysis in the fields of business, education, forensics, and biology (such as microarray, proteomics, brain imaging), to name a few. In the present work, we focus our investigation on the limitations and potential misuses of certain tools in the analysis of the benchmark colon cancer data (2,000 variables; Alon et al., 1999) and the prostate cancer data (6,033 variables; Efron, 2010, 2008). Our analysis demonstrates that models that produce 100% accuracy measures often select different sets of genes and cannot stand the scrutiny of parameter estimates and model stability. Furthermore, we created a host of simulation datasets and "artificial diseases" to evaluate the reliability of commonly used statistical and data mining tools. We found that certain widely used models can classify the data with 100% accuracy without using any of the variables responsible for the disease. With moderate sample size and suitable pre-screening, stochastic gradient boosting will be shown to be a superior model for gene selection and variable screening from high-dimensional datasets.
Solving Limited Memory Influence Diagrams
Maua, D. D., de Campos, C. P., Zaffalon, M.
We present a new algorithm for exactly solving decision making problems represented as influence diagrams. We do not require the usual assumptions of no forgetting and regularity; this allows us to solve problems with simultaneous decisions and limited information. The algorithm is empirically shown to outperform a state-of-the-art algorithm on randomly generated problems of up to 150 variables and 10^64 solutions. We show that these problems are NP-hard even if the underlying graph structure of the problem has low treewidth and the variables take on a bounded number of states, and that they admit no provably good approximation if variables can take on an arbitrary number of states.
Tutor Modeling Versus Student Modeling
Pardos, Zachary A. (Worcester Polytechnic Institute) | Heffernan, Neil T. (Worcester Polytechnic Institute)
The current paradigm in student modeling has continued to show the power of its simplifying assumption of knowledge as a binary and monotonically increasing construct, the value of which directly causes the outcome of student answers to questions. Recent efforts have focused on optimizing the prediction accuracy of responses to questions using student models. Incorporating individual student parameter interactions has been an interpretable and principled approach which has improved the performance of this task, as demonstrated by its application in the 2010 KDD Cup challenge on Educational Data. Performance prediction, however, can have limited practical utility. The greatest utility of such student models can be their ability to model the tutor and the attributes of the tutor which are causing learning. Harnessing the same simplifying assumption of learning used in student modeling, we can turn this model on its head to effectively tease out the tutor attributes causing learning and begin to optimize the tutor model to benefit the student model.