Industry
On the geometric structure of fMRI searchlight-based information maps
Viswanathan, Shivakumar, Cieslak, Matthew, Grafton, Scott T.
Information mapping is a popular application of Multivoxel Pattern Analysis (MVPA) to fMRI. Information maps are constructed using the so called searchlight method, where the spherical multivoxel neighborhood of every voxel (i.e., a searchlight) in the brain is evaluated for the presence of task-relevant response patterns. One such challenge has to do with inferring the size and shape of a multivoxel pattern from its signature on the information map. To address this issue, we formally examined the geometric basis of this mapping relationship. Based on geometric considerations, we show how and why small patterns (i.e., having smaller spatial extents) can produce a larger signature on the information map as compared to large patterns, independent of the size of the searchlight radius. Furthermore, we show that the number of informative searchlights over the brain increase as a function of searchlight radius, even in the complete absence of any multivariate response patterns. These properties are unrelated to the statistical capabilities of the pattern-analysis algorithms used but are obligatory geometric properties arising from using the searchlight procedure.
Reasoning over Ontologies with Hidden Content: The Import-by-Query Approach
There is currently a growing interest in techniques for hiding parts of the signature of an ontology Kh that is being reused by another ontology Kv. Towards this goal, in this paper we propose the import-by-query framework, which makes the content of Kh accessible through a limited query interface. If Kv reuses the symbols from Kh in a certain restricted way, one can reason over Kv U Kh by accessing only Kv and the query interface. We map out the landscape of the import-by-query problem. In particular, we outline the limitations of our framework and prove that certain restrictions on the expressivity of Kh and the way in which Kv reuses symbols from Kh are strictly necessary to enable reasoning in our setting. We also identify cases in which reasoning is possible and we present suitable import-by-query reasoning algorithms.
Supervised Learning with Similarity Functions
Kar, Purushottam, Jain, Prateek
We address the problem of general supervised learning when data can only be accessed through an (indefinite) similarity function between data points. Existing work on learning with indefinite kernels has concentrated solely on binary/multi-class classification problems. We propose a model that is generic enough to handle any supervised learning task and also subsumes the model previously proposed for classification. We give a "goodness" criterion for similarity functions w.r.t. a given supervised learning task and then adapt a well-known landmarking technique to provide efficient algorithms for supervised learning using "good" similarity functions. We demonstrate the effectiveness of our model on three important super-vised learning problems: a) real-valued regression, b) ordinal regression and c) ranking where we show that our method guarantees bounded generalization error. Furthermore, for the case of real-valued regression, we give a natural goodness definition that, when used in conjunction with a recent result in sparse vector recovery, guarantees a sparse predictor with bounded generalization error. Finally, we report results of our learning algorithms on regression and ordinal regression tasks using non-PSD similarity functions and demonstrate the effectiveness of our algorithms, especially that of the sparse landmark selection algorithm that achieves significantly higher accuracies than the baseline methods while offering reduced computational costs.
A Generalized Mean Field Algorithm for Variational Inference in Exponential Families
Xing, Eric P., Jordan, Michael I., Russell, Stuart
The mean field methods, which entail approximating intractable probability distributions variationally with distributions from a tractable family, enjoy high efficiency, guaranteed convergence, and provide lower bounds on the true likelihood. But due to requirement for model-specific derivation of the optimization equations and unclear inference quality in various models, it is not widely used as a generic approximate inference algorithm. In this paper, we discuss a generalized mean field theory on variational approximation to a broad class of intractable distributions using a rich set of tractable distributions via constrained optimization over distribution spaces. We present a class of generalized mean field (GMF) algorithms for approximate inference in complex exponential family models, which entails limiting the optimization over the class of cluster-factorizable distributions. GMF is a generic method requiring no model-specific derivations. It factors a complex model into a set of disjoint variable clusters, and uses a set of canonical fix-point equations to iteratively update the cluster distributions, and converge to locally optimal cluster marginals that preserve the original dependency structure within each cluster, hence, fully decomposed the overall inference problem. We empirically analyzed the effect of different tractable family (clusters of different granularity) on inference quality, and compared GMF with BP on several canonical models. Possible extension to higher-order MF approximation is also discussed.
Coalition Structure Generation over Graphs
Voice, T., Polukarov, M., Jennings, N. R.
We give the analysis of the computational complexity of coalition structure generation over graphs. Given an undirected graph G = (N,E) and a valuation function v : P(N) โ R over the subsets of nodes, the problem is to find a partition of N into connected subsets, that maximises the sum of the components values. This problem is generally NPcomplete; in particular, it is hard for a defined class of valuation functions which are independent of disconnected membersthat is, two nodes have no effect on each others marginal con- tribution to their vertex separator. Nonetheless, for all such functions we provide bounds on the complexity of coalition structure generation over general and minorfree graphs. Our proof is constructive and yields algorithms for solving corresponding instances of the problem. Furthermore, we derive linear time bounds for graphs of bounded treewidth. However, as we show, the problem remains NPcomplete for planar graphs, and hence, for any K_k minorfree graphs where k โฅ 5. Moreover, a 3-SAT problem with m clauses can be represented by a coalition structure generation problem over a planar graph with O(m^2) nodes. Importantly, our hardness result holds for a particular subclass of valuation functions, termed edge sum, where the value of each subset of nodes is simply determined by the sum of given weights of the edges in the induced subgraph.
A Distance-Based Branch and Bound Feature Selection Algorithm
Frank, Ari, Geiger, Dan, Yakhini, Zohar
There is no known efficient method for selecting k Gaussian features from n which achieve the lowest Bayesian classification error. We show an example of how greedy algorithms faced with this task are led to give results that are not optimal. This motivates us to propose a more robust approach. We present a Branch and Bound algorithm for finding a subset of k independent Gaussian features which minimizes the naive Bayesian classification error. Our algorithm uses additive monotonic distance measures to produce bounds for the Bayesian classification error in order to exclude many feature subsets from evaluation, while still returning an optimal solution. We test our method on synthetic data as well as data obtained from gene expression profiling.
Locally Weighted Naive Bayes
Frank, Eibe, Hall, Mark, Pfahringer, Bernhard
Despite its simplicity, the naive Bayes classifier has surprised machine learning researchers by exhibiting good performance on a variety of learning problems. Encouraged by these results, researchers have looked to overcome naive Bayes primary weakness - attribute independence - and improve the performance of the algorithm. This paper presents a locally weighted version of naive Bayes that relaxes the independence assumption by learning local models at prediction time. Experimental results show that locally weighted naive Bayes rarely degrades accuracy compared to standard naive Bayes and, in many cases, improves accuracy dramatically. The main advantage of this method compared to other techniques for enhancing naive Bayes is its conceptual and computational simplicity.
Learning Module Networks
Segal, Eran, Pe'er, Dana, Regev, Aviv, Koller, Daphne, Friedman, Nir
Methods for learning Bayesian network structure can discover dependency structure between observed variables, and have been shown to be useful in many applications. However, in domains that involve a large number of variables, the space of possible network structures is enormous, making it difficult, for both computational and statistical reasons, to identify a good model. In this paper, we consider a solution to this problem, suitable for domains where many variables have similar behavior. Our method is based on a new class of models, which we call module networks. A module network explicitly represents the notion of a module - a set of variables that have the same parents in the network and share the same conditional probability distribution. We define the semantics of module networks, and describe an algorithm that learns a module network from data. The algorithm learns both the partitioning of the variables into modules and the dependency structure between the variables. We evaluate our algorithm on synthetic data, and on real data in the domains of gene expression and the stock market. Our results show that module networks generalize better than Bayesian networks, and that the learned module network structure reveals regularities that are obscured in learned Bayesian networks.
Learning Continuous Time Bayesian Networks
Nodelman, Uri, Shelton, Christian R., Koller, Daphne
Continuous time Bayesian networks (CTBNs) describe structured stochastic processes with finitely many states that evolve over continuous time. A CTBN is a directed (possibly cyclic) dependency graph over a set of variables, each of which represents a finite state continuous time Markov process whose transition model is a function of its parents. We address the problem of learning parameters and structure of a CTBN from fully observed data. We define a conjugate prior for CTBNs, and show how it can be used both for Bayesian parameter estimation and as the basis of a Bayesian score for structure learning. Because acyclicity is not a constraint in CTBNs, we can show that the structure learning problem is significantly easier, both in theory and in practice, than structure learning for dynamic Bayesian networks (DBNs). Furthermore, as CTBNs can tailor the parameters and dependency structure to the different time granularities of the evolution of different variables, they can provide a better fit to continuous-time processes than DBNs with a fixed time granularity.
Budgeted Learning of Naive-Bayes Classifiers
Lizotte, Daniel J., Madani, Omid, Greiner, Russell
There is almost always a cost associated with acquiring training data. We consider the situation where the learner, with a fixed budget, may'purchase' data during training. In particular, we examine the case where observing the value of a feature of a training example has an associated cost, and the total cost of all feature values acquired during training must remain less than this fixed budget. This paper compares methods for sequentially choosing which feature value to purchase next, given the budget and user's current knowledge of Na'ive Bayes model parameters. Whereas active learning has traditionally focused on myopic (greedy) approaches and uniform/round-robin policies for query selection, this paper shows that such methods are often suboptimal and presents a tractable method for incorporating knowledge of the budget in the information acquisition process.