Search
Analysis of a greedy active learning strategy
We abstract out the core search problem of active learning schemes, to better understand the extent to which adaptive labeling can improve sample complexity. We give various upper and lower bounds on the number of labels which need to be queried, and we prove that a popular greedy active learning rule is approximately as good as any other strategy for minimizing this number of labels.
Analysis of a greedy active learning strategy
We abstract out the core search problem of active learning schemes, to better understand the extent to which adaptive labeling can improve sample complexity. We give various upper and lower bounds on the number of labels which need to be queried, and we prove that a popular greedy active learning rule is approximately as good as any other strategy for minimizing this number of labels.
Identifying Protein-Protein Interaction Sites on a Genome-Wide Scale
Wang, Haidong, Segal, Eran, Ben-Hur, Asa, Koller, Daphne, Brutlag, Douglas L.
Protein interactions typically arise from a physical interaction of one or more small sites on the surface of the two proteins. Identifying these sites is very important for drug and protein design. In this paper, we propose a computational method based on probabilistic relational model that attempts to address this task using high-throughput protein interaction data and a set of short sequence motifs. We learn the model using the EM algorithm, with a branch-and-bound algorithm as an approximate inference for the E-step. Our method searches for motifs whose presence in a pair of interacting proteins can explain their observed interaction. It also tries to determine which motif pairs have high affinity, and can therefore lead to an interaction. We show that our method is more accurate than others at predicting new protein-protein interactions. More importantly, by examining solved structures of protein complexes, we find that 2/3 of the predicted active motifs correspond to actual interaction sites.
Optimal sub-graphical models
Narasimhan, Mukund, Bilmes, Jeff A.
We do this by first defining a decomposition tree representation for G, which is closely related to the junction-tree representation for G. We then give an algorithm which uses this representation to compute the optimal H H. Gavril [2] and Tarjan [3] have used graph separation properties to solve several combinatorial optimization problems when the size of the minimal separators in the graph is bounded. We present an extension of this technique which applies to some important choices of H even when the size of the minimal separators of G are arbitrarily large. In particular, this applies to problems such as finding an optimal subgraphical model over a (k 1)-tree of a graphical model over a k-tree (for arbitrary k) and selecting an optimal subgraphical model with (a constant) d fewer edges with respect to KL-divergence can be solved in time polynomial in V (G) using this formulation.
An Application of Boosting to Graph Classification
Kudo, Taku, Maeda, Eisaku, Matsumoto, Yuji
This paper presents an application of Boosting for classifying labeled graphs, general structures for modeling a number of real-world data, such as chemical compounds, natural language texts, and bio sequences. The proposal consists of i) decision stumps that use subgraph as features, and ii) a Boosting algorithm in which subgraph-based decision stumps are used as weak learners. We also discuss the relation between our algorithm and SVMs with convolution kernels. Two experiments using natural language data and chemical compounds show that our method achieves comparable or even better performance than SVMs with convolution kernels as well as improves the testing efficiency.
Analysis of a greedy active learning strategy
We abstract out the core search problem of active learning schemes, to better understand the extent to which adaptive labeling can improve sample complexity.We give various upper and lower bounds on the number of labels which need to be queried, and we prove that a popular greedy active learning rule is approximately as good as any other strategy for minimizing this number of labels.
Optimal sub-graphical models
Narasimhan, Mukund, Bilmes, Jeff A.
We do this by first defining a decomposition tree representation for G, which is closely related to the junction-tree representation for G. We then give an algorithm which uses this representation to compute the optimal H H. Gavril [2] and Tarjan [3] have used graph separation properties to solve several combinatorial optimization problems when the size of the minimal separators in the graph is bounded. We present an extension of this technique which applies to some important choices of H even when the size of the minimal separators of G are arbitrarily large. In particular, this applies to problems such as finding an optimal subgraphical model over a (k 1)-tree of a graphical model over a k-tree (for arbitrary k) and selecting an optimal subgraphical model with (a constant) d fewer edges with respect to KL-divergence can be solved in time polynomial in V (G) using this formulation.