AITopics | Country

Oracle inequalities for computationally adaptive model selection

Agarwal, Alekh, Bartlett, Peter L., Duchi, John C.

arXiv.org Machine Learning, Aug-1-2012

We analyze general model selection procedures using penalized empirical loss minimization under computational constraints. While classical model selection approaches do not consider computational aspects of performing model selection, we argue that any practical model selection procedure must trade off not only estimation and approximation error, but also the computational effort required to compute empirical minimizers for different function classes. We provide a framework for analyzing such problems, and we give algorithms for model selection under a computational budget. These algorithms satisfy oracle inequalities that show that the risk of the selected model is not much worse than if we had devoted all of our computational budget to the optimal function class.

artificial intelligence, inequality, machine learning, (17 more...)

arXiv.org Machine Learning

1208.0129

Country:

Oceania > Australia (0.28)
North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Fast Planar Correlation Clustering for Image Segmentation

Yarkony, Julian, Ihler, Alexander T., Fowlkes, Charless C.

arXiv.org Machine Learning, Aug-1-2012

We describe a new optimization scheme for finding high-quality correlation clusterings in planar graphs that uses weighted perfect matching as a subroutine. Our method provides lower bounds on the energy of the optimal correlation clustering that are typically fast to compute and tight in practice. We demonstrate our algorithm on the problem of image segmentation, where this approach outperforms existing global optimization techniques in minimizing the objective and is competitive with the state of the art in producing high-quality segmentations.

artificial intelligence, constraint, optimization problem, (19 more...)

arXiv.org Machine Learning

1208.0378

Country:

North America > United States > California (0.14)
South America > Brazil > Rio de Janeiro (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

The Distributed Ontology Language (DOL): Use Cases, Syntax, and Extensibility

Lange, Christoph, Mossakowski, Till, Kutz, Oliver, Galinski, Christian, Grüninger, Michael, Vale, Daniel Couto

arXiv.org Artificial Intelligence, Aug-1-2012

The Distributed Ontology Language (DOL) is currently being standardized within the OntoIOp (Ontology Integration and Interoperability) activity of ISO/TC 37/SC 3. It aims at providing a unified framework for (1) ontologies formalized in heterogeneous logics, (2) modular ontologies, (3) links between ontologies, and (4) annotation of ontologies. This paper presents the current state of DOL's standardization. It focuses on use cases where distributed ontologies enable interoperability and reusability. We demonstrate relevant features of the DOL syntax and semantics and explain how these integrate into existing knowledge engineering environments.

artificial intelligence, ontology, translation, (13 more...)

arXiv.org Artificial Intelligence

1208.0293

Country:

Europe > Germany > Bremen > Bremen (0.14)
North America > Canada > Ontario > Toronto (0.14)
North America > Canada > Alberta (0.14)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Learning a peptide-protein binding affinity predictor with kernel ridge regression

Giguère, Sébastien, Marchand, Mario, Laviolette, François, Drouin, Alexandre, Corbeil, Jacques

arXiv.org Machine Learning, Jul-31-2012

We propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. The kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, such as the Oligo, the Weighted Degree, the Blended Spectrum, and the Radial Basis Function. We provide a low-complexity dynamic programming algorithm for the exact computation of the kernel and a linear-time algorithm for its approximation. Combined with kernel ridge regression and SupCK, a novel binding pocket kernel, the proposed kernel yields biologically relevant and good prediction accuracy on the PepX database. For the first time, a machine learning predictor is capable of accurately predicting the binding affinity of any peptide to any protein. The method was also applied to both single-target and pan-specific Major Histocompatibility Complex class II benchmark datasets and three Quantitative Structure Affinity Model benchmark datasets. On all benchmarks, our method significantly (p-value < 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. The proposed approach is flexible and can be applied to predict any quantitative biological activity. The method should be of value to a large segment of the research community with the potential to accelerate peptide-based drug and vaccine development.

health & medicine, immunology, kernel, (20 more...)

arXiv.org Machine Learning

doi: 10.1186/1471-2105-14-82

1207.7253

Country: North America > Canada > Ontario > Toronto (0.14)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.54)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.88)
Health & Medicine > Therapeutic Area > Vaccines (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Decision Making for Symbolic Probability

Giang, Phan H., Sandilya, Sathyakama

arXiv.org Artificial Intelligence, Jul-31-2012

This paper proposes a decision theory for a symbolic generalization of probability theory (SP). Darwiche and Ginsberg [2,3] proposed SP to relax the requirement of using numbers for uncertainty while preserving desirable patterns of Bayesian reasoning. SP represents uncertainty by symbolic supports that are ordered partially rather than completely as in the case of standard probability. We show that a preference relation on acts that satisfies a number of intuitive postulates is represented by a utility function whose domain is a set of pairs of supports. We argue that a subjective interpretation is as useful and appropriate for SP as it is for numerical probability. It is useful because the subjective interpretation provides a basis for uncertainty elicitation. It is appropriate because we can provide a decision theory that explains how preference on acts is based on support comparison.

game theory, health & medicine, probability, (20 more...)

arXiv.org Artificial Intelligence

1207.4111

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Kansas > Douglas County > Lawrence (0.14)
North America > United States > California > San Mateo County (0.14)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)

PAC-Bayesian Inequalities for Martingales

Seldin, Yevgeny, Laviolette, François, Cesa-Bianchi, Nicolò, Shawe-Taylor, John, Auer, Peter

arXiv.org Machine Learning, Jul-30-2012

We present a set of high-probability inequalities that control the concentration of weighted averages of multiple (possibly uncountably many) simultaneously evolving and interdependent martingales. Our results extend the PAC-Bayesian analysis in learning theory from the i.i.d. setting to martingales, opening the way for its application to importance weighted sampling, reinforcement learning, and other interactive learning domains, as well as many other domains in probability theory and statistics where martingales are encountered. We also present a comparison inequality that bounds the expectation of a convex function of a martingale difference sequence shifted to the [0,1] interval by the expectation of the same function of independent Bernoulli variables. This inequality is applied to derive a tighter analog of the Hoeffding-Azuma inequality.
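For reference, the classical Hoeffding-Azuma inequality that the abstract mentions can be stated as follows. This is the standard textbook form, not the tighter analog derived in the paper; the symbols $X_i$, $c_i$, $n$ and $t$ are notation chosen here and do not come from the listing.

```latex
% Standard Hoeffding-Azuma inequality (textbook form, not the paper's result).
% Assumption: X_1, ..., X_n is a martingale difference sequence
% with |X_i| <= c_i almost surely for every i.
\[
  \Pr\!\left( \left| \sum_{i=1}^{n} X_i \right| \ge t \right)
  \;\le\;
  2 \exp\!\left( - \frac{t^{2}}{2 \sum_{i=1}^{n} c_i^{2}} \right)
  \qquad \text{for all } t > 0 .
\]
```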

artificial intelligence, inequality, machine learning, (14 more...)

arXiv.org Machine Learning

1110.6886

Country:

Europe > Austria > Styria (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > California (0.14)
(2 more...)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Universally Consistent Latent Position Estimation and Vertex Classification for Random Dot Product Graphs

Sussman, Daniel L., Tang, Minh, Priebe, Carey E.

arXiv.org Machine Learning, Jul-29-2012

In this work we show that, using the eigen-decomposition of the adjacency matrix, we can consistently estimate latent positions for random dot product graphs provided the latent positions are i.i.d. from some distribution. If class labels are observed for a number of vertices tending to infinity, then we show that the remaining vertices can be classified with error converging to Bayes optimal using the $k$-nearest-neighbors classification rule. We evaluate the proposed methods on simulated data and a graph derived from Wikipedia.
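The pipeline described in this abstract (embed vertices via the eigen-decomposition of the adjacency matrix, then classify with k-nearest neighbors) might be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-class latent positions, the graph size, the embedding dimension and `k=5` are all hypothetical choices, and the function names are made up for this example.

```python
import numpy as np

def adjacency_spectral_embedding(adj, dim):
    """Embed vertices using the top `dim` eigenpairs of a symmetric adjacency matrix."""
    eigvals, eigvecs = np.linalg.eigh(adj)           # eigenvalues in ascending order
    top = np.argsort(eigvals)[-dim:]                 # indices of the largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale                   # one row per vertex

def knn_predict(train_x, train_y, test_x, k):
    """Majority vote over the k nearest training points (binary labels 0/1, odd k)."""
    dists = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return (train_y[nearest].mean(axis=1) > 0.5).astype(int)

rng = np.random.default_rng(0)
n = 200
labels = np.repeat([0, 1], n // 2)

# Hypothetical latent positions: one 2-D point per class, so that edge
# probabilities are dot products (0.5 within a class, 0.14 across classes).
positions = np.where(labels[:, None] == 0, [0.7, 0.1], [0.1, 0.7])
probs = positions @ positions.T

# Sample an undirected graph without self-loops from those probabilities.
upper = np.triu(rng.random((n, n)) < probs, k=1)
adj = (upper | upper.T).astype(float)

embedding = adjacency_spectral_embedding(adj, dim=2)

# Observe labels for half of the vertices and classify the rest.
train = rng.permutation(n)[: n // 2]
test = np.setdiff1d(np.arange(n), train)
pred = knn_predict(embedding[train], labels[train], embedding[test], k=5)
accuracy = float((pred == labels[test]).mean())
```

Eigenvectors are only determined up to sign, which does not matter here because the classifier depends only on distances between embedded points.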

artificial intelligence, social media, vertex, (17 more...)

arXiv.org Machine Learning

1207.6745

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Communications > Social Media (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.30)

High Dimensional Semiparametric Gaussian Copula Graphical Models

Liu, Han, Han, Fang, Yuan, Ming, Lafferty, John, Wasserman, Larry

arXiv.org Machine Learning, Jul-27-2012

In this paper, we propose a semiparametric approach, named nonparanormal skeptic, for efficiently and robustly estimating high dimensional undirected graphical models. To achieve modeling flexibility, we consider Gaussian copula graphical models (or the nonparanormal) as proposed by Liu et al. (2009). To achieve estimation robustness, we exploit nonparametric rank-based correlation coefficient estimators, including Spearman's rho and Kendall's tau. In high dimensional settings, we prove that the nonparanormal skeptic achieves the optimal parametric rate of convergence in both graph and parameter estimation. This result suggests that the Gaussian copula graphical models can be used as a safe replacement for the popular Gaussian graphical models, even when the data are truly Gaussian. Besides theoretical analysis, we also conduct thorough numerical simulations to compare different estimators for their graph recovery performance under both ideal and noisy settings. The proposed methods are then applied to a large-scale genomic dataset to illustrate their empirical usefulness. The R package huge, which implements the proposed methods, is available on the Comprehensive R Archive Network: http://cran.r-project.org/.
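The rank-based step in this abstract can be illustrated with a short sketch: estimate Kendall's tau for each pair of variables and map it to a correlation with the sine transform sin(pi/2 * tau), which is the relation between Kendall's tau and the correlation parameter of a Gaussian copula. The code below is an illustration only and is not the `huge` package; the function names, the sample size and the true correlation of 0.6 are assumptions made for this example, and the subsequent graph-estimation step is not shown.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau for two 1-D samples without ties (O(n^2) pairwise version)."""
    n = len(x)
    # Sign of every pairwise difference: concordant pairs give +1, discordant -1.
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    return float((sx * sy).sum() / (n * (n - 1)))

def rank_based_correlation(data):
    """Correlation estimate sin(pi/2 * tau) for each pair of columns."""
    d = data.shape[1]
    corr = np.eye(d)
    for j in range(d):
        for k in range(j + 1, d):
            tau = kendall_tau(data[:, j], data[:, k])
            corr[j, k] = corr[k, j] = np.sin(0.5 * np.pi * tau)
    return corr

rng = np.random.default_rng(0)
true_corr = np.array([[1.0, 0.6], [0.6, 1.0]])   # hypothetical ground truth
gaussian = rng.multivariate_normal(np.zeros(2), true_corr, size=500)

# Strictly increasing marginal transforms leave the ranks, and therefore
# the rank-based estimate, unchanged.
warped = np.column_stack([np.exp(gaussian[:, 0]), gaussian[:, 1] ** 3])

est_gaussian = rank_based_correlation(gaussian)
est_warped = rank_based_correlation(warped)
```

The invariance of `est_warped` to the marginal transforms is the robustness property the abstract refers to: a Pearson correlation computed on `warped` would change, while the rank-based estimate does not.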

artificial intelligence, estimator, health & medicine, (12 more...)

arXiv.org Machine Learning

1202.2169

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Diversity in Ranking using Negative Reinforcement

Badrinath, Rama, Madhavan, C. E. Veni

arXiv.org Artificial Intelligence, Jul-27-2012

In this paper, we consider the problem of diversity in ranking of the nodes in a graph. The task is to pick the top-k nodes in the graph which are both 'central' and 'diverse'. Many graph-based NLP models, such as those for text summarization and opinion summarization, rely on the concept of diversity in generating summaries. We develop a novel method which works in an iterative fashion based on random walks to achieve diversity. Specifically, we use negative reinforcement as a main tool to introduce diversity in the Personalized PageRank framework. Experiments on two benchmark datasets show that our algorithm is competitive with the existing methods.
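As background for the framework named in this abstract, a plain Personalized PageRank ranking can be sketched with power iteration. This shows only the baseline random-walk scoring; the negative-reinforcement step that the paper uses to introduce diversity is not described in the abstract and is not reproduced here. The graph, the restart vector, the damping factor of 0.85 and the function name are hypothetical choices for illustration.

```python
import numpy as np

def personalized_pagerank(adj, restart, damping=0.85, iters=100):
    """Power iteration for PageRank with a custom restart distribution."""
    out_degree = adj.sum(axis=1, keepdims=True)
    # Row-stochastic transition matrix; assumes every node has an outgoing edge.
    transition = adj / out_degree
    scores = np.full(adj.shape[0], 1.0 / adj.shape[0])
    for _ in range(iters):
        scores = damping * (transition.T @ scores) + (1.0 - damping) * restart
    return scores

# Hypothetical 4-node undirected path graph 0-1-2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
restart = np.array([1.0, 0.0, 0.0, 0.0])   # every restart returns to node 0

scores = personalized_pagerank(adj, restart)
top_k = np.argsort(-scores)[:2]             # indices of the two highest-scoring nodes
```

Ranking by these scores alone rewards centrality near the restart node, so the top-k nodes tend to be neighbors of one another; that redundancy is the problem a diversity-aware method aims to address.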

artificial intelligence, information management, node, (19 more...)

arXiv.org Artificial Intelligence

1207.66

Country: Asia > India (0.29)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Identifying Users From Their Rating Patterns

Bento, José, Fawaz, Nadia, Montanari, Andrea, Ioannidis, Stratis

arXiv.org Machine Learning, Jul-26-2012

This paper reports on our analysis of the 2011 CAMRa Challenge dataset (Track 2) for context-aware movie recommendation systems. The train dataset comprises 4,536,891 ratings provided by 171,670 users on 23,974 movies, as well as the household groupings of a subset of the users. The test dataset comprises 5,450 ratings for which the user label is missing, but the household label is provided. The challenge was to identify the user labels for the ratings in the test set. Our main finding is that temporal information (time labels of the ratings) is significantly more useful for achieving this objective than the user preferences (the actual ratings). Using a model that leverages this fact, we are able to identify users within a known household with an accuracy of approximately 96% (i.e. misclassification rate around 4%).

artificial intelligence, household, machine learning, (17 more...)

arXiv.org Machine Learning

1207.6379

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
