AITopics

1208.2278

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Technology:

Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Giguère, Sébastien, Marchand, Mario, Laviolette, François, Drouin, Alexandre, Corbeil, Jacques

Learning a peptide-protein binding affinity predictor with kernel ridge regression

arXiv.org Machine Learning, Jul-31-2012

We propose a specialized string kernel for small bio-molecules, peptides and pseudo-sequences of binding interfaces. The kernel incorporates physico-chemical properties of amino acids and elegantly generalizes eight kernels, such as the Oligo, the Weighted Degree, the Blended Spectrum, and the Radial Basis Function. We provide a low-complexity dynamic programming algorithm for the exact computation of the kernel and a linear-time algorithm for its approximation. Combined with kernel ridge regression and SupCK, a novel binding pocket kernel, the proposed kernel yields biologically relevant predictions with good accuracy on the PepX database. For the first time, a machine learning predictor is capable of accurately predicting the binding affinity of any peptide to any protein. The method was also applied to both single-target and pan-specific Major Histocompatibility Complex class II benchmark datasets and three Quantitative Structure Affinity Model benchmark datasets. On all benchmarks, our method significantly (p-value < 0.057) outperforms the current state-of-the-art methods at predicting peptide-protein binding affinities. The proposed approach is flexible and can be applied to predict any quantitative biological activity. The method should be of value to a large segment of the research community with the potential to accelerate peptide-based drug and vaccine development.
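To make the combination of a string kernel with kernel ridge regression concrete, here is a minimal sketch in pure Python. It is not the kernel proposed in the abstract: a plain k-spectrum kernel (shared k-mer counts) is swapped in as a stand-in, and the function names, toy peptides, and affinity values are all hypothetical.

```python
from collections import Counter

def spectrum_kernel(s, t, k=2):
    """k-spectrum kernel: inner product of k-mer count vectors.
    A simple stand-in, NOT the specialized kernel from the abstract."""
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return float(sum(cs[m] * ct[m] for m in cs if m in ct))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def krr_fit(peptides, affinities, lam=1.0, k=2):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    n = len(peptides)
    K = [[spectrum_kernel(peptides[i], peptides[j], k) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, affinities)

def krr_predict(x, peptides, alpha, k=2):
    """Prediction is a kernel-weighted sum over the training peptides."""
    return sum(a * spectrum_kernel(x, p, k) for a, p in zip(alpha, peptides))

# Hypothetical toy data, for illustration only.
peptides = ["ACDE", "ACDF", "WYWY"]
affinities = [1.0, 0.9, -1.0]
alpha = krr_fit(peptides, affinities, lam=1e-6)
print(krr_predict("ACDE", peptides, alpha))
```

The regularizer `lam` trades fit against smoothness; with a very small value, as in the toy call, the model nearly interpolates the training affinities.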

artificial intelligence, kernel, machine learning, (16 more...)

doi: 10.1186/1471-2105-14-82

1207.7253

Country: North America > Canada > Ontario (0.28)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.54)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.88)
Health & Medicine > Therapeutic Area > Vaccines (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Machine Learning, Jul-27-2012

High Dimensional Semiparametric Gaussian Copula Graphical Models

Liu, Han, Han, Fang, Yuan, Ming, Lafferty, John, Wasserman, Larry

In this paper, we propose a semiparametric approach, named nonparanormal skeptic, for efficiently and robustly estimating high dimensional undirected graphical models. To achieve modeling flexibility, we consider Gaussian Copula graphical models (or the nonparanormal) as proposed by Liu et al. (2009). To achieve estimation robustness, we exploit nonparametric rank-based correlation coefficient estimators, including Spearman's rho and Kendall's tau. In high dimensional settings, we prove that the nonparanormal skeptic achieves the optimal parametric rate of convergence in both graph and parameter estimation. This result suggests that the Gaussian copula graphical models can be used as a safe replacement of the popular Gaussian graphical models, even when the data are truly Gaussian. Besides theoretical analysis, we also conduct thorough numerical simulations to compare different estimators for their graph recovery performance under both ideal and noisy settings. The proposed methods are then applied to a large-scale genomic dataset to illustrate their empirical usefulness. The R language software package huge implementing the proposed methods is available on the Comprehensive R Archive Network: http://cran.r-project.org/.
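The rank-based step can be illustrated with a short pure-Python sketch. It assumes the usual Gaussian-copula identity that maps Kendall's tau to a latent correlation, sin(π/2 · τ); the abstract itself does not spell the formula out, and the function names and toy columns are hypothetical. The resulting matrix would then be passed to a sparse precision-matrix estimator, which is not shown here.

```python
import math

def kendall_tau(x, y):
    """Kendall's tau (tau-a): average sign agreement over all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * s / (n * (n - 1))

def skeptic_correlation(columns):
    """Rank-based correlation estimate: S[j][k] = sin(pi/2 * tau_jk).
    `columns` is a list of equally long variable columns."""
    d = len(columns)
    S = [[1.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(j + 1, d):
            tau = kendall_tau(columns[j], columns[k])
            S[j][k] = S[k][j] = math.sin(math.pi / 2.0 * tau)
    return S

# Hypothetical toy data: two variables, four observations.
cols = [[1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]]
S = skeptic_correlation(cols)

# Because only ranks matter, a monotone transform of a column
# (here exp) leaves the estimate unchanged.
S_transformed = skeptic_correlation([[math.exp(v) for v in cols[0]], cols[1]])
print(S[0][1], S_transformed[0][1])
```

Invariance to monotone marginal transforms is the property that lets the estimator ignore the unknown marginal distributions of a Gaussian copula model.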

artificial intelligence, estimator, machine learning, (12 more...)

1202.2169

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Bento, José, Fawaz, Nadia, Montanari, Andrea, Ioannidis, Stratis

Identifying Users From Their Rating Patterns

arXiv.org Machine Learning, Jul-26-2012

This paper reports on our analysis of the 2011 CAMRa Challenge dataset (Track 2) for context-aware movie recommendation systems. The training dataset comprises 4,536,891 ratings provided by 171,670 users on 23,974 movies, as well as the household groupings of a subset of the users. The test dataset comprises 5,450 ratings for which the user label is missing, but the household label is provided. The challenge required identifying the user labels for the ratings in the test set. Our main finding is that temporal information (time labels of the ratings) is significantly more useful for achieving this objective than the user preferences (the actual ratings). Using a model that leverages this fact, we are able to identify users within a known household with an accuracy of approximately 96% (i.e., a misclassification rate of around 4%).

artificial intelligence, household, machine learning, (17 more...)

1207.6379

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Cost-Sensitive Risk Stratification in the Diagnosis of Heart Disease

Uguroglu, Selen (Carnegie Mellon University) | Doyle, Mark (Allegheny General Hospital) | Biederman, Robert (Allegheny General Hospital) | Carbonell, Jaime (Carnegie Mellon University)

We investigate machine learning methods for diagnostic screening of heart disease. Coronary heart disease is the leading cause of death in the US, causing more deaths than all types of cancers combined. Early diagnosis of heart disease in women is harder than it is in men and typically requires the administration of several clinical tests on the patient. Most risk stratification methods aggregate the results of such tests, including the risky, invasive procedures that cannot be administered to all patients. In this paper, our goal is to identify patients who are at high risk of having heart disease and related adverse events, using a minimal number of diagnostic tests, especially less invasive ones. The low frequency of patients with severe heart disease in the dataset is challenging for most conventional machine learning methods. To overcome this problem, we develop and apply a cost-sensitive k nearest neighbor algorithm. Our contributions are twofold: First, we compare the predictive value of several diagnostic procedures for heart disease, including electrocardiography, angiography, and radionuclide perfusion, and conclude that in women's heart disease, certain combinations of non-invasive techniques are more predictive than some of the widely used invasive procedures. Then, we evaluate on held-out data and achieve an AUROC over 0.70, signifying valuable clinical utility, using only the least costly and least invasive tests.
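One common way to make k nearest neighbor classification cost-sensitive is to pick the label with the lower expected misclassification cost rather than the majority label. The sketch below illustrates that idea only; the abstract does not describe the authors' exact algorithm, and the function name, cost values, and toy data are hypothetical.

```python
import math

def cost_sensitive_knn(train_X, train_y, query, k=3, fn_cost=5.0, fp_cost=1.0):
    """Classify `query` as 1 (high risk) or 0 by minimum expected cost.

    fn_cost: assumed cost of missing a true positive (false negative).
    fp_cost: assumed cost of a false alarm (false positive).
    """
    # Indices of the k training points closest to the query (Euclidean).
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], query))[:k]
    # Local estimate of the positive-class probability.
    p_pos = sum(train_y[i] for i in nearest) / k
    cost_if_predict_neg = p_pos * fn_cost          # risk of a missed case
    cost_if_predict_pos = (1.0 - p_pos) * fp_cost  # risk of a false alarm
    return 1 if cost_if_predict_neg > cost_if_predict_pos else 0

# Hypothetical toy data: the positive class is rare near the query.
train_X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0]]
train_y = [0, 0, 1, 1]
query = [0.2, 0.2]

print(cost_sensitive_knn(train_X, train_y, query, fn_cost=5.0))
print(cost_sensitive_knn(train_X, train_y, query, fn_cost=1.0))
```

With equal costs the rule reduces to ordinary majority voting; raising `fn_cost` lets a minority of positive neighbors flip the decision, which is one way to counter class imbalance.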

diagnostic test, heart disease, procedure, (16 more...)

Twenty-Fourth IAAI Conference

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.89)

Jung, Hyun Joon (University of Texas at Austin) | Lease, Matthew (University of Texas at Austin)

Improving Quality of Crowdsourced Labels via Probabilistic Matrix Factorization

In crowdsourced relevance judging, each crowd worker typically judges only a small number of examples, yielding a sparse and imbalanced set of judgments in which relatively few workers influence output consensus labels, particularly with simple consensus methods like majority voting. We show how probabilistic matrix factorization (PMF), a standard approach in collaborative filtering, can be used to infer missing worker judgments such that all workers influence output labels. Given complete worker judgments inferred by PMF, we evaluate impact in unsupervised and supervised scenarios. In the supervised case, we consider both weighted voting and worker selection strategies based on worker accuracy. Experiments on a synthetic data set and a real turk data set with crowd judgments from the 2010 TREC Relevance Feedback Track show the promise of the PMF approach, which merits further investigation and analysis.
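The idea of completing a sparse worker-by-example judgment matrix before voting can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the factorization is fit by plain stochastic gradient descent with L2 regularization, and the hyperparameters, function names, and toy matrix are hypothetical.

```python
import random

def pmf_complete(R, rank=2, lr=0.05, reg=0.02, epochs=500, seed=0):
    """Fill missing entries (None) of a worker x example matrix R.

    Observed judgments are kept as-is; missing ones are replaced by
    the low-rank prediction U[i] . V[j].
    """
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(m)]
    observed = [(i, j) for i in range(n) for j in range(m)
                if R[i][j] is not None]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j in observed:
            pred = sum(U[i][f] * V[j][f] for f in range(rank))
            err = R[i][j] - pred
            for f in range(rank):
                u, v = U[i][f], V[j][f]
                U[i][f] += lr * (err * v - reg * u)
                V[j][f] += lr * (err * u - reg * v)
    return [[R[i][j] if R[i][j] is not None
             else sum(U[i][f] * V[j][f] for f in range(rank))
             for j in range(m)] for i in range(n)]

def consensus(R_full, threshold=0.5):
    """Unweighted vote: mean completed judgment per example, thresholded."""
    n = len(R_full)
    return [1 if sum(row[j] for row in R_full) / n >= threshold else 0
            for j in range(len(R_full[0]))]

# Hypothetical toy matrix: 3 workers, 3 examples, None = not judged.
R = [[1, None, 0],
     [1, 1, None],
     [None, 1, 0]]
R_full = pmf_complete(R)
print(consensus(R_full))
```

After completion every worker contributes a value for every example, so the vote is no longer dominated by the few workers who happened to judge a given item.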

judgment, pmf, voting, (15 more...)

Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence

Country: North America > United States > Texas > Travis County > Austin (0.04)

Genre:

Research Report > Experimental Study (0.47)
Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Karapinar, Sertac (Istanbul Technical University) | Altan, Dogan (Istanbul Technical University) | Sariel-Talay, Sanem (Istanbul Technical University)

A Robust Planning Framework for Cognitive Robots

A cognitive robot should construct a plan to attain its goals. While it executes the actions in its plan, it may face several failures due to both internal and external issues. We present a taxonomy to classify the failures that may be encountered during the execution of cognitive tasks. The taxonomy covers a wide range of failure types. To recover from most of the failures in this taxonomy, we propose a Robust Planning Framework for cognitive robots. Our framework integrates planning, reasoning and learning procedures for robust execution of cognitive tasks. Failures can be detected and handled by reasoning and replanning, respectively. The framework also facilitates learning new hypotheses incrementally based on experience. It can successfully detect and recover from temporary failures on a selected set of actions executed by a Pioneer3DX robot. Our preliminary results for hypothesis learning in failure scenarios are promising.

execution, hypothesis, robot, (12 more...)

Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
Asia > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

Random Projection with Filtering for Nearly Duplicate Search

Lin, Yue (Zhejiang University) | Jin, Rong (Michigan State University) | Cai, Deng (Zhejiang University) | He, Xiaofei (Zhejiang University)

High dimensional nearest neighbor search is a fundamental problem and has found applications in many domains. Although many hashing based approaches have been proposed for approximate nearest neighbor search in high dimensional space, one main drawback is that they often return many false positives that need to be filtered out by a post-processing procedure. We propose a novel method to address this limitation in this paper. The key idea is to introduce a filtering procedure within the search algorithm, based on compressed sensing theory, that effectively removes the false positive answers. We first obtain a sparse representation for each data point by the landmark based approach, after which we solve the nearly duplicate search problem, in which the difference between the query and its nearest neighbors forms a sparse vector living in a small ℓp ball, where p ≤ 1. Our empirical study on real-world datasets demonstrates the effectiveness of the proposed approach compared to the state-of-the-art hashing methods.

algorithm, representation, sparse representation, (13 more...)

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

Asia > Afghanistan > Parwan Province > Charikar (0.05)
North America > United States > Michigan > Ingham County > Lansing (0.04)
North America > United States > Michigan > Ingham County > East Lansing (0.04)
(2 more...)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)

Jang, Myungha (Pohang University of Science and Technology (POSTECH)) | Park, Jin-woo (Pohang University of Science and Technology (POSTECH)) | Hwang, Seung-won (Pohang University of Science and Technology (POSTECH))

Predictive Mining of Comparable Entities from the Web

Comparing entities is an important part of decision making. Several approaches have been reported for mining comparable entities from Web sources to improve user experience in comparing entities online. However, these efforts extract only entities explicitly compared in the corpora, and may exclude entities that occur less frequently but are potentially comparable. To build a more complete comparison machine that can infer such missing relations, here we develop a solution to predict transitivity of known comparable relations. Named CliqueGrow, our approach predicts missing links given a comparable entity graph obtained from versus query logs. Our approach achieved the highest F1-score among five link prediction approaches and a commercial comparison engine provided by Yahoo!.

algorithm, comparable entity, node, (15 more...)

Twenty-Sixth AAAI Conference on Artificial Intelligence

Country: Asia > South Korea > Gyeongsangbuk-do > Pohang (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Information Management > Search (0.91)
Information Technology > Communications (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

The Impact of Personalization on Smartphone-Based Activity Recognition

Weiss, Gary Mitchell (Fordham University) | Lockhart, Jeffrey (Fordham University)

Smartphones incorporate many diverse and powerful sensors, which creates exciting new opportunities for data mining and human-computer interaction. In this paper we show how standard classification algorithms can use labeled smartphone-based accelerometer data to identify the physical activity a user is performing. Our main focus is on evaluating the relative performance of impersonal and personal activity recognition models. Our impersonal (i.e., universal) models are built using training data from a panel of users and are then applied to new users, while our personal models are built with data from each user and then applied only to new data from that user. Our results indicate that the personal models perform dramatically better than the impersonal models, even when trained from only a few minutes' worth of data. These personal models typically even outperform hybrid models that utilize both personal and impersonal data. These results strongly argue for the construction of personal models whenever possible. Our research means that we can unobtrusively gain useful knowledge about the habits of potentially millions of users. It also means that we can facilitate human-computer interaction by enabling the smartphone to consider context, which can lead to new and more effective applications.

artificial intelligence, human computer interaction, machine learning, (12 more...)

Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (0.93)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)
Health & Medicine > Consumer Health (0.66)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
(2 more...)