AITopics | Statistical Learning

Statistical Learning

ProDiGe: PRioritization Of Disease Genes with multitask machine learning from positive and unlabeled examples

arXiv.org Machine Learning, Jun-1-2011

Elucidating the genetic basis of human diseases is a central goal of genetics and molecular biology. While traditional linkage analysis and modern high-throughput techniques often provide long lists of tens or hundreds of disease gene candidates, the identification of disease genes among the candidates remains time-consuming and expensive. Efficient computational methods are therefore needed to prioritize genes within the list of candidates by exploiting the wealth of information available about the genes in various databases. Here we propose ProDiGe, a novel algorithm for Prioritization of Disease Genes. ProDiGe implements a novel machine learning strategy based on learning from positive and unlabeled examples, which makes it possible to integrate various sources of information about the genes, to share information about known disease genes across diseases, and to perform genome-wide searches for new disease genes. Experiments on real data show that ProDiGe outperforms state-of-the-art methods for the prioritization of genes in human diseases.
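The idea of learning from positive and unlabeled examples can be illustrated with a generic bagging-style sketch. This is not ProDiGe itself: the nearest-centroid scorer, the function name `pu_bagging_scores`, and the toy two-dimensional "gene features" are all assumptions made for illustration. In each round a random subset of the unlabeled items is treated as provisional negatives, and the remaining unlabeled items are scored by how much closer they lie to the known positives than to that subset.

```python
import random

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pu_bagging_scores(positives, unlabeled, rounds=50, sample_size=None, seed=0):
    """Average out-of-bag score for each unlabeled item (higher = more positive-like).

    Hypothetical helper: the scorer is a toy nearest-centroid rule, chosen only
    to keep the sketch dependency-free.
    """
    rng = random.Random(seed)
    k = sample_size or len(positives)
    pos_c = centroid(positives)
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    for _ in range(rounds):
        # Treat a random subset of the unlabeled items as provisional negatives.
        chosen = set(rng.sample(range(len(unlabeled)), k))
        neg_c = centroid([unlabeled[i] for i in chosen])
        for i, x in enumerate(unlabeled):
            if i in chosen:
                continue  # score only items left out of this round's "negatives"
            totals[i] += sq_dist(x, neg_c) - sq_dist(x, pos_c)
            counts[i] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]

# Toy data: the first unlabeled item is a hidden positive.
positives = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1]]
unlabeled = [[1.1, 1.0], [-1.0, -1.0], [-1.2, -0.8], [-0.9, -1.1], [-1.1, -1.2]]
scores = pu_bagging_scores(positives, unlabeled)
ranking = sorted(range(len(unlabeled)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Sorting the unlabeled items by score gives a prioritized candidate list, which is the form of output a gene-prioritization method would produce.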

artificial intelligence, information, machine learning, (19 more...)

arXiv.org Machine Learning

1106.0134

Country: North America > United States (1.00)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Therapeutic Area > Genetic Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Learning Hierarchical Sparse Representations using Iterative Dictionary Learning and Dimension Reduction

Tarifi, Mohamad, Sitharam, Meera, Ho, Jeffery

arXiv.org Artificial Intelligence, Jun-1-2011

Working towards a Computational Theory of Intelligence, we develop a computational framework inspired by ideas from Neuroscience. Specifically, we integrate notions of columnar organization, hierarchical structure, sparse distributed representations, and sparse coding. An integrated view of Intelligence has been proposed by Karl Friston based on free-energy [13, 11, 8, 9, 10, 12]. In this framework, Intelligence is viewed as a surrogate minimization of the entropy of this sensorium. This work is intuitively inspired by this view, aiming to provide a computational foundation for a theory of intelligence from the perspective of theoretical computer science, thereby connecting to ideas in mathematics. By building foundations for a principled approach, the computational essence of problems can be isolated and formalized, their relationship to fundamental problems in mathematics and theoretical computer science can be illuminated, and the full power of available mathematical techniques can be brought to bear. A computational approach is focused on developing tractable algorithms.

artificial intelligence, machine learning, representation, (14 more...)

arXiv.org Artificial Intelligence

1106.0357

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

A Model of Inductive Bias Learning

Baxter, J.

arXiv.org Artificial Intelligence, Jun-1-2011

A major problem in machine learning is that of inductive bias: how to choose a learner's hypothesis space so that it is large enough to contain a solution to the problem being learnt, yet small enough to ensure reliable generalization from reasonably-sized training sets. Typically such bias is supplied by hand through the skill and insights of experts. In this paper a model for automatically learning bias is investigated. The central assumption of the model is that the learner is embedded within an environment of related learning tasks. Within such an environment the learner can sample from multiple tasks, and hence it can search for a hypothesis space that contains good solutions to many of the problems in the environment. Under certain restrictions on the set of all hypothesis spaces available to the learner, we show that a hypothesis space that performs well on a sufficiently large number of training tasks will also perform well when learning novel tasks in the same environment. Explicit bounds are also derived demonstrating that learning multiple tasks within an environment of related tasks can potentially give much better generalization than learning a single task.

artificial intelligence, learner, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.731

1106.0245

Country: North America > United States (0.92)

Genre: Research Report (0.81)

Industry:

Education (0.68)
Health & Medicine > Diagnostic Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Identifying Mislabeled Training Data

Brodley, C. E., Friedl, M. A.

arXiv.org Artificial Intelligence, Jun-1-2011

The goal of this approach is to improve classification accuracies produced by learning algorithms by improving the quality of the training data. Our approach uses a set of learning algorithms to create classifiers that serve as noise filters for the training data. We evaluate single algorithm, majority vote and consensus filters on five datasets that are prone to labeling errors. Our experiments illustrate that filtering significantly improves classification accuracy for noise levels up to 30%. An analytical and empirical evaluation of the precision of our approach shows that consensus filters are conservative at throwing away good data at the expense of retaining bad data and that majority filters are better at detecting bad data at the expense of throwing away good data. This suggests that for situations in which there is a paucity of data, consensus filters are preferable, whereas majority vote filters are preferable for situations with an abundance of data.

1. Introduction
The maximum accuracy achievable depends on the quality of the data and on the appropriateness of the chosen learning algorithm for the data. The work described here focuses on improving the quality of training data by identifying and eliminating mislabeled instances prior to applying the chosen learning algorithm, thereby increasing classification accuracy. Labeling error can occur for several reasons including subjectivity, data-entry error, or inadequacy of the information used to label each object. Subjectivity may arise when observations need to be ranked in some way such as disease severity or when the information used to label an object is different from the information to which the learning algorithm will have access. For example, when labeling pixels in image data, the analyst typically uses visual input rather than the numeric values of the feature vector corresponding to the observation. Domains in which experts disagree are natural places for subjective labeling errors (Smyth, 1996).
A third cause of labeling error arises when the information used to label each observation is inadequate. For example, in the medical domain it may not be possible to perform the tests necessary to guarantee that a diagnosis is 100% accurate. For domains in which labeling errors occur, an automated method of eliminating or correcting mislabeled observations will improve the predictive accuracy of the classifier formed from the training data. In this article we address the problem of identifying training instances that are mislabeled.
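The difference between the majority vote and consensus filters can be made concrete with a minimal sketch. It assumes that each filter classifier has already produced a held-out prediction for every training instance (for example through cross-validation); the function name `filter_mislabeled` and the toy labels are hypothetical. A consensus filter flags an instance only when every classifier disagrees with its label, while a majority filter flags it when more than half do.

```python
def filter_mislabeled(labels, predictions, rule="majority"):
    """Return indices of instances flagged as likely mislabeled.

    predictions[j][i] is classifier j's held-out prediction for instance i
    (an assumption of this sketch: the predictions are computed elsewhere).
    """
    n_classifiers = len(predictions)
    flagged = []
    for i, label in enumerate(labels):
        # Count how many filter classifiers disagree with the given label.
        errors = sum(1 for preds in predictions if preds[i] != label)
        if rule == "consensus":
            is_noise = errors == n_classifiers
        elif rule == "majority":
            is_noise = errors > n_classifiers / 2
        else:
            raise ValueError("rule must be 'majority' or 'consensus'")
        if is_noise:
            flagged.append(i)
    return flagged

labels = ["a", "a", "b", "b"]
predictions = [
    ["a", "b", "b", "a"],  # classifier 1
    ["a", "b", "b", "b"],  # classifier 2
    ["a", "b", "a", "a"],  # classifier 3
]
print(filter_mislabeled(labels, predictions, "majority"))   # [1, 3]
print(filter_mislabeled(labels, predictions, "consensus"))  # [1]
```

Instance 3 is rejected by two of the three classifiers, so only the majority filter removes it. This is the conservative behavior of the consensus filter described in the abstract: it discards less data, at the risk of keeping some mislabeled instances.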

artificial intelligence, classifier, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.606

1106.0219

Country: North America > United States > Massachusetts (0.46)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.68)

Industry:

Energy (0.68)
Education (0.67)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Issues in Stacked Generalization

Ting, K. M., Witten, I. H.

arXiv.org Artificial Intelligence, May-26-2011

Stacked generalization is a general method of using a high-level model to combine lower-level models to achieve greater predictive accuracy. In this paper we address two crucial issues which have been considered to be a 'black art' in classification tasks ever since the introduction of stacked generalization in 1992 by Wolpert: the type of generalizer that is suitable to derive the higher-level model, and the kind of attributes that should be used as its input. We find that best results are obtained when the higher-level model combines the confidence (and not just the predictions) of the lower-level ones. We demonstrate the effectiveness of stacked generalization for combining three different types of learning algorithms for classification tasks. We also compare the performance of stacked generalization with majority vote and published results of arcing and bagging.
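To show what "combining the confidence" of lower-level models means in practice, the following sketch builds the input data for the higher-level model from held-out class probabilities rather than hard predictions. The two lower-level learners (`fit_prior`, `fit_centroid`) are hypothetical toy models, the fold assignment by index is a simplification, and binary labels 0/1 are assumed; none of this is taken from the paper.

```python
import math

def fit_prior(X, y):
    """Toy level-0 learner: always predicts the training class frequencies."""
    p1 = sum(y) / len(y)
    return lambda x: [1.0 - p1, p1]

def fit_centroid(X, y):
    """Toy level-0 learner: confidence from distances to the two class means."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    m0 = mean([x for x, t in zip(X, y) if t == 0])
    m1 = mean([x for x, t in zip(X, y) if t == 1])
    def predict(x):
        d0, d1 = math.dist(x, m0), math.dist(x, m1)
        if d0 + d1 == 0:
            return [0.5, 0.5]
        return [d1 / (d0 + d1), d0 / (d0 + d1)]
    return predict

def level1_features(fit_fns, X, y, n_folds=3):
    """For each instance, concatenate held-out class probabilities of all models."""
    n = len(X)
    rows = [[] for _ in range(n)]
    for fit in fit_fns:
        for fold in range(n_folds):
            test_idx = [i for i in range(n) if i % n_folds == fold]
            train_idx = [i for i in range(n) if i % n_folds != fold]
            model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            for i in test_idx:
                # The model never saw instance i during training.
                rows[i].extend(model(X[i]))
    return rows

X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = [0, 0, 0, 1, 1, 1]
level1 = level1_features([fit_prior, fit_centroid], X, y)
for row, target in zip(level1, y):
    print([round(v, 3) for v in row], target)
```

Each row of `level1`, paired with the original label, is one training example for the higher-level model. Using held-out predictions keeps that model from simply learning to trust whichever lower-level model overfits the training data most.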

artificial intelligence, decision tree learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1613/jair.594

1105.5466

Country: North America > United States > California (0.46)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

b-Bit Minwise Hashing for Large-Scale Linear SVM

Li, Ping, Moore, Joshua, Konig, Christian

arXiv.org Machine Learning, May-22-2011

In this paper, we propose to (seamlessly) integrate b-bit minwise hashing with linear SVM to substantially improve the training (and testing) efficiency using much smaller memory, with essentially no loss of accuracy. Theoretically, we prove that the resemblance matrix, the minwise hashing matrix, and the b-bit minwise hashing matrix are all positive definite matrices (kernels). Interestingly, our proof for the positive definiteness of the b-bit minwise hashing kernel naturally suggests a simple strategy to integrate b-bit hashing with linear SVM. Our technique is particularly useful when the data cannot fit in memory, which is an increasingly critical issue in large-scale machine learning. Our preliminary experimental results on a publicly available webspam dataset (350K samples and 16 million dimensions) verified the effectiveness of our algorithm. For example, the training time was reduced to merely a few seconds. In addition, our technique can be easily extended to many other linear and nonlinear machine learning applications such as logistic regression.
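A rough sketch of how b-bit minwise hashing can feed a linear model is given below. The abstract does not spell out the integration strategy, so the one-hot expansion used here (each b-bit value becomes a block of 2^b binary features) is an assumption of this sketch, as are the function names and the use of random linear hash functions in place of true random permutations. Input sets are assumed to be non-empty sets of non-negative integer feature indices.

```python
import random

PRIME = (1 << 61) - 1  # a Mersenne prime used as the hash modulus

def make_hashes(k, seed=0):
    """Draw k random linear hash functions x -> (a * x + c) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]

def b_bit_minhash(items, hashes, b):
    """Keep only the lowest b bits of each of the k minimum hash values."""
    mask = (1 << b) - 1
    return [min((a * x + c) % PRIME for x in items) & mask for a, c in hashes]

def expand_one_hot(signature, b):
    """Turn k b-bit values into a binary vector of length k * 2**b."""
    width = 1 << b
    vec = [0] * (len(signature) * width)
    for j, value in enumerate(signature):
        vec[j * width + value] = 1
    return vec

hashes = make_hashes(k=16, seed=1)
signature = b_bit_minhash({3, 17, 42}, hashes, b=2)
features = expand_one_hot(signature, b=2)
print(len(features), sum(features))  # 64 16
```

With this expansion, the inner product of two feature vectors counts the positions where their b-bit signatures agree. Each example is stored in k * b bits regardless of the original dimensionality, which is where the memory saving comes from.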

artificial intelligence, machine learning, webspam, (16 more...)

arXiv.org Machine Learning

1105.4385

Country:

Europe (1.00)
North America > Canada (0.94)
North America > United States > California > Santa Clara County (0.46)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Behavior of Graph Laplacians on Manifolds with Boundary

Zhou, Xueyuan, Belkin, Mikhail

arXiv.org Machine Learning, May-19-2011

In manifold learning, algorithms based on graph Laplacians constructed from data have received considerable attention both in practical applications and theoretical analysis. In particular, the convergence of graph Laplacians obtained from sampled data to certain continuous operators has recently become an active research topic. Most of the existing work has been done under the assumption that the data is sampled from a manifold without boundary or that the functions of interest are evaluated at a point away from the boundary. However, the question of boundary behavior is of considerable practical and theoretical interest. In this paper we provide an analysis of the behavior of graph Laplacians at a point near or on the boundary, discuss their convergence rates and their implications, and provide some numerical results. It turns out that while points near the boundary occupy only a small part of the total volume of a manifold, the behavior of the graph Laplacian there has different scaling properties from its behavior elsewhere on the manifold, with global effects on the whole manifold, an observation with potentially important implications for the general problem of learning on manifolds.
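The contrast between interior and boundary behavior can be seen in a toy numerical experiment. This is not the paper's analysis: it assumes a uniform grid on the interval [0, 1] (a one-dimensional manifold with boundary), an unnormalized graph Laplacian with Gaussian weights, and the test function f(x) = x. The function name and the bandwidth value are choices made for this sketch.

```python
import math

def graph_laplacian_apply(points, f_values, bandwidth):
    """Return (L f)(x_i) = sum_j w_ij * (f(x_i) - f(x_j)) with Gaussian weights."""
    out = []
    for xi, fi in zip(points, f_values):
        total = 0.0
        for xj, fj in zip(points, f_values):
            w = math.exp(-((xi - xj) ** 2) / (2.0 * bandwidth ** 2))
            total += w * (fi - fj)
        out.append(total)
    return out

n = 100
points = [i / n for i in range(n + 1)]  # uniform grid on [0, 1]
f_values = list(points)                 # f(x) = x
lf = graph_laplacian_apply(points, f_values, bandwidth=0.05)
interior, boundary = lf[n // 2], lf[0]
print(interior, boundary)
```

At the midpoint the neighbors on the left and right cancel, so the result is zero up to rounding, as expected for a linear function. At the endpoint every neighbor lies on one side, the contributions no longer cancel, and the value is clearly nonzero, which hints at why the boundary needs separate treatment.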

boundary, graph laplacian, laplacian, (12 more...)

arXiv.org Machine Learning

1105.3931

Country:

North America > United States > Ohio (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

When Optimal Is Just Not Good Enough: Learning Fast Informative Action Cost Partitionings

Karpas, Erez (Technion) | Katz, Michael (Technion) | Markovitch, Shaul (Technion)

AAAI Conferences, May-18-2011

Several recent heuristics for domain-independent planning adopt some action cost partitioning scheme to derive admissible heuristic estimates. Given a state, two methods for obtaining an action cost partitioning have been proposed: optimal cost partitioning, which results in the best possible heuristic estimate for that state but requires a substantial computational effort, and ad-hoc (uniform) cost partitioning, which is much faster but is usually less informative. These two methods represent almost opposite points in the tradeoff between heuristic accuracy and heuristic computation time. One compromise that has been proposed between these two is using an optimal cost partitioning for the initial state to evaluate all states. In this paper, we propose a novel method for deriving a fast, informative cost-partitioning scheme that is based on computing optimal action cost partitionings for a small set of states and using these to derive heuristic estimates for all states. Our method provides greater control over the accuracy/computation-time tradeoff, which, as our empirical evaluation shows, can result in better performance.
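The uniform scheme mentioned above can be sketched for one common setting. The sketch assumes that the heuristic is built from disjunctive action landmarks (sets of actions of which any plan must use at least one); that setting, the function names, and the toy costs are assumptions for illustration and are not taken from the paper. Each action's cost is split equally among the landmarks containing it, so the per-landmark estimates can be summed without overcounting any action.

```python
def uniform_cost_partitioning(costs, landmarks):
    """Split each action's cost equally among the landmarks that contain it."""
    counts = {}
    for landmark in landmarks:
        for action in landmark:
            counts[action] = counts.get(action, 0) + 1
    return [{a: costs[a] / counts[a] for a in landmark} for landmark in landmarks]

def landmark_heuristic(costs, landmarks):
    """Sum, over landmarks, of the cheapest partitioned action cost in each."""
    partitions = uniform_cost_partitioning(costs, landmarks)
    return sum(min(part.values()) for part in partitions)

costs = {"a": 2, "b": 4, "c": 1}
landmarks = [{"a", "b"}, {"b", "c"}, {"a"}]
h = landmark_heuristic(costs, landmarks)
print(h)  # 3.0
```

In this toy case the cheapest action set that satisfies all three landmarks is {a, c} with cost 3, so the uniform estimate happens to be exact. In general a uniform split can waste cost on landmarks that do not need it, which is the loss of informativeness that an optimal partitioning avoids at higher computational cost.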

optimal cost, planning task, state space, (15 more...)

AAAI Conferences

Twenty-First International Conference on Automated Planning and Scheduling

Country:

Asia > Middle East > Israel (0.04)
Oceania > Australia > Queensland > Townsville (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.69)
Information Technology > Data Science (0.68)

Impact of Word Sense Disambiguation on Ordering Dictionary Definitions in Vocabulary Learning Tutors

Rosa, Kevin Dela (Carnegie Mellon University) | Eskenazi, Maxine (Carnegie Mellon University)

AAAI Conferences, May-18-2011

Past research has shown that dictionaries and glosses can be beneficial in computer assisted language learning, particularly in vocabulary learning. We propose that L2 vocabulary learners can benefit from the use of a dictionary whose definitions are sensitive to the provided reading context, and that advances in the natural language processing task of word sense disambiguation can be used to automatically order the definitions of such a dictionary. An in-vivo study was conducted with ESL students to investigate the effect that the order of definitions has on vocabulary learning using REAP, a computer based vocabulary tutor. Our results showed that students benefited from having the algorithmically determined best definitions listed at the top of the definition list. Furthermore, our results suggest that word sense disambiguation may currently be good enough for use in intelligent language tutoring environments.

classifier, dictionary definition, student, (11 more...)

AAAI Conferences

Twenty-Fourth International FLAIRS Conference

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.30)

Utility Driven Clustering

Raj, Prabakararaj Swapna (Indian Institute of Technology Madras) | Ravindran, Balaraman (Indian Institute of Technology Madras)

AAAI Conferences, May-18-2011

Data mining has primarily focused on statistical properties of data alone and not necessarily on what could be done with the patterns. There has been some work on measuring the usefulness of patterns in decision making, but not on using such measures to drive the mining process. We introduce a framework to mine clusters that support decision making. We use an extrinsic measure that evaluates patterns based on their utility in decision making. We validate our approach empirically on several test domains.

dataset, evaluation, utility function, (13 more...)

AAAI Conferences

Twenty-Fourth International FLAIRS Conference

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.32)
