AITopics | Statistical Learning

Statistical Learning

Segmentation and Nodal Points in Narrative: Study of Multiple Variations of a Ballad

arXiv.org Machine Learning, Jun-7-2010

The Lady Maisry ballads afford us a framework within which to segment a storyline into its major components. Segments and, as a consequence, nodal points are discussed for nine different variants of the Lady Maisry story of a (young) woman being burnt to death by her family on account of her becoming pregnant by a foreign personage. We motivate the importance of nodal points in textual and literary analysis. We also show how the openings of the nine variants, as well as the conclusions of the ballads, can be analyzed comparatively.

artificial intelligence, natural language, stanza, (18 more...)

arXiv.org Machine Learning

1006.1343

Country: North America > United States > New Mexico (0.14)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.31)

Rasch-based high-dimensionality data reduction and class prediction with applications to microarray gene expression data

Kastrin, Andrej, Peterlin, Borut

arXiv.org Artificial Intelligence, Jun-5-2010

Class prediction is an important application of microarray gene expression data analysis. The high dimensionality of microarray data, where the number of genes (variables) is very large compared to the number of samples (observations), makes the application of many prediction techniques (e.g., logistic regression, discriminant analysis) difficult. An efficient way to solve this problem is to use statistical dimension reduction techniques. Increasingly used in psychology-related applications, the Rasch model (RM) provides an appealing framework for handling high-dimensional microarray data. In this paper, we study the potential of RM-based modeling in dimensionality reduction with binarized microarray gene expression data and investigate its prediction accuracy in the context of class prediction using linear discriminant analysis. Two different publicly available microarray data sets are used to illustrate a general framework of the approach. Performance of the proposed method is assessed by a re-randomization scheme using principal component analysis (PCA) as a benchmark method. Our results show that RM-based dimension reduction is as effective as PCA-based dimension reduction. The method is general and can be applied to other high-dimensional data problems.
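
As a rough illustration of the two ingredients the abstract names (binarized data and the Rasch model), the sketch below shows the standard dichotomous Rasch response probability and one possible binarization rule. The median threshold, the function names, and the reading of `theta` as a row parameter and `b` as a column parameter are assumptions made for this example; the abstract does not say how the authors binarize the data or assign these roles.

```python
import math


def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(X = 1) = logistic(theta - b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def binarize_by_median(values):
    """Hypothetical binarization rule: 1 if strictly above the median, else 0."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2:
        median = ordered[mid]
    else:
        median = 0.5 * (ordered[mid - 1] + ordered[mid])
    return [1 if v > median else 0 for v in values]
```

When `theta` equals `b` the response probability is exactly 0.5, and it increases monotonically as `theta - b` grows.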

class prediction, health & medicine, oncology, (21 more...)

arXiv.org Artificial Intelligence

1006.103

Country:

Europe (0.68)
North America > United States (0.28)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.47)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Tree-Structured Stick Breaking Processes for Hierarchical Data

Adams, Ryan Prescott, Ghahramani, Zoubin, Jordan, Michael I.

arXiv.org Machine Learning, Jun-5-2010

Many data are naturally modeled by an unobserved hierarchical structure. In this paper we propose a flexible nonparametric prior over unknown data hierarchies. The approach uses nested stick-breaking processes to allow for trees of unbounded width and depth, where data can live at any node and are infinitely exchangeable. One can view our model as providing infinite mixtures where the components have a dependency structure corresponding to an evolutionary diffusion down a tree. By using a stick-breaking approach, we can apply Markov chain Monte Carlo methods based on slice sampling to perform Bayesian inference and simulate from the posterior distribution on trees. We apply our method to hierarchical clustering of images and topic modeling of text data.
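
For readers unfamiliar with stick breaking, the following minimal sketch draws weights from an ordinary (flat) truncated stick-breaking construction with Beta(1, alpha) fractions. It is not the nested, tree-structured process the paper proposes; the function name, the truncation level `num_sticks`, and the choice of Beta(1, alpha) are assumptions made for illustration.

```python
import random


def stick_breaking_weights(alpha, num_sticks, rng):
    """Break a unit-length stick num_sticks times.

    Each step removes a Beta(1, alpha) fraction of what is left.
    Returns the list of broken-off pieces and the unbroken remainder.
    """
    weights = []
    remaining = 1.0
    for _ in range(num_sticks):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining


# Example draw with a seeded generator for repeatability.
weights, leftover = stick_breaking_weights(2.0, 10, random.Random(0))
```

The pieces and the remainder always sum to one, so the weights can serve as mixture proportions once the remainder is negligibly small.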

artificial intelligence, bayesian inference, node, (18 more...)

arXiv.org Machine Learning

1006.1062

Country:

North America > United States (0.28)
North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Chi-square-based scoring function for categorization of MEDLINE citations

Kastrin, Andrej, Peterlin, Borut, Hristovski, Dimitar

arXiv.org Machine Learning, Jun-5-2010

Objectives: Text categorization has been used in biomedical informatics for identifying documents containing relevant topics of interest. We developed a simple method that uses a chi-square-based scoring function to determine the likelihood of MEDLINE citations containing a genetics-relevant topic. Methods: Our procedure requires construction of a genetic and a nongenetic domain document corpus. We used MeSH descriptors assigned to MEDLINE citations for this categorization task. We compared frequencies of MeSH descriptors between the two corpora by applying the chi-square test. A MeSH descriptor was considered to be a positive indicator if its relative observed frequency in the genetic domain corpus was greater than its relative observed frequency in the nongenetic domain corpus. The output of the proposed method is a list of scores for all the citations, with the highest score given to those citations containing MeSH descriptors typical for the genetic domain. Results: Validation was done on a set of 734 manually annotated MEDLINE citations. It achieved predictive accuracy of 0.87 with 0.69 recall and 0.64 precision. We evaluated the method by comparing it to three machine learning algorithms (support vector machines, decision trees, naïve Bayes). Although the differences were not statistically significant, results showed that our chi-square scoring performs as well as the compared machine learning algorithms. Conclusions: We suggest that the chi-square scoring is an effective solution to help categorize MEDLINE citations. The algorithm is implemented in the BITOLA literature-based discovery support system as a preprocessor for the gene symbol disambiguation process.
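
The Methods paragraph can be sketched roughly as follows. The Pearson chi-square statistic for a 2x2 table and the positive-indicator rule come from the abstract; signing the statistic and summing it over a citation's descriptors is an assumed aggregation, since the abstract does not give the exact scoring formula. All function names and the `counts` layout are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def descriptor_score(count_gen, total_gen, count_non, total_non):
    """Signed chi-square: positive if the descriptor is relatively more
    frequent in the genetic corpus, negative otherwise."""
    chi2 = chi_square_2x2(count_gen, total_gen - count_gen,
                          count_non, total_non - count_non)
    positive = count_gen / total_gen > count_non / total_non
    return chi2 if positive else -chi2


def citation_score(descriptors, counts, total_gen, total_non):
    """Assumed aggregation: sum the signed scores of a citation's descriptors.

    counts maps descriptor -> (count in genetic corpus, count in nongenetic corpus).
    Descriptors missing from counts contribute nothing.
    """
    return sum(descriptor_score(counts[d][0], total_gen, counts[d][1], total_non)
               for d in descriptors if d in counts)
```

Ranking citations by `citation_score` in descending order would then place those with the most genetics-typical descriptors first, in line with the output described in the abstract.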

algorithm, health & medicine, text processing, (19 more...)

arXiv.org Machine Learning

1006.1029

Country:

Europe > North Macedonia > Pelagonia Statistical Region > Bitola Municipality > Bitola (0.25)
North America > United States > Massachusetts (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)

Chromatic PAC-Bayes Bounds for Non-IID Data: Applications to Ranking and Stationary $\beta$-Mixing Processes

Ralaivola, Liva, Szafranski, Marie, Stempfel, Guillaume

arXiv.org Machine Learning, Jun-4-2010

PAC-Bayes bounds are among the most accurate generalization bounds for classifiers learned from independently and identically distributed (IID) data, and this is particularly so for margin classifiers: there have been recent contributions showing how practical these bounds can be either to perform model selection (Ambroladze et al., 2007) or even to directly guide the learning of linear classifiers (Germain et al., 2009). However, there are many practical situations where the training data show some dependencies and where the traditional IID assumption does not hold. Stating generalization bounds for such frameworks is therefore of the utmost interest, both from theoretical and practical standpoints. In this work, we propose the first (to the best of our knowledge) PAC-Bayes generalization bounds for classifiers trained on data exhibiting interdependencies. The approach undertaken to establish our results is based on the decomposition of a so-called dependency graph, which encodes the dependencies within the data, into sets of independent data by means of graph fractional covers. Our bounds are very general, since being able to find an upper bound on the fractional chromatic number of the dependency graph is sufficient to get new PAC-Bayes bounds for specific settings. We show how our results can be used to derive bounds for ranking statistics (such as AUC) and classifiers trained on data distributed according to a stationary $\beta$-mixing process. Along the way, we show how our approach seamlessly allows us to deal with U-processes. As a side note, we also provide a PAC-Bayes generalization bound for classifiers learned on data from stationary $\varphi$-mixing distributions.

artificial intelligence, bayesian inference, pac-baye, (16 more...)

arXiv.org Machine Learning

0909.1933

Country:

Europe > France (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Empirical learning aided by weak domain knowledge in the form of feature importance

Iqbal, Ridwan Al

arXiv.org Artificial Intelligence, Jun-3-2010

Standard hybrid learners that use domain knowledge require stronger knowledge that is hard and expensive to acquire. However, weaker domain knowledge can still provide the benefits of prior knowledge while being cost-effective. Weak knowledge in the form of feature relative importance (FRI) is presented and explained. Feature relative importance is a real-valued approximation of a feature's importance provided by experts. The advantage of using this knowledge is demonstrated by IANN, a modified multilayer neural network algorithm. IANN is a very simple modification of the standard neural network algorithm but attains significant performance gains. Experimental results in the field of molecular biology show higher performance than other empirical learning algorithms, including standard backpropagation and support vector machines. IANN performance is even comparable to that of KBANN, a theory refinement system that uses stronger domain knowledge. This shows that feature relative importance can significantly improve the performance of existing empirical learning algorithms with minimal effort.

inductive learning, knowledge, neural network, (19 more...)

arXiv.org Artificial Intelligence

1005.5556

Country: North America > United States > California (0.14)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

Information theoretic model validation for clustering

Buhmann, Joachim M.

arXiv.org Machine Learning, Jun-2-2010

Model selection in clustering requires (i) specifying a suitable clustering principle and (ii) controlling the model order complexity by choosing an appropriate number of clusters depending on the noise level in the data. We advocate an information theoretic perspective where the uncertainty in the measurements quantizes the set of data partitionings and, thereby, induces uncertainty in the solution space of clusterings. A clustering model that can tolerate a higher level of fluctuations in the measurements than alternative models is considered to be superior, provided that the clustering solution is equally informative. This tradeoff between informativeness and robustness is used as a model selection criterion. The requirement that data partitionings should generalize from one data set to an equally probable second data set gives rise to a new notion of structure-induced information.

approximation, artificial intelligence, machine learning, (15 more...)

arXiv.org Machine Learning

1006.0375

Country: North America > United States (0.69)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Structured Variable Selection with Sparsity-Inducing Norms

Jenatton, Rodolphe, Audibert, Jean-Yves, Bach, Francis

arXiv.org Machine Learning, May-31-2010

We consider the empirical risk minimization problem for linear supervised learning, with regularization by structured sparsity-inducing norms. These are defined as sums of Euclidean norms on certain subsets of variables, extending the usual $\ell_1$-norm and the group $\ell_1$-norm by allowing the subsets to overlap. This leads to a specific set of allowed nonzero patterns for the solutions of such problems. We first explore the relationship between the groups defining the norm and the resulting nonzero patterns, providing both forward and backward algorithms to go back and forth from groups to patterns. This allows the design of norms adapted to specific prior knowledge expressed in terms of nonzero patterns. We also present an efficient active set algorithm, and analyze the consistency of variable selection for least-squares linear regression in low and high-dimensional settings.
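
The norm described in the abstract, a sum of Euclidean norms over possibly overlapping subsets of variables, can be written in a few lines. This is a minimal unweighted sketch; the function name and the representation of groups as lists of indices are assumptions, and any per-group weighting the paper may use is omitted.

```python
import math


def structured_norm(w, groups):
    """Sum over groups of the Euclidean norm of w restricted to each group.

    groups is a list of index lists; the lists may overlap.
    """
    return sum(math.sqrt(sum(w[i] ** 2 for i in group)) for group in groups)


w = [3.0, 4.0, 0.0]
l1 = structured_norm(w, [[0], [1], [2]])        # singleton groups: the l1-norm
grouped = structured_norm(w, [[0, 1], [2]])     # disjoint groups: the group l1-norm
overlapping = structured_norm(w, [[0, 1], [1, 2]])  # variable 1 sits in both groups
```

With singleton groups the function reduces to the $\ell_1$-norm and with disjoint groups to the group $\ell_1$-norm, the two special cases the abstract says the construction extends.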

health & medicine, nonzero pattern, optimization problem, (17 more...)

arXiv.org Machine Learning

0904.3523

Country: Europe > France (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (0.45)
Health & Medicine > Diagnostic Medicine > Imaging (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Using a Kernel Adatron for Object Classification with RCS Data

Byl, Marten F., Demers, James T., Rietman, Edward A.

arXiv.org Machine Learning, May-28-2010

Rapid identification of objects from radar cross section (RCS) signals is important for many space and military applications. This identification is a pattern recognition problem for which either neural networks or support vector machines should prove to be high-speed. Bayesian networks would also provide value but require significant preprocessing of the signals. In this paper, we describe the use of a support vector machine for object identification from synthesized RCS data. Our best results are from data fusion of X-band and S-band signals, where we obtained 99.4%, 95.3%, 100% and 95.6% correct identification for cylinders, frusta, spheres, and polygons, respectively. We also compare our results with a Bayesian approach and show that the SVM is three orders of magnitude faster, as measured by the number of floating point operations.
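
The title refers to the Kernel Adatron, a simple iterative procedure for training a support vector machine. The sketch below is a generic textbook-style version without a bias term, run on toy two-dimensional points; it is not the authors' implementation, and the linear kernel, learning rate, epoch count, and data are all assumptions chosen for illustration rather than anything taken from the RCS experiments.

```python
def linear_kernel(u, v):
    """Plain dot product; any positive semidefinite kernel could be used instead."""
    return sum(a * b for a, b in zip(u, v))


def train_kernel_adatron(xs, ys, kernel, learning_rate=0.1, epochs=200):
    """Kernel Adatron updates: raise alpha_i while example i has margin below 1,
    lower it otherwise, and clip at zero. Labels ys must be +1 or -1."""
    n = len(xs)
    gram = [[kernel(xs[i], xs[j]) for j in range(n)] for i in range(n)]
    alphas = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            margin = ys[i] * sum(alphas[j] * ys[j] * gram[i][j] for j in range(n))
            alphas[i] = max(0.0, alphas[i] + learning_rate * (1.0 - margin))
    return alphas


def predict(x, xs, ys, alphas, kernel):
    """Sign of the kernel expansion (no bias term in this sketch)."""
    score = sum(a * y * kernel(xj, x) for a, y, xj in zip(alphas, ys, xs))
    return 1 if score >= 0 else -1


# Toy data, separable by a line through the origin.
train_x = [(2.0, 0.0), (3.0, 1.0), (-2.0, 0.0), (-3.0, -1.0)]
train_y = [1, 1, -1, -1]
alphas = train_kernel_adatron(train_x, train_y, linear_kernel)
```

The learning rate needs to stay small relative to the diagonal kernel values for the updates to settle; 0.1 is adequate for this toy data but is not a general recommendation.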

artificial intelligence, bayesian inference, vector, (18 more...)

arXiv.org Machine Learning

1005.5337

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Combining Naive Bayes and Decision Tree for Adaptive Intrusion Detection

Farid, Dewan Md., Harbi, Nouria, Rahman, Mohammad Zahidur

arXiv.org Artificial Intelligence, May-25-2010

In this paper, a new learning algorithm for adaptive network intrusion detection using a naive Bayesian classifier and a decision tree is presented. It performs balanced detection and keeps false positives at an acceptable level for different types of network attacks, and it eliminates redundant attributes as well as contradictory examples from training data that make the detection model complex. The proposed algorithm also addresses some difficulties of data mining, such as handling continuous attributes, dealing with missing attribute values, and reducing noise in training data. Due to the large volumes of security audit data as well as the complex and dynamic properties of intrusion behaviours, several data mining-based intrusion detection techniques have been applied to network-based traffic data and host-based data in recent decades. However, various issues in current intrusion detection systems (IDS) remain to be examined. We compared the performance of our proposed algorithm with existing learning algorithms on the KDD99 benchmark intrusion detection dataset. The experimental results show that the proposed algorithm achieved high detection rates (DR) and significantly reduced false positives (FP) for different types of network intrusions using limited computational resources.

dataset, law enforcement, public safety, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.5121/ijnsa.2010.2202

1005.4496

Country:

North America > United States > Georgia (0.14)
North America > United States > California (0.14)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
