AITopics | Performance Analysis

Penalized Likelihood Methods for Estimation of Sparse High Dimensional Directed Acyclic Graphs

arXiv.org Machine Learning, Nov-28-2009

Directed acyclic graphs (DAGs) are commonly used to represent causal relationships among random variables in graphical models. Applications of these models arise in the study of physical as well as biological systems, where directed edges between nodes represent the influence of components of the system on each other. The general problem of estimating DAGs from observed data is NP-hard; moreover, two directed graphs may be observationally equivalent. When the nodes exhibit a natural ordering, the problem of estimating directed graphs reduces to the problem of estimating the structure of the network. In this paper, we propose a penalized likelihood approach that directly estimates the adjacency matrix of DAGs. Both lasso and adaptive lasso penalties are considered, and an efficient algorithm is proposed for estimation of high-dimensional DAGs. We study variable selection consistency of the two penalties when the number of variables grows to infinity with the sample size. We show that although lasso can only consistently estimate the true network under stringent assumptions, adaptive lasso achieves this task under mild regularity conditions. The performance of the proposed methods is compared to alternative methods on simulated as well as real data examples.
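As a rough illustration of the idea in this abstract, the sketch below assumes the variables are already sorted so that parents precede children, and fits one lasso regression of each node on its predecessors; nonzero coefficients are read as edges. The function names (`soft_threshold`, `lasso_cd`, `estimate_dag`), the plain coordinate-descent solver, and the toy data are assumptions made for this example, not the authors' algorithm or code.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by coordinate descent.

    X is a list of rows (lists of floats), y a list of responses.
    """
    n = len(y)
    p = len(X[0]) if X else 0
    b = [0.0] * p
    # Mean squared value of each column (the coordinate-wise curvature).
    col_sq = [sum(X[i][k] ** 2 for i in range(n)) / n for k in range(p)]
    for _ in range(n_iter):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue
            # Correlation of column k with the residual that ignores b[k].
            rho = 0.0
            for i in range(n):
                pred_wo_k = sum(X[i][m] * b[m] for m in range(p) if m != k)
                rho += X[i][k] * (y[i] - pred_wo_k)
            rho /= n
            b[k] = soft_threshold(rho, lam) / col_sq[k]
    return b


def estimate_dag(data, lam):
    """Estimate a weighted adjacency matrix from ordered samples.

    data: list of samples, each ordered so that parents precede children.
    Returns A where A[j][k] != 0 is read as an edge k -> j (k < j).
    """
    p = len(data[0])
    A = [[0.0] * p for _ in range(p)]
    for j in range(1, p):
        X = [row[:j] for row in data]   # predecessors of node j
        y = [row[j] for row in data]    # node j itself
        coef = lasso_cd(X, y, lam)
        for k in range(j):
            A[j][k] = coef[k]
    return A


# Toy data (hypothetical): x1 depends on x0, x2 depends on neither.
x0 = [-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0]
e = [0.1, 0.1, 0.1, 0.1, -0.1, -0.1, -0.1, -0.1]
x2 = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
samples = [[a, 0.8 * a + n_, c] for a, n_, c in zip(x0, e, x2)]
adjacency = estimate_dag(samples, lam=0.1)
```

An adaptive lasso variant would reweight the penalty per coefficient using an initial estimate; that step is omitted here to keep the sketch short.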

artificial intelligence, dag, machine learning, (18 more...)

0911.5439

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Cross-lingual Annotation Projection for Semantic Roles

Pado, S., Lapata, M.

Journal of Artificial Intelligence Research, Nov-17-2009

This article considers the task of automatically inducing role-semantic annotations in the FrameNet paradigm for new languages. We propose a general framework that is based on annotation projection, phrased as a graph optimization problem. It is relatively inexpensive and has the potential to reduce the human effort involved in creating role-semantic resources. Within this framework, we present projection models that exploit lexical and syntactic information. We provide an experimental evaluation on an English-German parallel corpus which demonstrates the feasibility of inducing high-precision German semantic role annotation both for manually and automatically annotated English data.

alignment, proceedings, projection, (14 more...)

doi: 10.1613/jair.2863

AI Access Foundation

10629

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
(41 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Feature-Weighted Linear Stacking

Sill, Joseph, Takacs, Gabor, Mackey, Lester, Lin, David

arXiv.org Artificial Intelligence, Nov-4-2009

Ensemble methods, such as stacking, are designed to boost predictive accuracy by blending the predictions of multiple machine learning models. Recent work has shown that the use of meta-features, additional inputs describing each example in a dataset, can boost the performance of ensemble methods, but the greatest reported gains have come from nonlinear procedures requiring significant tuning and training time. Here, we present a linear technique, Feature-Weighted Linear Stacking (FWLS), that incorporates meta-features for improved accuracy while retaining the well-known virtues of linear regression regarding speed, stability, and interpretability. FWLS combines model predictions linearly using coefficients that are themselves linear functions of meta-features. This technique was a key facet of the solution of the second place team in the recently concluded Netflix Prize competition. Significant increases in accuracy over standard linear stacking are demonstrated on the Netflix Prize collaborative filtering dataset.
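To make the blending rule in this abstract concrete, here is a minimal sketch in which each model's weight is a linear function of the meta-features. The names `fwls_features` and `fwls_blend`, the coefficient matrix `v`, and all numbers are illustrative assumptions; using a constant meta-feature of 1 is likewise an assumption of this sketch, chosen so that standard linear stacking appears as a special case.

```python
def fwls_features(preds, meta):
    """Return all products g_i(x) * f_j(x).

    preds: predictions g_i(x) of the individual models for one example.
    meta:  meta-feature values f_j(x) for the same example.
    These products are the regressors a linear fit of v would use.
    """
    return [g * f for g in preds for f in meta]


def fwls_blend(preds, meta, v):
    """Blend predictions with weights w_i(x) = sum_j v[i][j] * f_j(x)."""
    total = 0.0
    for i, g in enumerate(preds):
        w_i = sum(v[i][j] * f for j, f in enumerate(meta))
        total += w_i * g
    return total


# Hypothetical example: two models, two meta-features (the first is a constant 1).
preds = [3.0, 4.0]
meta = [1.0, 0.5]
v = [[0.6, 0.2],
     [0.4, -0.2]]
blended = fwls_blend(preds, meta, v)
```

Because the blend is linear in the entries of `v`, the coefficients can be estimated by ordinary or regularised linear regression on the output of `fwls_features`, which is what keeps the technique as cheap as plain linear stacking.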

artificial intelligence, machine learning, movie, (17 more...)

0911.0460

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > Hungary > Győr-Moson-Sopron County > Győr (0.04)

Genre:

Research Report (0.82)
Contests & Prizes (0.67)
Personal > Honors (0.46)

Industry: Media > Film (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Algorithms for Image Analysis and Combination of Pattern Classifiers with Application to Medical Diagnosis

Georgiou, Harris

arXiv.org Artificial Intelligence, Oct-17-2009

Medical informatics and the application of modern signal processing to assist the diagnostic process in medical imaging form one of the more recent and active research areas today. This thesis addresses a variety of issues related to the general problem of medical image analysis, specifically in mammography, and presents a series of algorithms and design approaches for all the intermediate levels of a modern system for computer-aided diagnosis (CAD). The diagnostic problem is analyzed with a systematic approach, first defining the imaging characteristics and features that are relevant to probable pathology in mammograms. Next, these features are quantified and fused into new, integrated radiological systems that exhibit embedded digital signal processing, in order to improve the final result and minimize the radiological dose for the patient. At a higher level, special algorithms are designed for detecting and encoding these clinically interesting imaging features, in order to be used as input to advanced pattern classifiers and machine learning models. Finally, these approaches are extended to multi-classifier models under the scope of Game Theory and optimum collective decision, in order to produce efficient solutions for combining classifiers with minimum computational costs for advanced diagnostic systems. The material covered in this thesis is related to a total of 18 published papers, 6 in scientific journals and 12 in international conferences.

artificial intelligence, classifier, machine learning, (18 more...)

0910.3348

Country:

Europe > Netherlands > South Holland > Delft (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Norway (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.32)

Mean-Field Theory of Meta-Learning

Plewczynski, Dariusz

arXiv.org Machine Learning, Oct-9-2009

We discuss here the mean-field theory for a cellular automata model of meta-learning. Meta-learning is the process of combining the outcomes of individual learning procedures in order to determine the final decision with higher accuracy than any single learning method. Our method is constructed from an ensemble of interacting learning agents that acquire and process incoming information using various types, or different versions, of machine learning algorithms. The abstract learning space, where all agents are located, is constructed here using a fully connected model that couples all agents with random strength values. The cellular automata network simulates the higher-level integration of information acquired from the independent learning trials. The final classification of incoming input data is therefore defined as the stationary state of the meta-learning system using a simple majority rule, yet minority clusters that share the opposite classification outcome can be observed in the system. Therefore, the probability of selecting the proper class for given input data can be estimated even without prior knowledge of its affiliation. Fuzzy logic can easily be introduced into the system, even if the learning agents are built from simple binary classification machine learning algorithms, by calculating the percentage of agreeing agents.
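The last two ideas in this abstract, a simple majority rule and a graded output given by the share of agreeing agents, can be sketched in a few lines. This covers only the final voting step, not the interacting cellular automata dynamics; the function name `majority_vote` and the example votes are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return (winning label, fraction of agents that agree with it).

    labels: one predicted class per agent. On a tie, the label that
    appears first among the tied ones is returned.
    """
    if not labels:
        raise ValueError("need at least one agent prediction")
    counts = Counter(labels)
    winner, n_agree = counts.most_common(1)[0]
    return winner, n_agree / len(labels)


# Five hypothetical binary agents voting on one input.
decision, agreement = majority_vote([1, 1, 0, 1, 0])
```

The agreement fraction plays the role of a confidence estimate for the chosen class, which is available without knowing the true label.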

agent, artificial intelligence, machine learning, (16 more...)

doi: 10.1088/1742-5468/2009/11/P11003

0907.4643

Country:

Europe > Poland > Masovia Province > Warsaw (0.04)
North America > United States (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > France (0.04)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.94)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

SparseCodePicking: feature extraction in mass spectrometry using sparse coding algorithms

Alexandrov, Theodore, Steinhorst, Klaus, Keszoecze, Oliver, Schiffler, Stefan

arXiv.org Machine Learning, Oct-5-2009

Mass spectrometry (MS) is an important technique for chemical profiling which calculates for a sample a high-dimensional histogram-like spectrum. A crucial step of MS data processing is peak picking, which selects peaks containing information about molecules with high concentrations that are of interest in an MS investigation. We present a new peak picking procedure based on a sparse coding algorithm. Given a set of spectra of different classes, i.e. with different positions and heights of the peaks, this procedure can extract peaks by means of unsupervised learning. Instead of the $l_1$-regularization penalty term used in the original sparse coding algorithm, we propose using an elastic-net penalty term for better regularization. The evaluation is done by means of simulation. We show that for a large region of parameters the proposed peak picking method based on the sparse coding features outperforms a mean spectrum-based method. Moreover, we demonstrate the procedure by applying it to two real-life datasets.

artificial intelligence, data mining, machine learning, (16 more...)

0907.3426

Country: Europe > Germany > Bremen > Bremen (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology:

Information Technology > Data Science > Data Mining > Feature Extraction (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.32)

Initialization Free Graph Based Clustering

Galluccio, Laurent, Michel, Olivier J. J., Comon, Pierre, Slezak, Eric, Hero, Alfred O.

arXiv.org Machine Learning, Sep-24-2009

This paper proposes an original approach to cluster multi-component data sets, including an estimation of the number of clusters. From the construction of a minimal spanning tree with Prim's algorithm, and the assumption that the vertices are approximately distributed according to a Poisson distribution, the number of clusters is estimated by thresholding Prim's trajectory. The corresponding cluster centroids are then computed in order to initialize the generalized Lloyd's algorithm, also known as $K$-means, which circumvents initialization problems. Some results are derived for evaluating the false positive rate of our cluster detection algorithm, with the help of approximations relevant in Euclidean spaces. Metrics used for measuring similarity between multi-dimensional data points are based on symmetrical divergences. The use of these informational divergences together with the proposed method leads to better results, compared to other clustering methods, for the problem of astrophysical data processing. Some applications of this method in the multi/hyper-spectral imagery domain to a satellite view of Paris and to an image of the planet Mars are also presented. In order to demonstrate the usefulness of divergences in our problem, the method with informational divergence as similarity measure is compared with the same method using classical metrics. In the astrophysics application, we also compare the method with spectral clustering algorithms.
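The sketch below illustrates the general idea of reading a cluster count off Prim's trajectory, i.e. the sequence of edge lengths in the order Prim's algorithm adds them to the minimal spanning tree. It is a simplified stand-in: it uses Euclidean distance and counts edges above a fixed, hand-picked threshold, whereas the paper derives its threshold from a Poisson assumption and uses informational divergences. The names `prim_trajectory` and `count_clusters` and the sample points are assumptions made for this example.

```python
import math


def prim_trajectory(points):
    """Return the edge lengths in the order Prim's algorithm adds them."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # distance from each vertex to the current tree
    if n:
        best[0] = 0.0       # start the tree at the first point
    trajectory = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        trajectory.append(best[u])
        # Relax distances of the remaining vertices through u.
        for w in range(n):
            if not in_tree[w]:
                d = math.dist(points[u], points[w])
                if d < best[w]:
                    best[w] = d
    return trajectory[1:]   # drop the 0.0 recorded for the start vertex


def count_clusters(trajectory, threshold):
    """Each edge longer than the threshold is read as a jump to a new cluster."""
    return 1 + sum(1 for d in trajectory if d > threshold)


# Two well-separated hypothetical groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
trajectory = prim_trajectory(points)
n_clusters = count_clusters(trajectory, threshold=5.0)
```

Once a cluster count is available, the points between consecutive jumps can be averaged to give initial centroids for $K$-means, which is the role the abstract assigns to this step.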

algorithm, artificial intelligence, machine learning, (17 more...)

0909.4395

Country:

Europe > France (0.68)
North America > United States > California (0.46)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Ontology-Based Link Prediction in the LiveJournal Social Network

Caragea, Doina (Kansas State University) | Bahirwani, Vikas (Kansas State University) | Aljandal, Waleed (Kansas State University) | Hsu, William H. (Kansas State University)

AAAI Conferences, Sep-1-2009

LiveJournal is a social network journal service with a focus on user interactions. As for many other online social networks, predicting potential friendships in the LiveJournal network is a problem of great practical interest. Previous work has shown that graph features extracted from the graph associated with the network are good predictors for friendship links. However, contrary to intuition, user data (e.g., interests shared by two users) does not always improve the predictions obtained with graph features alone. This could be due to the fact that features constructed from a large number of user-declared interests cannot capture the implicit semantics of the interests. To test this hypothesis, we use a clustering approach to build an interest ontology, and explore the ability of the ontology to improve the performance of learning algorithms at predicting friendship links, when interest-based features are used alone or in combination with graph-based features. The results show that ontology-based features can help improve the performance of several machine learning classifiers (in particular, random forest classifiers) at the task of predicting links in the LiveJournal social network.

algorithm, classifier, ontology, (14 more...)

Eighth Symposium on Abstraction, Reformulation, and Approximation

Country:

Oceania > New Zealand > North Island > Waikato (0.04)
North America > United States > Kansas > Riley County > Manhattan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Services (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)

High-dimensional variable selection

Wasserman, Larry, Roeder, Kathryn

arXiv.org Machine Learning, Aug-20-2009

This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression methods. In the first stage we fit a set of candidate models. In the second stage we select one model by cross-validation. In the third stage we use hypothesis testing to eliminate some variables. We refer to the first two stages as "screening" and the last stage as "cleaning." We consider three screening methods: the lasso, marginal regression, and forward stepwise regression. Our method gives consistent variable selection under certain conditions.

artificial intelligence, assumption, machine learning, (17 more...)

doi: 10.1214/08-AOS646

0704.1139

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

A new protein binding pocket similarity measure based on comparison of 3D atom clouds: application to ligand prediction

Hoffmann, Brice, Zaslavskiy, Mikhail, Vert, Jean-Philippe, Stoven, Véronique

arXiv.org Machine Learning, Jul-9-2009

Motivation: Prediction of ligands for proteins of known 3D structure is important to understand structure-function relationships, predict molecular function, or design new drugs. Results: We explore a new approach for ligand prediction in which binding pockets are represented by atom clouds. Each target pocket is compared to an ensemble of pockets of known ligands. Pockets are aligned in 3D space with further use of convolution kernels between clouds of points. The performance of the new method for ligand prediction is compared to that of other available measures and to docking programs. We discuss two criteria to compare the quality of similarity measures: area under the ROC curve (AUC) and classification-based scores. We show that the latter are better suited to evaluate the methods with respect to ligand prediction. Our results on existing and new benchmarks indicate that the new method outperforms other approaches, including docking. Availability: The new method is available at http://cbio.ensmp.fr/paris/ Contact: mikhail.zaslavskiy@mines-paristech.fr
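For readers unfamiliar with the first of the two criteria mentioned here, the following sketch computes the area under the ROC curve as the fraction of (positive, negative) pairs that the score ranks correctly, counting ties as half. The function name `roc_auc` and the example scores are assumptions for illustration and are unrelated to the paper's benchmarks.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative.

    scores: similarity or prediction scores, higher means "more positive".
    labels: 1 for positives, 0 for negatives, aligned with scores.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Four hypothetical pockets scored against one query ligand.
auc = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

This pairwise form is quadratic in the number of examples; it is meant to show what the measure means rather than to be efficient.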

artificial intelligence, ligand, machine learning, (18 more...)

0907.1531

Genre: Research Report > New Finding (0.49)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.38)
