AITopics

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Colorado (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > British Columbia (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
Information Technology > Data Science > Data Mining (0.68)

Vert, Jean-philippe, Kanehisa, Minoru

Graph-Driven Feature Extraction From Microarray Data Using Diffusion Kernels and Kernel CCA

We present an algorithm to extract features from high-dimensional gene expression profiles, based on the knowledge of a graph which links together genes known to participate to successive reactions in metabolic pathways. Motivated by the intuition that biologically relevant features are likely to exhibit smoothness with respect to the graph topology, the algorithm involves encoding the graph and the set of expression profiles into kernel functions, and performing a generalized form of canonical correlation analysis in the corresponding reproducible kernel Hilbert spaces. Function prediction experiments for the genes of the yeast S. Cerevisiae validate this approach by showing a consistent increase in performance when a state-of-the-art classifier uses the vector of features instead of the original expression profile to predict the functional class of a gene.

artificial intelligence, expression profile, health & medicine, (18 more...)

Country:

North America (0.15)
Asia (0.14)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Roth, Volker, Laub, Julian, Müller, Klaus-Robert, Buhmann, Joachim M.

Going Metric: Denoising Pairwise Data

Pairwise data in empirical sciences typically violate metricity, either dueto noise or due to fallible estimates, and therefore are hard to analyze by conventional machine learning technology. In this paper we therefore study ways to work around this problem. First, we present an alternative embedding to multidimensional scaling (MDS) that allows us to apply a variety of classical machine learningand signal processing algorithms. The class of pairwise grouping algorithms which share the shift-invariance property is statistically invariant under this embedding procedure, leading to identical assignments of objects to clusters. Based on this new vectorial representation, denoising methods are applied in a second step.Both steps provide a theoretically well controlled setup to translate from pairwise data to the respective denoised metric representation.We demonstrate the practical usefulness of our theoretical reasoning by discovering structure in protein sequence data bases, visibly improving performance upon existing automatic methods. 1 Introduction Unsupervised grouping or clustering aims at extracting hidden structure from data (see e.g.

artificial intelligence, health & medicine, sequence, (17 more...)

Country: Europe > Germany (0.48)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.95)

Hughes, Nicholas P., Lowe, David

Artefactual Structure from Least-Squares Multidimensional Scaling

We consider the problem of illusory or artefactual structure from the visualisation ofhigh-dimensional structureless data. In particular we examine the role of the distance metric in the use of topographic mappings based on the statistical field of multidimensional scaling. We show that the use of a squared Euclidean metric (i.e. the SSTRESS measure) gives rise to an annular structure when the input data is drawn from a highdimensional isotropicdistribution, and we provide a theoretical justification for this observation.

artificial intelligence, configuration, data mining, (19 more...)

Country:

Europe > United Kingdom (0.14)
Europe > Italy (0.14)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Data Science > Data Mining (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Pelleg, Dan, Moore, Andrew W.

Using Tarjan's Red Rule for Fast Dependency Tree Construction

We focus on the problem of efficient learning of dependency trees. It is well-known that given the pairwise mutual information coefficients, a minimum-weight spanning tree algorithm solves this problem exactly and in polynomial time. However, for large data-sets it is the construction ofthe correlation matrix that dominates the running time. We have developed a new spanning-tree algorithm which is capable of exploiting partial knowledge about edge weights. The partial knowledge we maintain isa probabilistic confidence interval on the coefficients, which we derive by examining just a small sample of the data. The algorithm is able to flag the need to shrink an interval, which translates to inspection ofmore data for the particular attribute pair. Experimental results show running time that is near-constant in the number of records, without significantloss in accuracy of the generated trees. Interestingly, our spanning-tree algorithm is based solely on Tarjan's red-edge rule, which is generally considered a guaranteed recipe for bad performance.

algorithm, artificial intelligence, data mining, (16 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Colorado (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
Information Technology > Data Science > Data Mining (0.68)

Girard, Agathe, Rasmussen, Carl Edward, Candela, Joaquin Quiñonero, Murray-Smith, Roderick

Gaussian Process Priors with Uncertain Inputs Application to Multiple-Step Ahead Time Series Forecasting

We consider the problem of multi-step ahead prediction in time series analysis using the nonparametric Gaussian process model.

artificial intelligence, machine learning, prediction, (15 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
Europe > Denmark > Capital Region > Kongens Lyngby (0.14)

Genre: Instructional Material (0.47)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)
Information Technology > Data Science (0.85)

Vert, Jean-philippe, Kanehisa, Minoru

Graph-Driven Feature Extraction From Microarray Data Using Diffusion Kernels and Kernel CCA

We present an algorithm to extract features from high-dimensional gene expression profiles, based on the knowledge of a graph which links together genesknown to participate to successive reactions in metabolic pathways. Motivated by the intuition that biologically relevant features are likely to exhibit smoothness with respect to the graph topology, the algorithm involves encoding the graph and the set of expression profiles intokernel functions, and performing a generalized form of canonical correlation analysis in the corresponding reproducible kernel Hilbert spaces. Functionprediction experiments for the genes of the yeast S. Cerevisiae validate this approach by showing a consistent increase in performance when a state-of-the-art classifier uses the vector of features instead of the original expression profile to predict the functional class of a gene.

artificial intelligence, expression profile, health & medicine, (19 more...)

Country:

North America (0.15)
Asia (0.14)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Data Science > Data Mining (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)

Kannan, Anitha, Jojic, Nebojsa, Frey, Brendan

Fast Transformation-Invariant Factor Analysis

Dimensionality reduction techniques such as principal component analysis andfactor analysis are used to discover a linear mapping between high dimensional data samples and points in a lower dimensional subspace. In [6], Jojic and Frey introduced mixture of transformation-invariant component analyzers (MTCA) that can account for global transformations suchas translations and rotations, perform clustering and learn local appearance deformations by dimensionality reduction.

artificial intelligence, machine learning, sequence, (15 more...)

Country:

North America > United States (0.69)
North America > Canada > Ontario > Toronto (0.15)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Ghaoui, Laurent E., Jordan, Michael I., Lanckriet, Gert R.

Robust Novelty Detection with Single-Class MPM

This algorithm-the "single-class minimax probability machine(MPM)"- is built on a distribution-free methodology that minimizes the worst-case probability of a data point falling outside of a convex set, given only the mean and covariance matrix of the distribution and making no further distributional assumptions. Wepresent a robust approach to estimating the mean and covariance matrix within the general two-class MPM setting, and show how this approach specializes to the single-class problem. We provide empirical results comparing the single-class MPM to the single-class SVM and a two-class SVM method. 1 Introduction Novelty detection is an important unsupervised learning problem in which test data are to be judged as having been generated from the same or a different process as that which generated the training data.

artificial intelligence, covariance matrix, data mining, (15 more...)

Country: North America > United States (0.28)

Industry: Education (0.34)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)