Goto


Robust Novelty Detection with Single-Class MPM

Neural Information Processing Systems

This algorithm, the "single-class minimax probability machine (MPM)," is built on a distribution-free methodology that minimizes the worst-case probability of a data point falling outside of a convex set, given only the mean and covariance matrix of the distribution and making no further distributional assumptions. We present a robust approach to estimating the mean and covariance matrix within the general two-class MPM setting, and show how this approach specializes to the single-class problem. We provide empirical results comparing the single-class MPM to the single-class SVM and a two-class SVM method. Novelty detection is an important unsupervised learning problem in which test data are to be judged as having been generated from the same or a different process as that which generated the training data.
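
The abstract gives only the moment-based setting, so the following is a minimal numpy sketch of that idea rather than the authors' single-class MPM: given only an estimated mean and covariance, the classical multivariate Chebyshev bound P(d²(x) ≥ t) ≤ dim/t limits the worst-case probability that a point falls outside a Mahalanobis ellipsoid, whatever the true distribution. The function names and the threshold choice are illustrative assumptions, not part of the paper.

```python
import numpy as np

def fit_moments(X):
    """Estimate the mean and covariance of the training data: the only
    distributional information the worst-case argument relies on."""
    return X.mean(axis=0), np.cov(X, rowvar=False)

def novelty_scores(X, mu, Sigma):
    """Squared Mahalanobis distance of each row of X from mu."""
    diff = X - mu
    inv = np.linalg.pinv(Sigma)
    return np.einsum('ij,jk,ik->i', diff, inv, diff)

def detect(X_train, X_test, alpha=0.1):
    """Flag test points outside the ellipsoid whose worst-case mass, over all
    distributions with these moments, is at most alpha
    (Chebyshev: P(d^2 >= dim / alpha) <= alpha)."""
    mu, Sigma = fit_moments(X_train)
    threshold = X_train.shape[1] / alpha
    return novelty_scores(X_test, mu, Sigma) > threshold

rng = np.random.default_rng(0)
train = rng.normal(size=(500, 5))
test = np.vstack([rng.normal(size=(10, 5)),             # same process
                  rng.normal(5.0, 1.0, size=(10, 5))])  # shifted process
print(detect(train, test))
```

The single-class MPM itself optimizes such a convex set against the worst case rather than fixing an ellipsoid in advance.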


Using Tarjan's Red Rule for Fast Dependency Tree Construction

Neural Information Processing Systems

We focus on the problem of efficient learning of dependency trees. It is well known that, given the pairwise mutual information coefficients, a minimum-weight spanning tree algorithm solves this problem exactly and in polynomial time. However, for large data sets it is the construction of the correlation matrix that dominates the running time. We have developed a new spanning-tree algorithm that can exploit partial knowledge about edge weights. The partial knowledge we maintain is a probabilistic confidence interval on the coefficients, which we derive by examining just a small sample of the data. The algorithm is able to flag the need to shrink an interval, which translates to inspecting more data for the particular attribute pair. Experimental results show running time that is near-constant in the number of records, without significant loss in accuracy of the generated trees. Interestingly, our spanning-tree algorithm is based solely on Tarjan's red-edge rule, which is generally considered a guaranteed recipe for bad performance.
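
As context for what the confidence-interval machinery avoids, here is a minimal sketch of the exact baseline the abstract refers to: compute all pairwise mutual information coefficients from the full data, then take a maximum-weight spanning tree over them (the Chow-Liu construction). The helper names and the scipy spanning-tree routine are illustrative choices; the paper's algorithm replaces the exhaustive first step with interval estimates refined only where needed.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def mutual_information(x, y):
    """Empirical mutual information between two discrete columns."""
    n = len(x)
    counts = {}
    for a, b in zip(x, y):
        counts[(a, b)] = counts.get((a, b), 0) + 1
    px = {a: np.mean(x == a) for a in set(x)}
    py = {b: np.mean(y == b) for b in set(y)}
    return sum((c / n) * np.log((c / n) / (px[a] * py[b]))
               for (a, b), c in counts.items())

def dependency_tree(data):
    """Maximum-MI spanning tree over the attributes (columns) of data."""
    d = data.shape[1]
    W = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            W[i, j] = mutual_information(data[:, i], data[:, j])
    # minimum_spanning_tree minimises total weight, so negate MI to maximise it.
    tree = minimum_spanning_tree(-W)
    return list(zip(*tree.nonzero()))

rng = np.random.default_rng(1)
a = rng.integers(0, 2, size=1000)
flip = rng.random(1000) < 0.1
data = np.column_stack([a,
                        np.where(flip, 1 - a, a),           # noisy copy of a
                        rng.integers(0, 2, size=1000)])     # independent noise
print(dependency_tree(data))
```

The expensive part is exactly the nested loop filling W, which touches every record for every attribute pair; that is the cost the red-rule algorithm with probabilistic intervals is designed to cut.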


Artefactual Structure from Least-Squares Multidimensional Scaling

Neural Information Processing Systems

We consider the problem of illusory or artefactual structure from the visualisation of high-dimensional structureless data. In particular we examine the role of the distance metric in the use of topographic mappings based on the statistical field of multidimensional scaling. We show that the use of a squared Euclidean metric (i.e. the SSTRESS measure) gives rise to an annular structure when the input data is drawn from a high-dimensional isotropic distribution, and we provide a theoretical justification for this observation.
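
Since the claim hinges on the SSTRESS objective itself, a small numpy sketch of least-squares MDS under that squared-distance loss, run on isotropic Gaussian input, may help make the setting concrete. The plain gradient-descent optimiser, step size, and data sizes below are illustrative assumptions, not the authors' experimental setup.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Matrix of squared Euclidean distances between the rows of X."""
    sq = (X ** 2).sum(axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * X @ X.T

def sstress(Y, D2):
    """SSTRESS: squared error between squared map distances and the squared
    input dissimilarities D2."""
    return ((pairwise_sq_dists(Y) - D2) ** 2).sum()

def mds_sstress(X, dim=2, lr=1e-6, steps=3000, seed=0):
    """Gradient descent on the SSTRESS objective (illustrative optimiser)."""
    rng = np.random.default_rng(seed)
    D2 = pairwise_sq_dists(X)
    Y = rng.normal(scale=1e-2, size=(X.shape[0], dim))
    for _ in range(steps):
        E = pairwise_sq_dists(Y) - D2                      # residuals
        grad = 8.0 * (E.sum(axis=1)[:, None] * Y - E @ Y)  # dSSTRESS/dY
        Y -= lr * grad
    return Y, sstress(Y, D2)

# Structureless, isotropic high-dimensional input: any apparent structure in
# the 2-D configuration Y (e.g. a ring of points) is an artefact of the metric.
rng = np.random.default_rng(2)
X = rng.normal(size=(150, 30))
Y, loss = mds_sstress(X)
print(Y.shape, loss)
```

Plotting the two columns of Y for such input is the quickest way to check whether the annular structure the paper analyses appears.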


Graph-Driven Feature Extraction From Microarray Data Using Diffusion Kernels and Kernel CCA

Neural Information Processing Systems

We present an algorithm to extract features from high-dimensional gene expression profiles, based on the knowledge of a graph which links together genes known to participate in successive reactions in metabolic pathways. Motivated by the intuition that biologically relevant features are likely to exhibit smoothness with respect to the graph topology, the algorithm involves encoding the graph and the set of expression profiles into kernel functions, and performing a generalized form of canonical correlation analysis in the corresponding reproducing kernel Hilbert spaces. Function prediction experiments for the genes of the yeast S. cerevisiae validate this approach by showing a consistent increase in performance when a state-of-the-art classifier uses the vector of features instead of the original expression profile to predict the functional class of a gene.
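
Of the two ingredients the abstract names, the diffusion kernel is easy to sketch on its own: exponentiate the negative graph Laplacian of the pathway graph so that kernel values decay with graph distance. The toy adjacency matrix and the beta value below are made-up illustrations, and the canonical correlation step that combines this graph kernel with an expression-profile kernel is omitted.

```python
import numpy as np
from scipy.linalg import expm

def diffusion_kernel(adjacency, beta=1.0):
    """Diffusion kernel K = expm(-beta * L) for an undirected graph,
    where L = D - A is the (unnormalised) graph Laplacian."""
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    return expm(-beta * L)

# Toy pathway graph on five genes: a chain 0-1-2-3 plus an isolated gene 4.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

K = diffusion_kernel(A, beta=0.5)
print(np.round(K, 3))  # entries shrink with graph distance; gene 4 stays decoupled
```

In the pipeline the abstract describes, a kernel like K over the gene graph and a kernel over the expression profiles would then be fed to the generalized canonical correlation analysis it mentions, to find features that are smooth on the graph and correlated with expression.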


In Search of the Horowitz Factor

AI Magazine

The article introduces the reader to a large interdisciplinary research project whose goal is to use AI to gain new insight into a complex artistic phenomenon. We study fundamental principles of expressive music performance by measuring performance aspects in large numbers of recordings by highly skilled musicians (concert pianists) and analyzing the data with state-of-the-art methods from areas such as machine learning, data mining, and data visualization. The article first introduces the general research questions that guide the project and then summarizes some of the most important results achieved to date, with an emphasis on the most recent and still rather speculative work. A broad view of the discovery process is given, from data acquisition through data visualization to inductive model building and pattern discovery, and it turns out that AI plays an important role in all stages of such an ambitious enterprise. Our current results show that it is possible for machines to make novel and interesting discoveries even in a domain such as music and that even if we might never find the "Horowitz Factor," AI can give us completely new insights into complex artistic behavior.


Online Learning with Kernels

Neural Information Processing Systems

We consider online learning in a Reproducing Kernel Hilbert Space. Our method is computationally efficient and leads to simple algorithms. In particular we derive update equations for classification, regression, and novelty detection. The inclusion of the ν-trick allows us to give a robust parameterization.
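
As a rough illustration of the kind of update equations such a method yields for classification, here is a sketch of stochastic gradient descent on a regularised hinge loss in an RKHS: each margin violation adds the current example to the kernel expansion, and the regulariser shrinks earlier coefficients. The Gaussian kernel, class name, and parameter values are assumptions for the sketch; it is not claimed to reproduce the paper's exact algorithms.

```python
import numpy as np

def gaussian_kernel(x, y, gamma=0.5):
    return np.exp(-gamma * np.sum((x - y) ** 2))

class OnlineKernelClassifier:
    """Online kernel classifier: SGD on a regularised hinge loss in an RKHS.
    The hypothesis is f(x) = sum_i alpha_i * k(x_i, x) over stored examples."""

    def __init__(self, eta=0.1, lam=0.01, gamma=0.5):
        self.eta, self.lam, self.gamma = eta, lam, gamma
        self.alphas, self.points = [], []

    def decision(self, x):
        return sum(a * gaussian_kernel(p, x, self.gamma)
                   for a, p in zip(self.alphas, self.points))

    def update(self, x, y):
        f = self.decision(x)
        # Gradient of the regulariser shrinks every stored coefficient ...
        self.alphas = [(1.0 - self.eta * self.lam) * a for a in self.alphas]
        # ... and a margin violation (y * f < 1) adds a new expansion term.
        if y * f < 1.0:
            self.alphas.append(self.eta * y)
            self.points.append(np.asarray(x, dtype=float))
        return f

rng = np.random.default_rng(3)
clf = OnlineKernelClassifier()
for _ in range(500):
    x = rng.uniform(-1, 1, size=2)
    y = 1.0 if x[0] * x[1] > 0 else -1.0   # XOR-like labels, not linearly separable
    clf.update(x, y)

print(clf.decision(np.array([0.5, 0.5])), clf.decision(np.array([-0.5, 0.5])))
```

Analogous updates with a squared loss or a one-class margin loss would cover the regression and novelty-detection cases the abstract mentions.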