AITopics

Country: Asia (0.28)

Genre: Research Report (0.31)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Strohmann, Thomas, Grudic, Gregory Z.

A Formulation for Minimax Probability Machine Regression

Neural Information Processing SystemsDec-31-2003

We formulate the regression problem as one of maximizing the minimum probability,symbolized by Ω, that future predicted outputs of the regression model will be within some ε bound of the true regression function. Our formulation is unique in that we obtain a direct estimate of this lower probability bound Ω. The proposed framework, minimax probability machine regression (MPMR), is based on the recently described minimaxprobability machine classification algorithm [Lanckriet et al.] and uses Mercer Kernels to obtain nonlinear regression models. MPMR is tested on both toy and real world data, verifying the accuracy of the Ω bound, and the efficacy of the regression models.

artificial intelligence, machine learning, probability, (14 more...)

Country:

North America > United States > Colorado (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Neural Information Processing SystemsDec-31-2003

Generalized² Linear² Models

Gordon, Geoffrey J.

We introduce the Generalized2 Linear2 Model, a statistical estimator whichcombines features of nonlinear regression and factor analysis.

artificial intelligence, loss function, machine learning, (19 more...)

Country: North America > United States (0.94)

Industry: Government > Regional Government > North America Government > United States Government (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Neural Information Processing SystemsDec-31-2003

Spikernels: Embedding Spiking Neurons in Inner-Product Spaces

Shpigelman, Lavi, Singer, Yoram, Paz, Rony, Vaadia, Eilon

Inner-product operators, often referred to as kernels in statistical learning, define amapping from some input space into a feature space. The focus of this paper is the construction of biologically-motivated kernels for cortical activities. Thekernels we derive, termed Spikernels, map spike count sequences into an abstract vector space in which we can perform various prediction tasks. We discuss in detail the derivation of Spikernels and describe an efficient algorithm forcomputing their value on any two sequences of neural population spike counts. We demonstrate the merits of our modeling approach using the Spikernel and various standard kernels for the task of predicting hand movement velocitiesfrom cortical recordings. In all of our experiments all the kernels we tested outperform the standard scalar product used in regression with the Spikernel consistently achieving the best performance.

artificial intelligence, kernel, machine learning, (16 more...)

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.35)

Kernel Logistic Regression and the Import Vector Machine

Zhu, Ji, Hastie, Trevor

The support vector machine (SVM) is known for its good performance in binary classification, but its extension to multi-class classification is still an ongoing research issue. In this paper, we propose a new approach for classification, called the import vector machine (IVM), which is built on kernel logistic regression (KLR). We show that the IVM not only performs as well as the SVM in binary classification, but also can naturally be generalized to the multi-class case. Furthermore, the IVM provides an estimate of the underlying probability. Similar to the "support points" of the SVM, the IVM model uses only a fraction of the training data to index kernel basis functions, typically a much smaller fraction than the SVM. This gives the IVM a computational advantage over the SVM, especially when the size of the training data set is large.

import point, svm, training data, (12 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.15)
North America > United States > California > Santa Clara County > Stanford (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > New York (0.04)

Genre:

Research Report > New Finding (0.71)
Research Report > Experimental Study (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.57)

Rasmussen, Carl E., Ghahramani, Zoubin

Infinite Mixtures of Gaussian Process Experts

We present an extension to the Mixture of Experts (ME) model, where the individual experts are Gaussian Process (GP) regression models. Using an input-dependent adaptation of the Dirichlet Process, we implement a gating network for an infinite number of Experts. Inference in this model may be done efficiently using a Markov Chain relying on Gibbs sampling. The model allows the effective covariance function to vary with the inputs, and may handle large datasets - thus potentially overcoming two of the biggest hurdles with GP models.

covariance function, hyperparameter, indicator variable, (15 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Ng, Andrew Y., Jordan, Michael I.

On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes

Discriminative classifiers model the posterior p(ylx) directly, or learn a direct map from inputs x to the class labels. There are several compelling reasons for using discriminative rather than generative classifiers, one of which, succinctly articulated by Vapnik [6], is that "one should solve the [classification] problem directly and never solve a more general problem as an intermediate step [such as modeling p(xly)]." Indeed, leaving aside computational issues and matters such as handling missing data, the prevailing consensus seems to be that discriminative classifiers are almost always to be preferred to generative ones. Another piece of prevailing folk wisdom is that the number of examples needed to fit a model is often roughly linear in the number of free parameters of a model. This has its theoretical basis in the observation that for "many" models, the VC dimension is roughly linear or at most some low-order polynomial in the number of parameters (see, e.g., [1, 3]), and it is known that sample complexity in the discriminative setting is linear in the VC dimension [6]. In this paper, we study empirically and theoretically the extent to which these beliefs are true. A parametric family of probabilistic models p(x, y) can be fit either to optimize the joint likelihood of the inputs and the labels, or fit to optimize the conditional likelihood p(ylx), or even fit to minimize the 0-1 training error obtained by thresholding p(ylx) to make predictions.

classifier, logistic regression, regression, (13 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (0.56)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.88)

Kernel Logistic Regression and the Import Vector Machine

Zhu, Ji, Hastie, Trevor

The support vector machine (SVM) is known for its good performance in binary classification, but its extension to multi-class classification is still an ongoing research issue. In this paper, we propose a new approach for classification, called the import vector machine (IVM), which is built on kernel logistic regression (KLR). We show that the IVM not only performs as well as the SVM in binary classification, but also can naturally be generalized to the multi-class case. Furthermore, the IVM provides an estimate of the underlying probability. Similar to the "support points" of the SVM, the IVM model uses only a fraction of the training data to index kernel basis functions, typically a much smaller fraction than the SVM. This gives the IVM a computational advantage over the SVM, especially when the size of the training data set is large.

import point, svm, training data, (12 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.15)
North America > United States > California > Santa Clara County > Stanford (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > New York (0.04)

Genre:

Research Report > New Finding (0.71)
Research Report > Experimental Study (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.61)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.57)

Rasmussen, Carl E., Ghahramani, Zoubin

Infinite Mixtures of Gaussian Process Experts

We present an extension to the Mixture of Experts (ME) model, where the individual experts are Gaussian Process (GP) regression models. Using an input-dependent adaptation of the Dirichlet Process, we implement a gating network for an infinite number of Experts. Inference in this model may be done efficiently using a Markov Chain relying on Gibbs sampling. The model allows the effective covariance function to vary with the inputs, and may handle large datasets - thus potentially overcoming two of the biggest hurdles with GP models.

covariance function, hyperparameter, indicator variable, (15 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Ng, Andrew Y., Jordan, Michael I.

On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes

Discriminative classifiers model the posterior p(ylx) directly, or learn a direct map from inputs x to the class labels. There are several compelling reasons for using discriminative rather than generative classifiers, one of which, succinctly articulated by Vapnik [6], is that "one should solve the [classification] problem directly and never solve a more general problem as an intermediate step [such as modeling p(xly)]." Indeed, leaving aside computational issues and matters such as handling missing data, the prevailing consensus seems to be that discriminative classifiers are almost always to be preferred to generative ones. Another piece of prevailing folk wisdom is that the number of examples needed to fit a model is often roughly linear in the number of free parameters of a model. This has its theoretical basis in the observation that for "many" models, the VC dimension is roughly linear or at most some low-order polynomial in the number of parameters (see, e.g., [1, 3]), and it is known that sample complexity in the discriminative setting is linear in the VC dimension [6]. In this paper, we study empirically and theoretically the extent to which these beliefs are true. A parametric family of probabilistic models p(x, y) can be fit either to optimize the joint likelihood of the inputs and the labels, or fit to optimize the conditional likelihood p(ylx), or even fit to minimize the 0-1 training error obtained by thresholding p(ylx) to make predictions.

classifier, logistic regression, regression, (13 more...)