Goto

Collaborating Authors

 Discriminant Analysis


Generalized two-dimensional linear discriminant analysis with regularization

Li, Chun-Na, Shao, Yuan-Hai, Chen, Wei-Jie, Deng, Nai-Yang

arXiv.org Machine Learning

Recent advances show that two-dimensional linear discriminant analysis (2DLDA) is a successful matrix based dimensionality reduction method. However, 2DLDA may encounter the singularity issue theoretically and the sensitivity to outliers. In this paper, a generalized Lp-norm 2DLDA framework with regularization for an arbitrary $p>0$ is proposed, named G2DLDA. There are mainly two contributions of G2DLDA: one is G2DLDA model uses an arbitrary Lp-norm to measure the between-class and within-class scatter, and hence a proper $p$ can be selected to achieve the robustness. The other one is that by introducing an extra regularization term, G2DLDA achieves better generalization performance, and solves the singularity problem. In addition, G2DLDA can be solved through a series of convex problems with equality constraint, and it has closed solution for each single problem. Its convergence can be guaranteed theoretically when $1\leq p\leq2$. Preliminary experimental results on three contaminated human face databases show the effectiveness of the proposed G2DLDA.

  Country:
  Genre: Research Report > New Finding (0.68)
  Industry: Health & Medicine (0.92)
  Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Discriminant Analysis (0.62)

Joint Probabilistic Linear Discriminant Analysis

Ferrer, Luciana

arXiv.org Machine Learning

Standard probabilistic linear discriminant analysis (PLDA) for speaker recognition assumes that the sample's features (usually, i-vectors) are given by a sum of three terms: a term that depends on the speaker identity, a term that models the within-speaker variability and is assumed independent across samples, and a final term that models any remaining variability and is also independent across samples. In this work, we propose a generalization of this model where the within-speaker variability is not necessarily assumed independent across samples but dependent on another discrete variable. This variable, which we call the channel variable as in the standard PLDA approach, could be, for example, a discrete category for the channel characteristics, the language spoken by the speaker, the type of speech in the sample (conversational, monologue, read), etc. The value of this variable is assumed to be known during training but not during testing. Scoring is performed, as in standard PLDA, by computing a likelihood ratio between the null hypothesis that the two sides of a trial belong to the same speaker versus the alternative hypothesis that the two sides belong to different speakers. The two likelihoods are computed by marginalizing over two hypothesis about the channels in both sides of a trial: that they are the same and that they are different. This way, we expect that the new model will be better at coping with same-channel versus different-channel trials than standard PLDA, since knowledge about the channel (or language, or speech style) is used during training and implicitly considered during scoring.


Market Basket Analysis & Linear Discriminant Analysis with R

@machinelearnbot

Get your team access to Udemy's top 2,000 courses anytime, anywhere. This course has two parts. In part 1 Association rules (Market Basket Analysis) is explained. In Part 2, Linear Discriminant Analysis (LDA) is explained.


Linear Discriminant Generative Adversarial Networks

Sun, Zhun, Ozay, Mete, Okatani, Takayuki

arXiv.org Machine Learning

We develop a novel method for training of GANs for unsupervised and class conditional generation of images, called Linear Discriminant GAN (LD-GAN). The discriminator of an LD-GAN is trained to maximize the linear separability between distributions of hidden representations of generated and targeted samples, while the generator is updated based on the decision hyper-planes computed by performing LDA over the hidden representations. LD-GAN provides a concrete metric of separation capacity for the discriminator, and we experimentally show that it is possible to stabilize the training of LD-GAN simply by calibrating the update frequencies between generators and discriminators in the unsupervised case, without employment of normalization methods and constraints on weights. In the class conditional generation tasks, the proposed method shows improved training stability together with better generalization performance compared to WGAN that employs an auxiliary classifier.


Communication-efficient Distributed Sparse Linear Discriminant Analysis

Tian, Lu, Gu, Quanquan

arXiv.org Machine Learning

High dimensionality is a frequently confronted problem in many applications of machine learning. It increases time and space requirements for processing the data. Moreover, many machine learning methods tend to over-fit and become less interpretable in the presence of many irrelevant or redundant features. A common way to address this problem is the dimensionality reduction. Principal Component Analysis (PCA) (Jolliffe, 2002) is probably the most widely used dimensionality reduction method. However, it is an unsupervised dimensionality reduction method and does not consider the labels of the data. In order to take the label information into account, supervised dimensionality reduction methods are favored. Linear Discriminant Analysis (LDA) (Anderson, 1968), which is initially proposed as a classification method, is an important supervised dimensionality reduction method.


Kernel Alignment Inspired Linear Discriminant Analysis

Zheng, Shuai, Ding, Chris

arXiv.org Machine Learning

Kernel alignment measures the degree of similarity between two kernels. In this paper, inspired from kernel alignment, we propose a new Linear Discriminant Analysis (LDA) formulation, kernel alignment LDA (kaLDA). We first define two kernels, data kernel and class indicator kernel. The problem is to find a subspace to maximize the alignment between subspace-transformed data kernel and class indicator kernel. Surprisingly, the kernel alignment induced kaLDA objective function is very similar to classical LDA and can be expressed using between-class and total scatter matrices. This can be extended to multi-label data. We use a Stiefel-manifold gradient descent algorithm to solve this problem. We perform experiments on 8 single-label and 6 multi-label data sets. Results show that kaLDA has very good performance on many single-label and multi-label problems.


Linear Discriminant Analysis

#artificialintelligence

Linear Discriminant Analysis (LDA) is most commonly used as dimensionality reduction technique in the pre-processing step for pattern-classification and machine learning applications. The goal is to project a dataset onto a lower-dimensional space with good class-separability in order avoid overfitting ("curse of dimensionality") and also reduce computational costs. Ronald A. Fisher formulated the Linear Discriminant in 1936 (The Use of Multiple Measurements in Taxonomic Problems), and it also has some practical uses as classifier. The original Linear discriminant was described for a 2-class problem, and it was then later generalized as "multi-class Linear Discriminant Analysis" or "Multiple Discriminant Analysis" by C. R. Rao in 1948 (The utilization of multiple measurements in problems of biological classification) The general LDA approach is very similar to a Principal Component Analysis (for more information about the PCA, see the previous article Implementing a Principal Component Analysis (PCA) in Python step by step), but in addition to finding the component axes that maximize the variance of our data (PCA), we are additionally interested in the axes that maximize the separation between multiple classes (LDA). So, in a nutshell, often the goal of an LDA is to project a feature space (a dataset n-dimensional samples) onto a smaller subspace (where) while maintaining the class-discriminatory information.


Linear Discriminant Analysis for Machine Learning

#artificialintelligence

Logistic regression is a classification algorithm traditionally limited to only two-class classification problems. If you have more than two classes then Linear Discriminant Analysis is the preferred linear classification technique. In this post you will discover the Linear Discriminant Analysis (LDA) algorithm for classification predictive modeling problems. This post is intended for developers interested in applied machine learning, how the models work and how to use them well. As such no background in statistics or linear algebra is required, although it does help if you know about the mean and variance of a distribution.


Linear Discriminant Analysis for Machine Learning - Machine Learning Mastery

#artificialintelligence

Logistic regression is a classification algorithm traditionally limited to only two-class classification problems. If you have more than two classes then Linear Discriminant Analysis is the preferred linear classification technique. In this post you will discover the Linear Discriminant Analysis (LDA) algorithm for classification predictive modeling problems. This post is intended for developers interested in applied machine learning, how the models work and how to use them well. As such no background in statistics or linear algebra is required, although it does help if you know about the mean and variance of a distribution.


Robust Feature-Sample Linear Discriminant Analysis for Brain Disorders Diagnosis

Adeli-Mosabbeb, Ehsan, Thung, Kim-Han, An, Le, Shi, Feng, Shen, Dinggang

Neural Information Processing Systems

A wide spectrum of discriminative methods is increasingly used in diverse applications for classification or regression tasks. However, many existing discriminative methods assume that the input data is nearly noise-free, which limits their applications to solve real-world problems. Particularly for disease diagnosis, the data acquired by the neuroimaging devices are always prone to different sources of noise. Robust discriminative models are somewhat scarce and only a few attempts have been made to make them robust against noise or outliers. These methods focus on detecting either the sample-outliers or feature-noises. Moreover, they usually use unsupervised de-noising procedures, or separately de-noise the training and the testing data. All these factors may induce biases in the learning process, and thus limit its performance. In this paper, we propose a classification method based on the least-squares formulation of linear discriminant analysis, which simultaneously detects the sample-outliers and feature-noises. The proposed method operates under a semi-supervised setting, in which both labeled training and unlabeled testing data are incorporated to form the intrinsic geometry of the sample space. Therefore, the violating samples or feature values are identified as sample-outliers or feature-noises, respectively. We test our algorithm on one synthetic and two brain neurodegenerative databases (particularly for Parkinson's disease and Alzheimer's disease). The results demonstrate that our method outperforms all baseline and state-of-the-art methods, in terms of both accuracy and the area under the ROC curve.