Discriminant Analysis
Sparse Linear Discriminant Analysis under the Neyman-Pearson Paradigm
Tong, Xin, Xia, Lucy, Wang, Jiacheng, Feng, Yang
In classification applications such as severe disease diagnosis and fraud detection, people have clear priorities over the two types of classification errors. For instance, diagnosing a patient with cancer to be healthy may lead to loss of life, which incurs a much higher cost than the other way around. The classical binary classification paradigm does not take into account such priorities, as it aims to minimize the overall classification error. In contrast, the Neyman-Pearson (NP) paradigm seeks classifiers with a minimal type II error while keeping the prioritized type I error below a user-specified level, addressing the asymmetric type I/II error priorities in the scenarios mentioned above. Despite recent advances in the NP classification literature, two essential issues pose challenges: i) the current theoretical framework assumes bounded feature support, which does not admit parametric settings; ii) in practice, existing NP classifiers involve splitting class 0 samples into two parts using a pre-fixed split proportion. To address the first challenge, we present NP-sLDA, which adapts the popular sparse linear discriminant analysis (sLDA, Mai et al. (2012)) to the NP paradigm. On the theoretical front, this is the first theoretically justified NP classifier that accommodates parametric assumptions and unbounded feature support. We formulate a new conditional margin assumption and a new conditional detection condition to accommodate unbounded feature support and show that NP-sLDA satisfies the NP oracle inequalities. Numerical results show that NP-sLDA is a valuable addition to the existing NP classifiers. To address the second challenge, we construct a general data-adaptive sample splitting scheme that improves the classification performance upon the default half-half class 0 split used in Tong et al. (2018).
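To make the type I error constraint concrete, the following is a minimal sketch of picking a score threshold from a held-out set of class 0 scores so that the empirical type I error on that set stays at or below a level alpha. It is not the NP-sLDA procedure from the abstract: the function names np_threshold and classify are hypothetical, the scores are assumed to come from some already-trained scoring function, and the rule only bounds the error on the held-out sample (it does not give the high-probability guarantee studied in the NP literature).

```python
import math


def np_threshold(class0_scores, alpha):
    """Pick a threshold from held-out class 0 scores.

    Returns a score t such that at most floor(alpha * n) of the n
    held-out class 0 scores are strictly greater than t.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    scores = sorted(class0_scores)
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one class 0 score")
    # Number of class 0 points we may misclassify as class 1.
    # The small epsilon guards against floating-point round-off.
    max_violations = min(math.floor(alpha * n + 1e-9), n - 1)
    # Everything strictly above this order statistic is at most
    # max_violations points, even when scores are tied.
    return scores[n - 1 - max_violations]


def classify(score, threshold):
    """Predict class 1 only when the score exceeds the threshold."""
    return 1 if score > threshold else 0


# Usage: 100 held-out class 0 scores, target type I error of 5%.
held_out = list(range(1, 101))
t = np_threshold(held_out, alpha=0.05)
type_one_error = sum(classify(s, t) for s in held_out) / len(held_out)
```

Raising the threshold lowers the type I error at the cost of a higher type II error, which is the trade-off the NP paradigm resolves in favor of the prioritized class.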
Generalized two-dimensional linear discriminant analysis with regularization
Li, Chun-Na, Shao, Yuan-Hai, Chen, Wei-Jie, Deng, Nai-Yang
Recent advances show that two-dimensional linear discriminant analysis (2DLDA) is a successful matrix-based dimensionality reduction method. However, 2DLDA may suffer from singularity in theory and is sensitive to outliers. In this paper, a generalized Lp-norm 2DLDA framework with regularization for an arbitrary $p>0$ is proposed, named G2DLDA. G2DLDA makes two main contributions. First, the G2DLDA model uses an arbitrary Lp-norm to measure the between-class and within-class scatter, so a proper $p$ can be selected to achieve robustness. Second, by introducing an extra regularization term, G2DLDA achieves better generalization performance and solves the singularity problem. In addition, G2DLDA can be solved through a series of convex problems with equality constraints, each of which has a closed-form solution. Its convergence is guaranteed theoretically when $1\leq p\leq2$. Preliminary experimental results on three contaminated human face databases show the effectiveness of the proposed G2DLDA.
Joint Probabilistic Linear Discriminant Analysis
Standard probabilistic linear discriminant analysis (PLDA) for speaker recognition assumes that the sample's features (usually, i-vectors) are given by a sum of three terms: a term that depends on the speaker identity, a term that models the within-speaker variability and is assumed independent across samples, and a final term that models any remaining variability and is also independent across samples. In this work, we propose a generalization of this model where the within-speaker variability is not necessarily assumed independent across samples but dependent on another discrete variable. This variable, which we call the channel variable as in the standard PLDA approach, could be, for example, a discrete category for the channel characteristics, the language spoken by the speaker, the type of speech in the sample (conversational, monologue, read), etc. The value of this variable is assumed to be known during training but not during testing. Scoring is performed, as in standard PLDA, by computing a likelihood ratio between the null hypothesis that the two sides of a trial belong to the same speaker and the alternative hypothesis that the two sides belong to different speakers. The two likelihoods are computed by marginalizing over two hypotheses about the channels on both sides of a trial: that they are the same and that they are different. This way, we expect that the new model will be better at coping with same-channel versus different-channel trials than standard PLDA, since knowledge about the channel (or language, or speech style) is used during training and implicitly considered during scoring.
Linear Discriminant Generative Adversarial Networks
Sun, Zhun, Ozay, Mete, Okatani, Takayuki
We develop a novel method for training of GANs for unsupervised and class conditional generation of images, called Linear Discriminant GAN (LD-GAN). The discriminator of an LD-GAN is trained to maximize the linear separability between distributions of hidden representations of generated and targeted samples, while the generator is updated based on the decision hyper-planes computed by performing LDA over the hidden representations. LD-GAN provides a concrete metric of separation capacity for the discriminator, and we experimentally show that it is possible to stabilize the training of LD-GAN simply by calibrating the update frequencies between generators and discriminators in the unsupervised case, without employment of normalization methods and constraints on weights. In the class conditional generation tasks, the proposed method shows improved training stability together with better generalization performance compared to WGAN that employs an auxiliary classifier.
Communication-efficient Distributed Sparse Linear Discriminant Analysis
High dimensionality is a frequently encountered problem in many applications of machine learning. It increases the time and space requirements for processing the data. Moreover, many machine learning methods tend to over-fit and become less interpretable in the presence of many irrelevant or redundant features. A common way to address this problem is dimensionality reduction. Principal Component Analysis (PCA) (Jolliffe, 2002) is probably the most widely used dimensionality reduction method. However, it is an unsupervised method and does not consider the labels of the data. To take the label information into account, supervised dimensionality reduction methods are favored. Linear Discriminant Analysis (LDA) (Anderson, 1968), which was initially proposed as a classification method, is an important supervised dimensionality reduction method.
Kernel Alignment Inspired Linear Discriminant Analysis
Kernel alignment measures the degree of similarity between two kernels. In this paper, inspired by kernel alignment, we propose a new Linear Discriminant Analysis (LDA) formulation, kernel alignment LDA (kaLDA). We first define two kernels, a data kernel and a class indicator kernel. The problem is to find a subspace that maximizes the alignment between the subspace-transformed data kernel and the class indicator kernel. Surprisingly, the kernel alignment induced kaLDA objective function is very similar to classical LDA and can be expressed using between-class and total scatter matrices. This can be extended to multi-label data. We use a Stiefel-manifold gradient descent algorithm to solve this problem. We perform experiments on 8 single-label and 6 multi-label data sets. Results show that kaLDA has very good performance on many single-label and multi-label problems.
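As a rough illustration of the quantity being maximized, the sketch below computes the standard (uncentered) kernel alignment, the Frobenius inner product of two kernel matrices divided by the product of their Frobenius norms, between a linear data kernel in a projected subspace and a class indicator kernel. The exact kernel definitions in kaLDA may differ (for example in centering or normalization); the linear data kernel, the 0/1 same-class indicator kernel, and the function names are assumptions made for this example only.

```python
import numpy as np


def kernel_alignment(K1, K2):
    """Uncentered alignment: <K1, K2>_F / (||K1||_F * ||K2||_F)."""
    inner = np.sum(K1 * K2)
    # np.linalg.norm on a 2-D array defaults to the Frobenius norm.
    return inner / (np.linalg.norm(K1) * np.linalg.norm(K2))


def class_indicator_kernel(y):
    """Assumed indicator kernel: entry (i, j) is 1 if y[i] == y[j], else 0."""
    y = np.asarray(y)
    return (y[:, None] == y[None, :]).astype(float)


def subspace_alignment(X, y, G):
    """Alignment between the linear kernel of X @ G and the indicator kernel."""
    Z = np.asarray(X, dtype=float) @ np.asarray(G, dtype=float)
    return kernel_alignment(Z @ Z.T, class_indicator_kernel(y))


# Usage: two classes lying on orthogonal axes.
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
y = np.array([0, 0, 1, 1])
full = subspace_alignment(X, y, np.eye(2))                       # keeps both axes
mixed = subspace_alignment(X, y, np.ones((2, 1)) / np.sqrt(2))   # collapses the classes
```

A projection that keeps the classes apart yields a data kernel that matches the indicator kernel closely, while a projection that collapses the classes onto the same point yields a lower alignment.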
Linear Discriminant Analysis
Linear Discriminant Analysis (LDA) is most commonly used as a dimensionality reduction technique in the pre-processing step for pattern-classification and machine learning applications. The goal is to project a dataset onto a lower-dimensional space with good class-separability in order to avoid overfitting ("curse of dimensionality") and also reduce computational costs. Ronald A. Fisher formulated the Linear Discriminant in 1936 (The Use of Multiple Measurements in Taxonomic Problems), and it also has some practical uses as a classifier. The original Linear Discriminant was described for a 2-class problem, and it was later generalized as "multi-class Linear Discriminant Analysis" or "Multiple Discriminant Analysis" by C. R. Rao in 1948 (The utilization of multiple measurements in problems of biological classification). The general LDA approach is very similar to a Principal Component Analysis (for more information about the PCA, see the previous article Implementing a Principal Component Analysis (PCA) in Python step by step), but in addition to finding the component axes that maximize the variance of our data (PCA), we are also interested in the axes that maximize the separation between multiple classes (LDA). So, in a nutshell, the goal of an LDA is often to project a feature space (a dataset of n-dimensional samples) onto a smaller subspace while maintaining the class-discriminatory information.
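The projection described above can be sketched with NumPy as follows. This is a minimal illustration rather than the article's own code: it builds the within-class and between-class scatter matrices, takes the leading eigenvectors of pinv(S_W) S_B, and projects the data onto them. The function name lda_projection and the use of a pseudo-inverse (to tolerate a singular within-class scatter matrix) are choices made for this example.

```python
import numpy as np


def lda_projection(X, y, n_components):
    """Return an (n_features, n_components) LDA projection matrix."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)

    S_w = np.zeros((n_features, n_features))  # within-class scatter
    S_b = np.zeros((n_features, n_features))  # between-class scatter
    for label in np.unique(y):
        X_c = X[y == label]
        mean_c = X_c.mean(axis=0)
        centered = X_c - mean_c
        S_w += centered.T @ centered
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_b += X_c.shape[0] * (diff @ diff.T)

    # Directions that maximize between-class relative to within-class scatter.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_w) @ S_b)
    order = np.argsort(eigvals.real)[::-1]  # largest eigenvalue first
    return eigvecs[:, order[:n_components]].real


# Usage: two well-separated 2-D classes projected onto one axis.
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2))
X1 = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(20, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 20 + [1] * 20)
W = lda_projection(X, y, n_components=1)
Z = X @ W  # shape (40, 1): the samples in the discriminant subspace
```

With C classes, the between-class scatter matrix has rank at most C - 1, so at most C - 1 of the resulting directions carry discriminatory information.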
Linear Discriminant Analysis for Machine Learning
Logistic regression is a classification algorithm traditionally limited to two-class classification problems. If you have more than two classes, then Linear Discriminant Analysis is the preferred linear classification technique. In this post you will discover the Linear Discriminant Analysis (LDA) algorithm for classification predictive modeling problems. This post is intended for developers interested in applied machine learning, how the models work and how to use them well. As such, no background in statistics or linear algebra is required, although it does help if you know about the mean and variance of a distribution.