 Discriminant Analysis


Regularized Linear Discriminant Analysis Using a Nonlinear Covariance Matrix Estimator

arXiv.org Artificial Intelligence

Linear discriminant analysis (LDA) is a widely used technique for data classification. The method offers adequate performance in many classification problems, but it becomes inefficient when the data covariance matrix is ill-conditioned. This often occurs when the feature space's dimensionality is higher than or comparable to the training data size. Regularized LDA (RLDA) methods based on regularized linear estimators of the data covariance matrix have been proposed to cope with such a situation. The performance of RLDA methods is well studied, with optimal regularization schemes already proposed. In this paper, we investigate the capability of a positive semidefinite ridge-type estimator of the inverse covariance matrix that coincides with a nonlinear (NL) covariance matrix estimator. The estimator is derived by reformulating the score function of the optimal classifier utilizing linear estimation methods, which eventually results in the proposed NL-RLDA classifier. We derive asymptotic and consistent estimators of the proposed technique's misclassification rate under the assumptions of a double-asymptotic regime and multivariate Gaussian model for the classes. The consistent estimator, coupled with a one-dimensional grid search, is used to set the value of the regularization parameter required for the proposed NL-RLDA classifier. Performance evaluations based on both synthetic and real data demonstrate the effectiveness of the proposed classifier. The proposed technique outperforms state-of-the-art methods across multiple datasets.
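
To make the ill-conditioning issue concrete, the following is a minimal sketch of the conventional ridge-regularized LDA baseline that the abstract refers to as RLDA, not the proposed NL-RLDA estimator. The function names, the synthetic Gaussian data, and the fixed value `gamma=1.0` are assumptions made for illustration; the abstract selects the regularization parameter with a consistent error estimator and a grid search, which is not reproduced here.

```python
import numpy as np

def fit_rlda(X0, X1, gamma):
    """Fit ridge-regularized LDA; returns (w, b) with score(x) = w @ x + b."""
    n0, n1 = len(X0), len(X1)
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled sample covariance; singular whenever p > n0 + n1 - 2.
    S = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (n0 + n1 - 2)
    p = S.shape[0]
    # Ridge-type inverse covariance H = (S + gamma * I)^{-1}, applied to the
    # mean difference by solving a linear system instead of forming H.
    w = np.linalg.solve(S + gamma * np.eye(p), m0 - m1)
    b = -0.5 * w @ (m0 + m1) + np.log(n0 / n1)
    return w, b

def predict_rlda(w, b, X):
    # Positive score -> class 0, otherwise class 1.
    return np.where(X @ w + b > 0, 0, 1)

rng = np.random.default_rng(0)
p, n = 50, 20                       # more features than pooled degrees of freedom
mu = np.full(p, 0.5)                # assumed class means: +mu and -mu
X0 = rng.normal(size=(n, p)) + mu
X1 = rng.normal(size=(n, p)) - mu
w, b = fit_rlda(X0, X1, gamma=1.0)  # gamma chosen arbitrarily for this sketch

X_test = np.vstack([rng.normal(size=(200, p)) + mu,
                    rng.normal(size=(200, p)) - mu])
y_test = np.repeat([0, 1], 200)
accuracy = np.mean(predict_rlda(w, b, X_test) == y_test)
```

Without the `gamma * np.eye(p)` term the linear system here would be singular, which is the failure mode that motivates regularized estimators in the first place.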


Minimally Informed Linear Discriminant Analysis: training an LDA model with unlabelled data

arXiv.org Machine Learning

Linear Discriminant Analysis (LDA) is one of the oldest and most popular linear methods for supervised classification problems. In this paper, we demonstrate that it is possible to compute the exact projection vector from LDA models based on unlabelled data, if some minimal prior information is available. More precisely, we show that only one of the following three pieces of information is actually sufficient to compute the LDA projection vector if only unlabelled data are available: (1) the class average of one of the two classes, (2) the difference between both class averages (up to a scaling), or (3) the class covariance matrices (up to a scaling). These theoretical results are validated in numerical experiments, demonstrating that this minimally informed Linear Discriminant Analysis (MILDA) model closely matches the performance of a supervised LDA model. Furthermore, we show that the MILDA projection vector can be computed in a closed form with a computational cost comparable to LDA and is able to quickly adapt to non-stationary data, making it well-suited to use as an adaptive classifier.
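
One way to see why option (2) can suffice is the Sherman-Morrison identity: the total covariance of the unlabelled pool equals the within-class covariance plus a rank-one term along the mean difference, so applying its inverse to the mean difference gives a vector parallel to the supervised LDA direction. The sketch below checks this numerically on synthetic data. It is an illustration under that assumption and may differ from how the authors derive or compute the MILDA vector; all variable names and the scaling factor 3.7 are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n0, n1 = 5, 300, 200
A = rng.normal(size=(p, p))              # mixes features so the covariance is not spherical
shift = rng.normal(size=p)
X0 = rng.normal(size=(n0, p)) @ A
X1 = rng.normal(size=(n1, p)) @ A + shift
X = np.vstack([X0, X1])                  # unlabelled pool

# Supervised reference: w_lda = Sw^{-1} (m1 - m0), which needs the labels.
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
Sw = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (n0 + n1)
w_lda = np.linalg.solve(Sw, m1 - m0)

# Minimally informed variant: only the mean difference, up to an unknown
# scaling, is assumed to be available alongside the unlabelled data.
d = 3.7 * (m1 - m0)
St = np.cov(X, rowvar=False, bias=True)  # total covariance, no labels used
w_unlabelled = np.linalg.solve(St, d)

cosine = w_unlabelled @ w_lda / (np.linalg.norm(w_unlabelled) * np.linalg.norm(w_lda))
```

Here `cosine` comes out equal to 1 up to floating-point error, because `St = Sw + (n0 * n1 / n**2) * outer(m1 - m0, m1 - m0)` holds exactly for these sample estimates.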


Pivotal Estimation of Linear Discriminant Analysis in High Dimensions

arXiv.org Machine Learning

We consider the linear discriminant analysis problem in high-dimensional settings. In this work, we propose PANDA (PivotAl liNear Discriminant Analysis), a tuning-insensitive method in the sense that it requires very little effort to tune the parameters. Moreover, we prove that PANDA achieves the optimal convergence rate in terms of both the estimation error and misclassification rate. Our theoretical results are backed up by thorough numerical studies using both simulated and real datasets. In comparison with existing methods, we observe that PANDA yields equal or better performance and requires substantially less effort in parameter tuning.


WDiscOOD: Out-of-Distribution Detection via Whitened Linear Discriminant Analysis

arXiv.org Artificial Intelligence

Deep neural networks are susceptible to generating overconfident yet erroneous predictions when presented with data beyond known concepts. This challenge underscores the importance of detecting out-of-distribution (OOD) samples in the open world. In this work, we propose a novel feature-space OOD detection score based on class-specific and class-agnostic information. Specifically, the approach utilizes Whitened Linear Discriminant Analysis to project features into two subspaces - the discriminative and residual subspaces - for which the in-distribution (ID) classes are maximally separated and closely clustered, respectively. The OOD score is then determined by combining the deviation from the input data to the ID pattern in both subspaces. The efficacy of our method, named WDiscOOD, is verified on the large-scale ImageNet-1k benchmark, with six OOD datasets that cover a variety of distribution shifts. WDiscOOD demonstrates superior performance on deep classifiers with diverse backbone architectures, including CNN and vision transformer. Furthermore, we also show that WDiscOOD more effectively detects novel concepts in representation spaces trained with contrastive objectives, including supervised contrastive loss and multi-modality contrastive loss.


XLDA: Linear Discriminant Analysis for Scaling Continual Learning to Extreme Classification at the Edge

arXiv.org Artificial Intelligence

Streaming Linear Discriminant Analysis (LDA), while proven in class-incremental learning (Class-IL) deployments at the edge with a limited number of classes (up to 1000), has not been proven for deployment in extreme classification scenarios. In this paper, we present: (a) XLDA, a framework for Class-IL in edge deployment where the LDA classifier is proven to be equivalent to an FC layer, including in extreme classification scenarios, and (b) optimizations to enable XLDA-based training and inference for edge deployment where there is a constraint on available compute resources. We show up to 42x speedup using a batched training approach and up to 5x inference speedup with nearest neighbor search on extreme datasets like AliProducts (50k classes) and Google Landmarks V2 (81k classes).
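
As a rough illustration of what streaming LDA involves, the sketch below keeps per-class running means and a single shared within-class scatter matrix, updates both one sample at a time, and exposes the classifier as a weight matrix and bias in the form of a fully connected layer. This is a generic streaming LDA sketch, not the XLDA framework or its batched and nearest-neighbor optimizations; the class name, the `shrinkage` parameter, and the synthetic data are assumptions.

```python
import numpy as np

class StreamingLDA:
    """Per-class running means plus one shared covariance, updated per sample."""

    def __init__(self, dim, num_classes, shrinkage=1e-4):
        self.means = np.zeros((num_classes, dim))
        self.counts = np.zeros(num_classes, dtype=int)
        self.scatter = np.zeros((dim, dim))    # running within-class scatter
        self.shrinkage = shrinkage             # hypothetical ridge term for invertibility

    def fit_one(self, x, y):
        self.counts[y] += 1
        delta = x - self.means[y]
        self.means[y] += delta / self.counts[y]
        # Welford-style update: old deviation times new deviation.
        self.scatter += np.outer(delta, x - self.means[y])

    def as_linear_layer(self):
        """Return (W, b) so that logits = X @ W.T + b, like an FC layer."""
        dof = max(self.counts.sum() - np.count_nonzero(self.counts), 1)
        cov = self.scatter / dof + self.shrinkage * np.eye(self.scatter.shape[0])
        W = np.linalg.solve(cov, self.means.T).T
        b = -0.5 * np.sum(W * self.means, axis=1)
        return W, b

    def predict(self, X):
        W, b = self.as_linear_layer()
        return np.argmax(X @ W.T + b, axis=1)

rng = np.random.default_rng(2)
dim, num_classes = 8, 4
centers = 4.0 * rng.normal(size=(num_classes, dim))   # assumed class centres
y = rng.integers(0, num_classes, size=400)
X = centers[y] + rng.normal(size=(400, dim))

model = StreamingLDA(dim, num_classes)
for xi, yi in zip(X, y):
    model.fit_one(xi, yi)
train_accuracy = np.mean(model.predict(X) == y)
```

Because each update touches only one class mean and one rank-one term, adding a new class only requires a new row in `means`, which is what makes this family of classifiers attractive for class-incremental settings.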


A Network of Localized Linear Discriminants

Neural Information Processing Systems

The localized linear discriminant network (LLDN) has been designed to address classification problems containing relatively closely spaced data from different classes (encounter zones [1], the accuracy problem [2]). Locally trained hyperplane segments are an effective way to define the decision boundaries for these regions [3]. The LLD uses a modified perceptron training algorithm for effective discovery of separating hyperplane/sigmoid units within narrow boundaries. The basic unit of the network is the discriminant receptive field (DRF) which combines the LLD function with Gaussians representing the dispersion of the local training data with respect to the hyperplane. The DRF implements a local distance measure [4], and obtains the benefits of networks of localized units [5].


Two-Dimensional Linear Discriminant Analysis

Neural Information Processing Systems

Linear Discriminant Analysis (LDA) is a well-known scheme for feature extraction and dimension reduction. It has been used widely in many applications involving high-dimensional data, such as face recognition and image retrieval. An intrinsic limitation of classical LDA is the so-called singularity problem, that is, it fails when all scatter matrices are singular. A well-known approach to deal with the singularity problem is to apply an intermediate dimension reduction stage using Principal Component Analysis (PCA) before LDA. The algorithm, called PCA+LDA, is used widely in face recognition.
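
The PCA-then-LDA pipeline described above can be sketched for the two-class case as follows. This illustrates the baseline the abstract mentions, not the two-dimensional LDA method of the paper. The choice of n - c retained components (n samples, c classes), the function names, and the synthetic data are assumptions for illustration.

```python
import numpy as np

def fit_pca_lda(X, y):
    """Two-class PCA+LDA sketch: PCA to n - c dimensions, then Fisher's direction."""
    n, _ = X.shape
    classes = np.unique(y)
    k = n - len(classes)                       # assumed target dimension (n - c)
    mean = X.mean(axis=0)
    # PCA via SVD of the centred data; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    P = Vt[:k].T                               # p x k projection matrix
    Z = (X - mean) @ P
    class_means = [Z[y == c].mean(axis=0) for c in classes]
    Sw = np.zeros((k, k))                      # within-class scatter in the PCA space
    for c, m in zip(classes, class_means):
        D = Z[y == c] - m
        Sw += D.T @ D
    w = np.linalg.solve(Sw, class_means[1] - class_means[0])
    threshold = 0.5 * w @ (class_means[0] + class_means[1])
    return {"mean": mean, "P": P, "w": w, "threshold": threshold,
            "Sw": Sw, "classes": classes}

def predict_pca_lda(model, X):
    scores = (X - model["mean"]) @ model["P"] @ model["w"]
    return np.where(scores > model["threshold"],
                    model["classes"][1], model["classes"][0])

rng = np.random.default_rng(3)
n_per_class, p = 10, 100                       # far more features than samples
X = np.vstack([rng.normal(size=(n_per_class, p)) - 0.5,
               rng.normal(size=(n_per_class, p)) + 0.5])
y = np.repeat([0, 1], n_per_class)

model = fit_pca_lda(X, y)
train_accuracy = np.mean(predict_pca_lda(model, X) == y)
```

In the original 100-dimensional space the within-class scatter has rank at most 18 and cannot be inverted; after the PCA stage it is an 18 x 18 matrix that is generically full rank, so the Fisher direction is well defined.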


Worst-Case Linear Discriminant Analysis

Neural Information Processing Systems

Dimensionality reduction is often needed in many applications due to the high dimensionality of the data involved. In this paper, we first analyze the scatter measures used in the conventional linear discriminant analysis (LDA) model and note that the formulation is based on the average-case view. Based on this analysis, we then propose a new dimensionality reduction method called worst-case linear discriminant analysis (WLDA) by defining new between-class and within-class scatter measures. This new model adopts the worst-case view which arguably is more suitable for applications such as classification. When the number of training data points or the number of features is not very large, we relax the optimization problem involved and formulate it as a metric learning problem.


Deep Linear Discriminant Analysis with Variation for Polycystic Ovary Syndrome Classification

arXiv.org Artificial Intelligence

Diagnosis of polycystic ovary syndrome (PCOS) is a problem that can be addressed using prognostication-based learning procedures. Many machine learning implementations for PCOS exist, but the algorithms have certain limitations in utilizing the processing power of graphical processing units. Simple machine learning algorithms can be improved with advanced frameworks using deep learning. Linear Discriminant Analysis is a linear dimensionality reduction algorithm for classification whose performance can be boosted using deep learning with Deep LDA, a transformed version of traditional LDA. In this result-oriented paper, we present a Deep LDA implementation with a variation for prognostication of PCOS.


Approximately optimal domain adaptation with Fisher's Linear Discriminant Analysis

arXiv.org Artificial Intelligence

We propose a class of models based on Fisher's Linear Discriminant (FLD) for domain adaptation. The class entails a convex combination of two hypotheses: i) an average hypothesis representing previously encountered source tasks and ii) a hypothesis trained on a new target task. For a particular generative setting, we derive the expected risk of this combined hypothesis with respect to the target distribution and propose a computable approximation. This is then leveraged to estimate an optimal convex coefficient that exploits the bias-variance trade-off between source and target information to arrive at an optimal classifier for the target task. We study the effect of various generative parameter settings on the relative risks between the optimal hypothesis, hypothesis i), and hypothesis ii). Furthermore, we demonstrate the effectiveness of the proposed optimal classifier in several EEG- and ECG-based classification problems and argue that the optimal classifier can be computed without access to direct information from any of the individual source tasks, leading to the preservation of privacy. We conclude by discussing further applications, limitations, and potential future directions. In problems with limited context-specific labeled data, machine learning models often fail to generalize well. These approaches are either ineffective or unavailable for problems where the input signals are highly variable across contexts or where a single model does not have access to a sufficient amount of data due to privacy or resource constraints (Mühlhoff, 2021). Note that the terms "context" and "task" can be used interchangeably here.
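
The convex combination of a source-average hypothesis and a target hypothesis described in this abstract can be sketched as below. Several things here are assumptions made for the demo rather than details from the paper: the two classes are centred at -nu and +nu with identity covariance, each hypothesis is a unit-norm Fisher direction with decision rule `x @ w > 0`, and the coefficient `alpha` is picked by error on a large evaluation sample. That last step is only a stand-in for the analytic risk approximation the abstract mentions, which is not reproduced.

```python
import numpy as np

def fld_direction(X, y):
    """Unit-norm Fisher direction for labels y in {0, 1}."""
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    Sw = np.cov(X[y == 0], rowvar=False) + np.cov(X[y == 1], rowvar=False)
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

def combine(w_target, w_source_avg, alpha):
    # Convex combination of the two hypotheses; alpha = 1 ignores the sources.
    return alpha * w_target + (1.0 - alpha) * w_source_avg

def sample_task(rng, nu, n):
    # Hypothetical generative setting: classes centred at -nu and +nu.
    y = np.arange(n) % 2
    X = rng.normal(size=(n, len(nu))) + np.where(y[:, None] == 1, nu, -nu)
    return X, y

rng = np.random.default_rng(4)
dim = 3
nu_target = np.array([1.0, 0.5, 0.0])

# Source tasks: class means perturbed around the target's (a demo assumption).
source_dirs = []
for _ in range(5):
    nu_source = nu_target + 0.3 * rng.normal(size=dim)
    source_dirs.append(fld_direction(*sample_task(rng, nu_source, 200)))
w_source_avg = np.mean(source_dirs, axis=0)
w_source_avg /= np.linalg.norm(w_source_avg)

X_small, y_small = sample_task(rng, nu_target, 10)   # scarce target data
w_target = fld_direction(X_small, y_small)

X_eval, y_eval = sample_task(rng, nu_target, 5000)
alphas = np.linspace(0.0, 1.0, 11)
errors = [np.mean((X_eval @ combine(w_target, w_source_avg, a) > 0) != (y_eval == 1))
          for a in alphas]
best_alpha = alphas[int(np.argmin(errors))]
```

Only the averaged direction `w_source_avg` is needed from the source side, which is consistent with the abstract's point that the combined classifier does not require access to any individual source task.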