AITopics

doi: 10.1007/s00521-022-07332-z

2510.21898

Country:

North America > United States (0.28)
Europe > Spain (0.28)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Yadav, Yogesh, Yadav, Sandeep K, Vijay, Vivek, Dixit, Ambesh

Crystal Systems Classification of Phosphate-Based Cathode Materials Using Machine Learning for Lithium-Ion Battery

arXiv.org Artificial IntelligenceSep-16-2025

The physical and chemical characteristics of cathodes used in batteries are derived from the lithium-ion phosphate cathodes crystalline arrangement, which is pivotal to the overall battery performance. Therefore, the correct prediction of the crystal system is essential to estimate the properties of cathodes. This study applies machine learning classification algorithms for predicting the crystal systems, namely monoclinic, orthorhombic, and triclinic, related to Li P (Mn, Fe, Co, Ni, V) O based Phosphate cathodes. The data used in this work is extracted from the Materials Project. Feature evaluation showed that cathode properties depend on the crystal structure, and optimized classification strategies lead to better predictability. Ensemble machine learning algorithms such as Random Forest, Extremely Randomized Trees, and Gradient Boosting Machines have demonstrated the best predictive capabilities for crystal systems in the Monte Carlo cross-validation test. Additionally, sequential forward selection (SFS) is performed to identify the most critical features influencing the prediction accuracy for different machine learning models, with Volume, Band gap, and Sites as input features ensemble machine learning algorithms such as Random Forest (80.69%), Extremely Randomized Tree (78.96%), and Gradient Boosting Machine (80.40%) approaches lead to the maximum accuracy towards crystallographic classification with stability and the predicted materials can be the potential cathode materials for lithium ion batteries.

artificial intelligence, cathode material, machine learning, (14 more...)

2509.10532

Country: Asia > India (0.30)

Genre: Research Report > New Finding (0.69)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.67)

Oh, Seungyeon, Park, Seongoh, Park, Hoyoung

Nonparametric Linear Discriminant Analysis for High Dimensional Matrix-Valued Data

arXiv.org Machine LearningJul-29-2025

This paper addresses classification problems with matrix-valued data, which commonly arises in applications such as neuroimaging and signal processing. Building on the assumption that the data from each class follows a matrix normal distribution, we propose a novel extension of Fisher's Linear Discriminant Analysis (LDA) tailored for matrix-valued observations. To effectively capture structural information while maintaining estimation flexibility, we adopt a nonparametric empirical Bayes framework based on Nonparametric Maximum Likelihood Estimation (NPMLE), applied to vectorized and scaled matrices. The NPMLE method has been shown to provide robust, flexible, and accurate estimates for vector-valued data with various structures in the mean vector or covariance matrix. By leveraging its strengths, our method is effectively generalized to the matrix setting, thereby improving classification performance. Through extensive simulation studies and real data applications, including electroencephalography (EEG) and magnetic resonance imaging (MRI) analysis, we demonstrate that the proposed method consistently outperforms existing approaches across a variety of data structures.

artificial intelligence, machine learning, matrix-valued data, (14 more...)

2507.19028

Country: Asia > South Korea > Seoul > Seoul (0.05)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine (0.89)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningJul-23-2025

Structural Effect and Spectral Enhancement of High-Dimensional Regularized Linear Discriminant Analysis

Zhang, Yonghan, Pu, Zhangni, Yan, Lu, Hu, Jiang

Regularized linear discriminant analysis (RLDA) is a widely used tool for classification and dimensionality reduction, but its performance in high-dimensional scenarios is inconsistent. Existing theoretical analyses of RLDA often lack clear insight into how data structure affects classification performance. To address this issue, we derive a non-asymptotic approximation of the misclassification rate and thus analyze the structural effect and structural adjustment strategies of RLDA. Based on this, we propose the Spectral Enhanced Discriminant Analysis (SEDA) algorithm, which optimizes the data structure by adjusting the spiked eigenvalues of the population covariance matrix. By developing a new theoretical result on eigenvectors in random matrix theory, we derive an asymptotic approximation on the misclassification rate of SEDA. The bias correction algorithm and parameter selection strategy are then obtained. Experiments on synthetic and real datasets show that SEDA achieves higher classification accuracy and dimensionality reduction compared to existing LDA methods.

artificial intelligence, machine learning, misclassification rate, (18 more...)

2507.16682

Country:

Europe > Russia (0.04)
Asia > Russia (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Shen, Cencheng, Dong, Yuexiao

Linear Discriminant Analysis with Gradient Optimization on Covariance Inverse

arXiv.org Machine LearningJun-10-2025

Linear discriminant analysis (LDA) is a fundamental method in statistical pattern recognition and classification, achieving Bayes optimality under Gaussian assumptions. However, it is well-known that classical LDA may struggle in high-dimensional settings due to instability in covariance estimation. In this work, we propose LDA with gradient optimization (LDA-GO), a new approach that directly optimizes the inverse covariance matrix via gradient descent. The algorithm parametrizes the inverse covariance matrix through Cholesky factorization, incorporates a low-rank extension to reduce computational complexity, and considers a multiple-initialization strategy, including identity initialization and warm-starting from the classical LDA estimates. The effectiveness of LDA-GO is demonstrated through extensive multivariate simulations and real-data experiments.

artificial intelligence, covariance matrix, machine learning, (17 more...)

2506.06845

Country:

North America > United States > Wisconsin (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Indiana > Hamilton County > Fishers (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Discriminant Analysis (0.61)

Surineela, Sai Vijay Kumar, Kanakamalla, Prathyusha, Harikumar, Harigovind, Ghosh, Tomojit

A Convex formulation for linear discriminant analysis

arXiv.org Artificial IntelligenceMar-17-2025

The recent surge in multisource data collection has drastically increased data dimensionality, particularly in omics analysis, where gene expression data from microarrays or nextgeneration sequencing can exceed 50,000 measurements [32]. High-dimensional datasets often contain noisy, redundant, missing, or irrelevant features, which can degrade the performance of pattern recognition tasks [28]. The acquisition of such high-dimensional datasets necessitates innovative techniques that can effectively handle large-scale data while remaining robust to noise [30]. DR is widely applied as an essential step to extract meaningful features enabling more effective data visualization, feature extraction, and improved downstream predictive performance [12, 24]. With the advent of deep neural networks (DNNs) such as large language models (LLMs), convolutional neural networks (CNNs), and transformers, DR techniques may seem less prominent. However, despite the success of these complex architectures, linear dimensionality reduction remains a powerful and practical approach due to its interpretability, computational efficiency, and robustness in high-dimensional, low-sample-size (HDLSS) regimes [28]. Deep learning models excel at learning hierarchical representations but pose significant challenges. They require large amounts of labeled data, extensive hyper-parameter tuning, and substantial computational resources. Additionally, these models often function as black boxes, offering little interpretability of their decision-making processes [26].

artificial intelligence, dimension, machine learning, (16 more...)

2503.13623

Country:

North America > United States > Ohio > Montgomery County > Dayton (0.04)
North America > United States > New York (0.04)
North America > United States > New Mexico (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Reza, Md Shihab, Mahmud, Monirul Islam, Abeer, Ifti Azad, Ahmed, Nova

Linear Discriminant Analysis in Credit Scoring: A Transparent Hybrid Model Approach

arXiv.org Artificial IntelligenceDec-5-2024

The development of computing has made credit scoring approaches possible, with various machine learning (ML) and deep learning (DL) techniques becoming more and more valuable. While complex models yield more accurate predictions, their interpretability is often weakened, which is a concern for credit scoring that places importance on decision fairness. As features of the dataset are a crucial factor for the credit scoring system, we implement Linear Discriminant Analysis (LDA) as a feature reduction technique, which reduces the burden of the models complexity. We compared 6 different machine learning models, 1 deep learning model, and a hybrid model with and without using LDA. From the result, we have found our hybrid model, XG-DNN, outperformed other models with the highest accuracy of 99.45% and a 99% F1 score with LDA. Lastly, to interpret model decisions, we have applied 2 different explainable AI techniques named LIME (local) and Morris Sensitivity Analysis (global). Through this research, we showed how feature reduction techniques can be used without affecting the performance and explainability of the model, which can be very useful in resource-constrained settings to optimize the computational workload.

accuracy, lda, reduction technique, (14 more...)

2412.04183

Country:

North America > United States > New York (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Banking & Finance > Credit (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceDec-3-2024

Deep Learning, Machine Learning, Advancing Big Data Analytics and Management

Hsieh, Weiche, Bi, Ziqian, Chen, Keyu, Peng, Benji, Zhang, Sen, Xu, Jiawei, Wang, Jinlang, Yin, Caitlyn Heqi, Zhang, Yichao, Feng, Pohsun, Wen, Yizhu, Wang, Tianyang, Li, Ming, Liang, Chia Xin, Ren, Jintao, Niu, Qian, Chen, Silin, Yan, Lawrence K. Q., Xu, Han, Tseng, Hong-Ming, Song, Xinyuan, Jing, Bowen, Yang, Junjie, Song, Junhao, Liu, Junyu, Liu, Ming

Advancements in artificial intelligence, machine learning, and deep learning have catalyzed the transformation of big data analytics and management into pivotal domains for research and application. This work explores the theoretical foundations, methodological advancements, and practical implementations of these technologies, emphasizing their role in uncovering actionable insights from massive, high-dimensional datasets. The study presents a systematic overview of data preprocessing techniques, including data cleaning, normalization, integration, and dimensionality reduction, to prepare raw data for analysis. Core analytics methodologies such as classification, clustering, regression, and anomaly detection are examined, with a focus on algorithmic innovation and scalability. Furthermore, the text delves into state-of-the-art frameworks for data mining and predictive modeling, highlighting the role of neural networks, support vector machines, and ensemble methods in tackling complex analytical challenges. Special emphasis is placed on the convergence of big data with distributed computing paradigms, including cloud and edge computing, to address challenges in storage, computation, and real-time analytics. The integration of ethical considerations, including data privacy and compliance with global standards, ensures a holistic perspective on data management. Practical applications across healthcare, finance, marketing, and policy-making illustrate the real-world impact of these technologies. Through comprehensive case studies and Python-based implementations, this work equips researchers, practitioners, and data enthusiasts with the tools to navigate the complexities of modern data analytics. It bridges the gap between theory and practice, fostering the development of innovative solutions for managing and leveraging data in the era of artificial intelligence.

data mining, information retrieval, machine learning, (25 more...)

2412.02187

Country:

Europe (0.67)
Asia > China (0.45)
North America > United States > Wisconsin (0.14)
(2 more...)

Genre:

Workflow (1.00)
Overview (1.00)
Research Report > Experimental Study (0.67)

Industry:

Transportation (1.00)
Leisure & Entertainment (1.00)
Information Technology > Services (1.00)
(10 more...)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
(4 more...)

arXiv.org Machine LearningJun-30-2024

Weighted Missing Linear Discriminant Analysis: An Explainable Approach for Classification with Missing Data

Vo, Tuan L., Dang, Uyen, Nguyen, Thu

As Artificial Intelligence (AI) models are gradually being adopted in real-life applications, the explainability of the model used is critical, especially in high-stakes areas such as medicine, finance, etc. Among the commonly used models, Linear Discriminant Analysis (LDA) is a widely used classification tool that is also explainable thanks to its ability to model class distributions and maximize class separation through linear feature combinations. Nevertheless, real-world data is frequently incomplete, presenting significant challenges for classification tasks and model explanations. In this paper, we propose a novel approach to LDA under missing data, termed \textbf{\textit{Weighted missing Linear Discriminant Analysis (WLDA)}}, to directly classify observations in data that contains missing values without imputation effectively by estimating the parameters directly on missing data and use a weight matrix for missing values to penalize missing entries during classification. Furthermore, we also analyze the theoretical properties and examine the explainability of the proposed technique in a comprehensive manner. Experimental results demonstrate that WLDA outperforms conventional methods by a significant margin, particularly in scenarios where missing values are present in both training and test sets.

dataset, decision boundary, matrix, (14 more...)

2407.0071

Country:

Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > California > Orange County > Irvine (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Discriminant Analysis (0.81)

arXiv.org Artificial IntelligenceJun-4-2024

Fast and Scalable Multi-Kernel Encoder Classifier

Shen, Cencheng

This paper introduces a new kernel-based classifier by viewing kernel matrices as generalized graphs and leveraging recent progress in graph embedding techniques. The proposed method facilitates fast and scalable kernel matrix embedding, and seamlessly integrates multiple kernels to enhance the learning process. Our theoretical analysis offers a population-level characterization of this approach using random variables. Empirically, our method demonstrates superior running time compared to standard approaches such as support vector machines and two-layer neural network, while achieving comparable classification accuracy across various simulated and real datasets.

classifier, encoder, matrix, (13 more...)

2406.02189

Country:

North America > United States > Delaware > New Castle County > Newark (0.14)
North America > United States > New York (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.56)