separability
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- North America > United States > Michigan (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology (1.00)
- Health & Medicine (0.92)
A Theoretical and Empirical Taxonomy of Imbalance in Binary Classification
Essomba, Rose Yvette Bandolo, Fokoué, Ernest
Class imbalance significantly degrades classification performance, yet its effects are rarely analyzed from a unified theoretical perspective. We propose a principled framework based on three fundamental scales: the imbalance coefficient $\eta$, the sample-to-dimension ratio $\kappa$, and the intrinsic separability $\Delta$. Starting from the Gaussian Bayes classifier, we derive closed-form Bayes errors and show how imbalance shifts the discriminant boundary, yielding a deterioration slope that predicts four regimes: Normal, Mild, Extreme, and Catastrophic. Using a balanced high-dimensional genomic dataset, we vary only $\eta$ while keeping $\kappa$ and $\Delta$ fixed. Across parametric and non-parametric models, empirical degradation closely follows theoretical predictions: minority Recall collapses once $\log(\eta)$ exceeds $\Delta\sqrt{\kappa}$, Precision increases asymmetrically, and F1-score and PR-AUC decline in line with the predicted regimes. These results show that the triplet $(\eta, \kappa, \Delta)$ provides a model-agnostic, geometrically grounded explanation of imbalance-induced deterioration.
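The boundary-shift mechanism described above can be made concrete in one dimension. Below is a minimal sketch, assuming two unit-variance Gaussians separated by $\Delta$ with prior ratio $\eta$: the log-odds term moves the balanced threshold from 0 to $\log(\eta)/\Delta$, which drives minority recall toward zero as $\eta$ grows. The paper's finite-sample $\sqrt{\kappa}$ correction is not modelled here; names and defaults are illustrative.

```python
# Minimal sketch of the Gaussian Bayes boundary shift under class imbalance.
# Assumptions (not the paper's exact setup): 1-D, unit variance, minority
# class centred at +delta/2, majority at -delta/2, eta = prior ratio.
from math import log, sqrt, erf

def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def minority_recall(eta: float, delta: float) -> float:
    """Bayes-optimal minority recall: the log-odds term shifts the balanced
    threshold 0 to log(eta)/delta, so recall = P(x > threshold | minority)."""
    threshold = log(eta) / delta          # imbalance-induced boundary shift
    return normal_cdf(delta / 2.0 - threshold)

if __name__ == "__main__":
    delta = 2.0                           # intrinsic separability
    for eta in (1, 10, 100, 1000):
        print(f"eta={eta:>5}: minority recall = {minority_recall(eta, delta):.3f}")
```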
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Africa > South Africa > Western Cape > Cape Town (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Few-Shot Object Detection via Association and DIscrimination
Object detection has achieved substantial progress in the last decade. However, detecting novel classes with only a few samples remains challenging, since deep learning in the low-data regime usually leads to a degraded feature space. Existing works employ a holistic fine-tuning paradigm to tackle this problem, where the model is first pre-trained on all base classes with abundant samples and then used to carve out the novel class feature space. Nonetheless, this paradigm is still imperfect. During fine-tuning, a novel class may implicitly leverage the knowledge of multiple base classes to construct its feature space, which induces a scattered feature space and hence violates inter-class separability. To overcome these obstacles, we propose a two-step fine-tuning framework, Few-shot object detection via Association and DIscrimination (FADI), which builds up a discriminative feature space for each novel class with two integral steps.
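The association idea can be pictured as aligning each novel class with its most compatible base class before the discrimination-oriented fine-tuning step. The following is a hypothetical sketch; cosine similarity between class prototypes is an assumption here, not necessarily FADI's exact association criterion.

```python
# Hypothetical sketch of an "association" step: pair each novel class with
# the base class whose feature-space prototype is most similar, so the novel
# class inherits a compact region rather than scattering across many bases.
import numpy as np

def associate_novel_to_base(novel_protos: np.ndarray,
                            base_protos: np.ndarray) -> np.ndarray:
    """Return, for each novel class prototype, the index of the closest
    base class prototype by cosine similarity.

    novel_protos: (num_novel, d), base_protos: (num_base, d)."""
    n = novel_protos / np.linalg.norm(novel_protos, axis=1, keepdims=True)
    b = base_protos / np.linalg.norm(base_protos, axis=1, keepdims=True)
    return np.argmax(n @ b.T, axis=1)   # (num_novel,) base-class indices
```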
Measuring the Measures: Discriminative Capacity of Representational Similarity Metrics Across Model Families
Wu, Jialin, Saha, Shreya, Bo, Yiqing, Khosla, Meenakshi
Representational similarity metrics are fundamental tools in neuroscience and AI, yet we lack systematic comparisons of their discriminative power across model families. We introduce a quantitative framework to evaluate representational similarity measures based on their ability to separate model families, across architectures (CNNs, Vision Transformers, Swin Transformers, ConvNeXt) and training regimes (supervised vs. self-supervised). Using three complementary separability measures (d-prime from signal detection theory, silhouette coefficients, and ROC-AUC), we systematically assess the discriminative capacity of commonly used metrics, including RSA, linear predictivity, Procrustes, and soft matching. We show that separability systematically increases as metrics impose more stringent alignment constraints. Among mapping-based approaches, soft matching achieves the highest separability, followed by Procrustes alignment and linear predictivity. Non-fitting methods such as RSA also yield strong separability across families. These results provide the first systematic comparison of similarity metrics through a separability lens, clarifying their relative sensitivity and guiding metric choice for large-scale model and brain comparisons.
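Two of the three separability measures are easy to state directly. A minimal sketch follows, assuming a similarity metric yields scalar scores for within-family and between-family model pairs; d-prime and ROC-AUC then quantify how cleanly the metric separates the two groups. Function names are illustrative, not the paper's code.

```python
# Separability of a similarity metric: compare its scores on within-family
# model pairs against between-family pairs.
import numpy as np

def dprime(within: np.ndarray, between: np.ndarray) -> float:
    """Signal-detection d': mean gap over pooled standard deviation."""
    pooled_sd = np.sqrt(0.5 * (within.var(ddof=1) + between.var(ddof=1)))
    return (within.mean() - between.mean()) / pooled_sd

def roc_auc(within: np.ndarray, between: np.ndarray) -> float:
    """Probability that a random within-family score exceeds a random
    between-family score (rank-based AUC; ties count as 0.5)."""
    grid = within[:, None] - between[None, :]   # all pairwise differences
    return (grid > 0).mean() + 0.5 * (grid == 0).mean()
```

Higher values on both statistics mean the metric assigns systematically larger similarity to models from the same family, i.e. stronger discriminative capacity.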
- North America > United States > California > San Diego County > San Diego (0.05)
- Europe > France (0.05)
Proof of Concept for Mammography Classification with Enhanced Compactness and Separability Modules
This study presents a validation and extension of a recent methodological framework for medical image classification. While an improved ConvNeXt Tiny architecture, integrating Global Average and Max Pooling fusion (GAGM), lightweight channel attention (SEVector), and a Feature Smoothing Loss (FSL), demonstrated promising results on Alzheimer MRI under CPU-friendly conditions, our work investigates its transferability to mammography classification. Using a Kaggle dataset that consolidates the INbreast, MIAS, and DDSM mammography collections, we compare a baseline CNN, ConvNeXt Tiny, and InceptionV3 backbones enriched with the GAGM and SEVector modules. Results confirm the effectiveness of GAGM and SEVector in enhancing feature discriminability and reducing false negatives, particularly for malignant cases. In our experiments, however, the Feature Smoothing Loss did not yield measurable improvements under mammography classification conditions, suggesting that its effectiveness may depend on specific architectural and computational assumptions. Beyond validation, our contribution extends the original framework through multi-metric evaluation (macro F1, per-class recall variance, ROC/AUC), feature interpretability analysis (Grad-CAM), and the development of an interactive dashboard for clinical exploration. As a perspective, we highlight the need to explore alternative approaches to improving intra-class compactness and inter-class separability, with the specific goal of enhancing the distinction between malignant and benign cases in mammography classification.
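A hedged sketch of the two modules as named above: GAGM read as a fusion of global average and global max pooled descriptors, and SEVector read as squeeze-and-excitation-style gating applied to the pooled vector. Layer sizes and the reduction factor are assumptions, not the paper's configuration.

```python
# Illustrative reading of GAGM pooling fusion and SEVector channel attention.
import torch
import torch.nn as nn

class GAGMPool(nn.Module):
    """Concatenate global average and global max pooled descriptors."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        avg = x.mean(dim=(2, 3))                # (B, C) average descriptor
        mx = x.amax(dim=(2, 3))                 # (B, C) max descriptor
        return torch.cat([avg, mx], dim=1)      # (B, 2C) fused vector

class SEVector(nn.Module):
    """SE-style gating on a pooled feature vector (reduction is assumed)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
    def forward(self, v: torch.Tensor) -> torch.Tensor:  # v: (B, C)
        return v * self.gate(v)                 # reweight channels
```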
- Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Asymptotic analysis of shallow and deep forgetting in replay with Neural Collapse
Lanzillotta, Giulia, Meier, Damiano, Hofmann, Thomas
A persistent paradox in continual learning (CL) is that neural networks often retain linearly separable representations of past tasks even when their output predictions fail. We formalize this distinction as the gap between deep feature-space and shallow classifier-level forgetting. We reveal a critical asymmetry in Experience Replay: while minimal buffers successfully anchor feature geometry and prevent deep forgetting, mitigating shallow forgetting typically requires substantially larger buffer capacities. To explain this, we extend the Neural Collapse framework to the sequential setting. We characterize deep forgetting as a geometric drift toward out-of-distribution subspaces and prove that any non-zero replay fraction asymptotically guarantees the retention of linear separability. Conversely, we identify that the "strong collapse" induced by small buffers leads to rank-deficient covariances and inflated class means, effectively blinding the classifier to true population boundaries. By unifying CL with out-of-distribution detection, our work challenges the prevailing reliance on large buffers, suggesting that explicitly correcting these statistical artifacts could unlock robust performance with minimal replay.
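The deep/shallow distinction can be probed operationally: "shallow" forgetting is the accuracy drop of the continually trained head on old-task features, while "deep" forgetting is assessed by refitting a fresh linear classifier on those same frozen features. The sketch below is one reading of the abstract, not the authors' exact protocol; names are illustrative.

```python
# Minimal probe separating shallow (classifier-level) from deep
# (feature-space) forgetting on stored old-task features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def forgetting_gap(feats: np.ndarray, labels: np.ndarray, head_predict):
    """feats: (N, D) frozen old-task features; head_predict: the current
    classifier head, a callable (N, D) -> (N,) predicted labels."""
    shallow_acc = (head_predict(feats) == labels).mean()
    probe = LogisticRegression(max_iter=1000).fit(feats, labels)
    deep_acc = probe.score(feats, labels)   # linear separability retained?
    return deep_acc - shallow_acc           # large positive gap = shallow forgetting
```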
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > Canada > Ontario > Toronto (0.04)
TRINITY: An Evolved LLM Coordinator
Xu, Jinglue, Sun, Qi, Schwendeman, Peter, Nielsen, Stefan, Cetin, Edoardo, Tang, Yujin
Combining diverse foundation models is promising, but weight-merging is limited by mismatched architectures and closed APIs. TRINITY's coordinator, comprising a compact language model (~0.6B parameters) and a lightweight head (~10K parameters), is optimized with an evolutionary strategy for efficient and adaptive delegation. Theoretical and empirical analyses highlight two key factors driving this success: (1) the coordinator's hidden-state representations provide rich contextualization of inputs, and (2) under high dimensionality and strict budget constraints, the separable Covariance Matrix Adaptation Evolution Strategy (sep-CMA-ES) algorithm provides substantial advantages over RL, imitation learning, and random search, leveraging potential block-ε-separability.

A prominent line of work involving large language models (LLMs) aspires to scale in line with empirical scaling laws, targeting gains by enlarging model size, training tokens, and compute (Kaplan et al., 2020; Hoffmann et al., 2022). Yet the extent to which such scaling remains efficient and yields sustained returns is uncertain, and it is often resource-intensive. An alternative at the micro level is model merging (Akiba et al., 2025; Wortsman et al., 2022; Yang et al., 2024; Kuroki et al., 2024), which seeks parameter-level integration. However, this approach is frequently impractical due to architectural incompatibilities and the closed-source nature of many high-performing models. In light of these limitations, we adopt a macro-level approach: test-time model composition via coordination, which fuses the complementary strengths of multiple state-of-the-art models from diverse providers without modifying their weights. Leveraging prior data and training investments, this coordination can deliver performance improvements without retraining individual models. The central challenge for such a coordinator is to acquire a rich contextual understanding of a given query in order to make an effective decision. We posit that this signal can be efficiently extracted from the internal representation of a compact language model, specifically its hidden states (Allen-Zhu & Li, 2023). In a self-attention-based transformer model, hidden states encode contextual representations of the input (and, after generation, the output) sequence.
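The separability argument can be illustrated with a toy diagonal-covariance evolution strategy: keeping one variance per coordinate makes adaptation cost O(d) rather than O(d^2), which is the core advantage sep-CMA-ES exploits in high dimensions. The sketch below is a simplified stand-in, not sep-CMA-ES itself and not the TRINITY implementation; all names and constants are illustrative.

```python
# Toy diagonal-covariance evolution strategy, in the spirit of sep-CMA-ES:
# the search distribution is factored per coordinate, so it scales to large
# parameter vectors such as a ~10K-parameter head.
import numpy as np

def diagonal_es(fitness, dim, iters=200, pop=16, lr=0.2, seed=0):
    rng = np.random.default_rng(seed)
    mean, sigma = np.zeros(dim), np.ones(dim)   # per-coordinate step sizes
    for _ in range(iters):
        z = rng.standard_normal((pop, dim))
        cand = mean + sigma * z                 # sample a population
        scores = np.array([fitness(c) for c in cand])
        elite = cand[np.argsort(scores)[-pop // 4:]]   # keep top quarter
        mean = (1 - lr) * mean + lr * elite.mean(axis=0)
        sigma = (1 - lr) * sigma + lr * elite.std(axis=0)
    return mean

# Toy usage: maximize a separable quadratic with optimum at all-ones.
w = diagonal_es(lambda v: -np.sum((v - 1.0) ** 2), dim=8)
print(np.round(w, 2))
```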
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > Michigan (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (2 more...)
- Research Report > Promising Solution (0.68)
- Research Report > New Finding (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
xEEGNet: Towards Explainable AI in EEG Dementia Classification
Zanola, Andrea, Tshimanga, Louis Fabrice, Del Pup, Federico, Baiesi, Marco, Atzori, Manfredo
This work presents xEEGNet, a novel, compact, and explainable neural network for EEG data analysis. It is fully interpretable and reduces overfitting through major parameter reduction. As an illustrative use case, we focus on classifying common dementia conditions, Alzheimer's and frontotemporal dementia, versus controls. xEEGNet is broadly applicable to other neurological conditions involving spectral alterations. We initially used ShallowNet, a simple and popular model from the EEGNet family. Its structure was analyzed and gradually modified to move from a "black box" to a more transparent model, without compromising performance. The learned kernels and weights were examined from a clinical standpoint to assess medical relevance. Model variants, including ShallowNet and the final xEEGNet, were evaluated using robust Nested-Leave-N-Subjects-Out cross-validation for unbiased performance estimates. Variability across data splits was explained using embedded EEG representations, grouped by class and set, with pairwise separability to quantify group distinction. Overfitting was assessed through training-validation loss correlation and training speed. xEEGNet uses only 168 parameters, 200 times fewer than ShallowNet, yet retains interpretability, resists overfitting, achieves comparable median performance (-1.5%), and reduces variability across splits. This variability is explained by embedded EEG representations: higher accuracy correlates with greater separation between test set controls and Alzheimer's cases, without significant influence from training data. xEEGNet's ability to filter specific EEG bands, learn band-specific topographies, and use relevant spectral features demonstrates its interpretability. While large deep learning models are often prioritized for performance, this study shows smaller architectures like xEEGNet can be equally effective in EEG pathology classification.
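One way to quantify the pairwise separability of embedded EEG representations (e.g., test-set controls vs. Alzheimer's cases) is a Fisher-style ratio of between-group to within-group spread along the axis joining the group means. The statistic used in the paper may differ; the sketch below is an illustrative assumption.

```python
# Illustrative pairwise separability between two groups of embeddings:
# project onto the mean-difference axis and take a d-prime-like ratio.
import numpy as np

def pairwise_separability(a: np.ndarray, b: np.ndarray) -> float:
    """a, b: (n, d) embedding matrices for two groups."""
    direction = a.mean(axis=0) - b.mean(axis=0)
    direction /= np.linalg.norm(direction)      # unit mean-difference axis
    pa, pb = a @ direction, b @ direction       # 1-D projections
    pooled = np.sqrt(0.5 * (pa.var(ddof=1) + pb.var(ddof=1)))
    return abs(pa.mean() - pb.mean()) / pooled  # higher = better separated
```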
- Europe > Italy (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (4 more...)