AITopics

Industry: Information Technology > Services (0.84)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Neural Information Processing SystemsSep-30-2025, 11:48:17 GMT

Faster Ridge Regression via the Subsampled Randomized Hadamard Transform

We propose a fast algorithm for ridge regression when the number of features is much larger than the number of observations ($p \gg n$). The standard way to solve ridge regression in this setting works in the dual space and gives a running time of $O(n^2p)$. Our algorithm (SRHT-DRR) runs in time $O(np\log(n))$ and works by preconditioning the design matrix by a Randomized Walsh-Hadamard Transform with a subsequent subsampling of features. We provide risk bounds for our SRHT-DRR algorithm in the fixed design setting and show experimental results on synthetic and real datasets.

artificial intelligence, faster ridge regression, machine learning, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Neural Information Processing SystemsSep-30-2025, 09:58:05 GMT

Consistent Binary Classification with Generalized Performance Metrics

Performance metrics for binary classification are designed to capture tradeoffs between four fundamental population quantities: true positives, false positives, true negatives and false negatives. Despite significant interest from theoretical and applied communities, little is known about either optimal classifiers or consistent algorithms for optimizing binary classification performance metrics beyond a few special cases. We consider a fairly large family of performance metrics given by ratios of linear combinations of the four fundamental population quantities. This family includes many well known binary classification metrics such as classification accuracy, AM measure, F-measure and the Jaccard similarity coefficient as special cases. Our analysis identifies the optimal classifiers as the sign of the thresholded conditional probability of the positive class, with a performance metric-dependent threshold.

algorithm, consistent binary classification, generalized performance metric, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Neural Information Processing SystemsSep-30-2025, 09:21:47 GMT

Optimizing F-Measures by Cost-Sensitive Classification

We present a theoretical analysis of F-measures for binary, multiclass and multilabel classification. These performance measures are non-linear, but in many scenarios they are pseudo-linear functions of the per-class false negative/false positive rate. Based on this observation, we present a general reduction of F-measure maximization to cost-sensitive classification with unknown costs. We then propose an algorithm with provable guarantees to obtain an approximately optimal classifier for the F-measure by solving a series of cost-sensitive classification problems. The strength of our analysis is to be valid on any dataset and any class of classifiers, extending the existing theoretical results on F-measures, which are asymptotic in nature. We present numerical experiments to illustrate the relative importance of cost asymmetry and thresholding when learning linear classifiers on various F-measure optimization tasks.

cost-sensitive classification, name change, optimizing f-measure, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Neural Information Processing SystemsSep-30-2025, 08:56:31 GMT

On the Statistical Consistency of Plug-in Classifiers for Non-decomposable Performance Measures

We study consistency properties of algorithms for non-decomposable performance measures that cannot be expressed as a sum of losses on individual data points, such as the F-measure used in text retrieval and several other performance measures used in class imbalanced settings. While there has been much work on designing algorithms for such performance measures, there is limited understanding of the theoretical properties of these algorithms. Recently, Ye et al. (2012) showed consistency results for two algorithms that optimize the F-measure, but their results apply only to an idealized setting, where precise knowledge of the underlying probability distribution (in the form of the estimate' of the class probability, and provide a general methodology to show consistency of these methods for any non-decomposable measure that can be expressed as a continuous function of true positive rate (TPR) and true negative rate (TNR), and for which the Bayes optimal classifier is the class probability function thresholded suitably. We use this template to derive consistency results for plug-in algorithms for the F-measure and for the geometric mean of TPR and precision; to our knowledge, these are the first such results for these measures. In addition, for continuous distributions, we show consistency of plug-in algorithms for any performance measure that is a continuous and monotonically increasing function of TPR and TNR. Experimental results confirm our theoretical findings.

algorithm, plug-in classifier, statistical consistency, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Artificial IntelligenceSep-30-2025

Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

Chen, Zhisheng, Zhang, Yingwei, Lan, Qizhen, Liu, Tianyu, Wang, Huacan, Ding, Yi, Jia, Ziyu, Chen, Ronghao, Wang, Kun, Zhou, Xinliang

Foundation models pretrained on various and unlabeled data have demonstrated significant success in natural language and vision, but their application to electroencephalography (EEG) remains challenged due to the signal's unique properties. Existing brain foundation models that inherit architectures designed for text or images lead to three limitations in pre-training: 1) conflating time-domain waveform patterns with frequency-domain rhythmic features in a single processing stream, 2) ignoring the critical spatial topology of electrodes with different standards, and 3) reliance on the inflexible, dense network to process functionally distinct EEG patterns. To address these challenges, we introduce the Unified Neural Topological Foundation Model (Uni-NTFM), which is designed based on neuroscience principles to produce universal and interpretable representations. Uni-NTFM integrates three core innovations: 1) a decoupled architecture parallelly encodes time, frequency, and raw signal representations before performing cross-domain feature integration; 2) a topological embedding mechanism to unify electrodes from different international standards and generate structured input sequences for brain regions; and 3) a Mixture-of-Experts neural Transformer that efficiently scales model capacity by routing signal patterns to specialized subnetworks. The largest model, Uni-NTFM$_{large}$, has a record-breaking 1.9B parameters and was pretrained on over 28,000 hours of diverse EEG data via a dual-domain masked reconstruction objective. Uni-NTFM significantly outperforms existing task-specific methods and foundation models across nine distinct downstream tasks under both linear probing and fine-tuning settings, demonstrating a superior ability to learn universal representations of brain activity.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.24222

Country:

Europe > Austria > Styria > Graz (0.04)
North America > United States > Texas (0.04)
North America > United States > Alabama (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Millard, Louise AC, Flach, Peter A

Evaluating classification performance across operating contexts: A comparison of decision curve analysis and cost curves

Classification models typically predict a score and use a decision threshold to produce a classification. Appropriate model evaluation should carefully consider the context in which a model will be used, including the relative value of correct classifications of positive versus negative examples, which affects the threshold that should be used. Decision curve analysis (DCA) and cost curves are model evaluation approaches that assess the expected utility and expected loss of prediction models, respectively, across decision thresholds. We compared DCA and cost curves to determine how they are related, and their strengths and limitations. We demonstrate that decision curves are closely related to a specific type of cost curve called a Brier curve. Both curves are derived assuming model scores are calibrated and setting the classification threshold using the relative value of correct positive and negative classifications, and the x-axis of both curves are equivalent. Net benefit (used for DCA) and Brier loss (used for Brier curves) will always choose the same model as optimal at any given threshold. Across thresholds, differences in Brier loss are comparable whereas differences in net benefit cannot be compared. Brier curves are more generally applicable (when a wider range of thresholds are plausible), and the area under the Brier curve is the Brier score. We demonstrate that reference lines common in each space can be included in either and suggest the upper envelope decision curve as a useful comparison for DCA showing the possible gain in net benefit that could be achieved through recalibration alone.

brier curve, cost curve, threshold, (15 more...)

2509.24608

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > United Kingdom > England > Bristol (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Preference-Based Dynamic Ranking Structure Recognition

Lu, Nan, Shi, Jian, Tian, Xin-Yu

Preference-based data often appear complex and noisy but may conceal underlying homogeneous structures. This paper introduces a novel framework of ranking structure recognition for preference-based data. We first develop an approach to identify dynamic ranking groups by incorporating temporal penalties into a spectral estimation for the celebrated Bradley-Terry model. To detect structural changes, we introduce an innovative objective function and present a practicable algorithm based on dynamic programming. Theoretically, we establish the consistency of ranking group recognition by exploiting properties of a random `design matrix' induced by a reversible Markov chain. We also tailor a group inverse technique to quantify the uncertainty in item ability estimates. Additionally, we prove the consistency of structure change recognition, ensuring the robustness of the proposed framework. Experiments on both synthetic and real-world datasets demonstrate the practical utility and interpretability of our approach.

assumption 3, change point, theorem 3, (15 more...)

2509.24493

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Sports (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Mhaskar, H. N., O'Dowd, Ryan

A signal separation view of classification

The problem of classification in machine learning has often been approached in terms of function approximation. In this paper, we propose an alternative approach for classification in arbitrary compact metric spaces which, in theory, yields both the number of classes, and a perfect classification using a minimal number of queried labels. Our approach uses localized trigonometric polynomial kernels initially developed for the point source signal separation problem in signal processing. Rather than point sources, we argue that the various classes come from different probability distributions. The localized kernel technique developed for separating point sources is then shown to separate the supports of these distributions. This is done in a hierarchical manner in our MASC algorithm to accommodate touching/overlapping class boundaries. We illustrate our theory on several simulated and real life datasets, including the Salinas and Indian Pines hyperspectral datasets and a document dataset.

algorithm, estimation, masc, (14 more...)

2509.2414

Country:

North America > United States > California > Los Angeles County > Claremont (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Cai, Zhongteng, Khalili, Mohammad Mahdi, Zhang, Xueru

Demographic-Agnostic Fairness without Harm

As machine learning (ML) algorithms are increasingly used in social domains to make predictions about humans, there is a growing concern that these algorithms may exhibit biases against certain social groups. Numerous notions of fairness have been proposed in the literature to measure the unfairness of ML. Among them, one class that receives the most attention is \textit{parity-based}, i.e., achieving fairness by equalizing treatment or outcomes for different social groups. However, achieving parity-based fairness often comes at the cost of lowering model accuracy and is undesirable for many high-stakes domains like healthcare. To avoid inferior accuracy, a line of research focuses on \textit{preference-based} fairness, under which any group of individuals would experience the highest accuracy and collectively prefer the ML outcomes assigned to them if they were given the choice between various sets of outcomes. However, these works assume individual demographic information is known and fully accessible during training. In this paper, we relax this requirement and propose a novel \textit{demographic-agnostic fairness without harm (DAFH)} optimization algorithm, which jointly learns a group classifier that partitions the population into multiple groups and a set of decoupled classifiers associated with these groups. Theoretically, we conduct sample complexity analysis and show that our method can outperform the baselines when demographic information is known and used to train decoupled classifiers. Experiments on both synthetic and real data validate the proposed method.

classifier, decoupled classifier, fairness, (16 more...)

2509.24077

Country:

North America > United States > Ohio > Franklin County > Columbus (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)