kurtosis
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
- Information Technology > Sensing and Signal Processing > Image Processing (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)
Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
Pham, Cuong, Dung, Hoang Anh, Nguyen, Cuong C., Le, Trung, Carneiro, Gustavo, Cai, Jianfei, Do, Thanh-Toan
Large language models require significant computational resources for deployment, making quantization essential for practical applications. However, the main obstacle to effective quantization lies in systematic outliers in activations and weights, which cause substantial LLM performance degradation, especially at low-bit settings. While existing transformation-based methods like affine and rotation transformations successfully mitigate outliers, they adopt a homogeneous transformation setting, i.e., the same transformation type across all layers, ignoring the heterogeneous distribution characteristics within LLMs. In this paper, we propose an adaptive transformation selection framework that systematically determines optimal transformations on a per-layer basis. To this end, we first formulate transformation selection as a differentiable optimization problem to identify the most accurate transformation type for each layer. However, searching for optimal layer-wise transformations for every model is computationally expensive. To address this, we establish a connection between weight-distribution kurtosis and the best-performing transformation type. Specifically, we propose an outlier-guided layer selection method using robust $z$-score normalization that achieves comparable performance to differentiable search with significantly reduced overhead. Comprehensive experiments on LLaMA family models demonstrate that our adaptive approach consistently outperforms the widely-used fixed transformation settings. For example, our method achieves an improvement of up to 4.58 perplexity points and a 2.11% gain in average six-task zero-shot accuracy under aggressive W3A3K2V2 quantization settings for the LLaMA-3-8B model compared to the current best existing method, FlatQuant, demonstrating the necessity of heterogeneous transformation selection for optimal LLM quantization.
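The outlier-guided selection the abstract describes — kurtosis of each layer's weights, normalized with a robust (median/MAD) $z$-score — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `threshold` value and the `"affine"`/`"rotation"` labels are illustrative placeholders for the paper's transformation types.

```python
import numpy as np

def excess_kurtosis(w):
    """Excess kurtosis of a flattened weight tensor (0 for a Gaussian)."""
    w = np.asarray(w, dtype=np.float64).ravel()
    c = w - w.mean()
    return (c ** 4).mean() / (c ** 2).mean() ** 2 - 3.0

def robust_z_scores(values):
    """Median/MAD z-scores; 0.6745 rescales MAD to match sigma for Gaussians."""
    v = np.asarray(values, dtype=np.float64)
    med = np.median(v)
    mad = np.median(np.abs(v - med))
    return 0.6745 * (v - med) / (mad + 1e-12)

def select_transforms(layer_weights, threshold=3.0):
    """Assign outlier-heavy (high-kurtosis) layers a stronger transformation."""
    z = robust_z_scores([excess_kurtosis(w) for w in layer_weights])
    return ["rotation" if zi > threshold else "affine" for zi in z]
```

On synthetic weights, a heavy-tailed layer among otherwise Gaussian layers stands out with a large robust $z$-score and is flagged for the stronger transform.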
A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis
Existing strategies for finite-armed stochastic bandits mostly depend on a parameter of scale that must be known in advance. Sometimes this is in the form of a bound on the payoffs, or the knowledge of a variance or subgaussian parameter. The notable exceptions are the analysis of Gaussian bandits with unknown mean and variance by Cowan et al. [2015] and of uniform distributions with unknown support [Cowan and Katehakis, 2015]. The results derived in these specialised cases are generalised here to the non-parametric setup, where the learner knows only a bound on the kurtosis of the noise, which is a scale free measure of the extremity of outliers.
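The key property the abstract relies on — kurtosis as a scale-free measure of outlier extremity — is easy to verify empirically. The sketch below uses the standard (non-excess) definition $\kappa = \mathbb{E}[(X-\mu)^4]/\operatorname{Var}(X)^2$; it is a demonstration of the scale-invariance property only, not of the bandit algorithm itself.

```python
import numpy as np

def empirical_kurtosis(x):
    """kappa = E[(X - mu)^4] / Var(X)^2 (equals 3 for a Gaussian).

    The fourth central moment divided by the squared variance cancels any
    rescaling of x, so kurtosis measures tail weight independent of scale."""
    x = np.asarray(x, dtype=np.float64)
    c = x - x.mean()
    return (c ** 4).mean() / (c ** 2).mean() ** 2
```

Multiplying the payoffs by any nonzero constant (e.g. changing units of reward) leaves the kurtosis unchanged, which is why a bound on it does not constitute a known scale parameter.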
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.47)
3948ead63a9f2944218de038d8934305-AuthorFeedback.pdf
"The authors provided strong reasoning behind why a uniform shape is beneficial"; "The paper is easy to follow"; "Authors did enough experiments on different data sets and different neural networks". Below we address the main suggestions for improvements. "There is little explanation about the impact of Kurtosis on the activation quantization." "...a solution that can easily modify the step size to become a power of two would be very desirable." "Is there any particular reason for choosing Kurtosis over other statistical measures, such as the coefficient of variation?" "In Table 1, it can be observed that from 4-bit quantization to 3-bit quantization, the performance drops a lot." "No experimental parameter settings are provided, and no comprehensive comparison with the latest SOTA method..." "I don't get the claim of the title of this paper, 'One model to rule them all'" -- We store a single set of weights ("one ..."). In contrast, we allow for a single model to operate at various quantization levels (e.g., employ a 4-bit variant of the ...). "Second, the comparison between KURE and the baseline model could be biased in Table 1."
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Energy (0.46)
- Education > Educational Setting (0.45)
Gradient-Weight Alignment as a Train-Time Proxy for Generalization in Classification Tasks
Hölzl, Florian A., Rueckert, Daniel, Kaissis, Georgios
Robust validation metrics remain essential in contemporary deep learning, not only to detect overfitting and poor generalization, but also to monitor training dynamics. In the supervised classification setting, we investigate whether interactions between training data and model weights can yield such a metric that both tracks generalization during training and attributes performance to individual training samples. We introduce Gradient-Weight Alignment (GWA), quantifying the coherence between per-sample gradients and model weights. We show that effective learning corresponds to coherent alignment, while misalignment indicates deteriorating generalization. GWA is efficiently computable during training and reflects both sample-specific contributions and dataset-wide learning dynamics. Extensive experiments show that GWA accurately predicts optimal early stopping, enables principled model comparisons, and identifies influential training samples, providing a validation-set-free approach for model analysis directly from the training data.
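The abstract defines GWA as the coherence between per-sample gradients and model weights but does not give the formula. The sketch below is a hypothetical cosine-similarity instantiation on a logistic-regression model; the function names and the averaging choice are assumptions for illustration, not the authors' exact definition.

```python
import numpy as np

def per_sample_gradients(W, X, y):
    """Per-sample gradients of the logistic loss w.r.t. weights W.
    Returns one gradient row (p_i - y_i) * x_i per training sample."""
    p = 1.0 / (1.0 + np.exp(-(X @ W)))      # predicted probabilities
    return (p - y)[:, None] * X              # shape (n_samples, n_features)

def gradient_weight_alignment(W, X, y):
    """Mean cosine similarity between each per-sample gradient and W.
    High coherence across samples indicates that the training data still
    drive the weights in a consistent direction; misalignment would show
    up as cosines scattered around zero."""
    G = per_sample_gradients(W, X, y)
    num = G @ W
    den = np.linalg.norm(G, axis=1) * np.linalg.norm(W) + 1e-12
    return float(np.mean(num / den))
```

Because it is computed from training-set gradients alone, such a quantity requires no held-out validation split, matching the abstract's "validation-set-free" framing.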
Towards Fully FP8 GEMM LLM Training at Scale
Hernández-Cano, Alejandro, Garbaya, Dhia, Schlag, Imanol, Jaggi, Martin
Despite the significant potential of FP8 data formats for large language model (LLM) pre-training, their adoption has been limited due to challenges in maintaining stability at scale. Existing approaches often rely on suboptimal fine-grained FP8 kernels or fall back to higher-precision matrix multiplications (GEMMs) in sensitive components, such as attention projections, compromising potential throughput gains. We introduce a new class of LLM architectures that, for the first time, support FP8 computation for all GEMMs within transformer blocks during both forward and backward passes. This enables unprecedented throughput gains, particularly at scale, while matching the downstream performance of standard BF16 training. Our architecture design reduces large outlier activations, promoting stable long-term FP8 training. In addition, we identify key metrics to monitor low-precision training and predict potential future divergences.
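To make concrete why outlier activations threaten FP8 training, the sketch below simulates rounding to the E4M3 format (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, max normal value 448) used in FP8 GEMMs. It is a simplified model for illustration — NaN handling is omitted and saturation at ±448 is an assumed policy, not necessarily what the paper's kernels do.

```python
import math

def fp8_e4m3(x):
    """Round a float to the nearest FP8 E4M3 value (simplified simulation).

    With only 3 mantissa bits there are 8 representable steps per binade,
    so large-magnitude outliers force coarse steps (32 apart near 448),
    which is why reducing outlier activations stabilizes FP8 training."""
    if x == 0.0:
        return 0.0
    s = math.copysign(1.0, x)
    a = abs(x)
    if a > 448.0:
        return s * 448.0               # saturate instead of overflowing
    e = math.floor(math.log2(a))
    e = max(e, -6)                     # subnormals share the minimum exponent
    step = 2.0 ** (e - 3)              # 3 mantissa bits -> 8 steps per binade
    return s * round(a / step) * step  # round-to-nearest (ties to even)
```

For example, 3.3 lands on 3.25 (the nearest representable value), while anything beyond 448 collapses to the saturation value, illustrating the narrow dynamic range that outliers exhaust.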
Thought Anchors: Which LLM Reasoning Steps Matter?
Bogdan, Paul C., Macar, Uzay, Nanda, Neel, Conmy, Arthur
Current frontier large language models rely on reasoning to achieve state-of-the-art performance. Many existing interpretability methods are limited in this area, as standard methods have been designed to study single forward passes of a model rather than the multi-token computational steps that unfold during reasoning. We argue that analyzing reasoning traces at the sentence level is a promising approach to understanding reasoning processes. We introduce a black-box method that measures each sentence's counterfactual importance by repeatedly sampling replacement sentences from the model, filtering for semantically different ones, and continuing the chain of thought from that point onwards to quantify the sentence's impact on the distribution of final answers. We discover that certain sentences can have an outsized impact on the trajectory of the reasoning trace and final answer. We term these sentences \textit{thought anchors}. These are generally planning or uncertainty management sentences, and specialized attention heads consistently attend from subsequent sentences to thought anchors. We further show that examining sentence-sentence causal links within a reasoning trace gives insight into a model's behavior. Such information can be used to predict a problem's difficulty and the extent to which different question domains involve sequential or diffuse reasoning. As a proof-of-concept, we demonstrate that our techniques together provide a practical toolkit for analyzing reasoning models by conducting a detailed case study of how the model solves a difficult math problem, finding that our techniques yield a consistent picture of the reasoning trace's structure. We provide an open-source tool (thought-anchors.com) for visualizing the outputs of our methods on further problems. The convergence across our methods shows the potential of sentence-level analysis for a deeper understanding of reasoning models.
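The black-box resampling procedure can be sketched in a toy form. Here `sample_answer` and `resample_sentence` are hypothetical stand-ins for LLM sampling calls, exact-match filtering is a crude substitute for the paper's semantic filtering, and total-variation distance is one assumed choice for comparing the resulting answer distributions.

```python
from collections import Counter

def counterfactual_importance(sentences, i, sample_answer, resample_sentence, n=8):
    """Estimate sentence i's impact on the final-answer distribution.

    Samples n answers continuing from the original sentence and n answers
    continuing from resampled (different) replacement sentences, then
    returns the total-variation distance between the two answer counts."""
    prefix = list(sentences[:i])
    base = Counter(sample_answer(prefix + [sentences[i]]) for _ in range(n))
    cf = Counter()
    for _ in range(n):
        s = resample_sentence(prefix)
        while s == sentences[i]:       # crude stand-in for semantic filtering
            s = resample_sentence(prefix)
        cf[sample_answer(prefix + [s])] += 1
    return 0.5 * sum(abs(base[k] - cf[k]) / n for k in set(base) | set(cf))
```

A sentence whose removal flips the answer distribution entirely scores 1.0 (a candidate thought anchor), while an interchangeable sentence scores near 0.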
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)