AITopics | shrinkage function

shrinkage function

Self-Distillation is Optimal Among Spectral Shrinkage Estimators in Spiked Covariance Models

Lecoiu, Radu, Mukherjee, Debarghya, Sur, Pragya

arXiv.org Machine Learning, May-19-2026

Self-distillation has emerged as a promising technique for improving model performance in modern machine learning systems. We develop the statistical foundations of self-distillation in spiked covariance models, by introducing and analyzing a broad class of estimators, namely spectral shrinkage estimators. We establish that for spiked covariance matrices with $s$ spikes, $s$-step self-distillation achieves optimal performance among spectral shrinkage estimators, outperforming well-known estimators in statistics and machine learning. Moreover, we show that $s$ steps are necessary for optimality: any $(s-k)$-step distilled estimator is strictly suboptimal for $1 \leq k \leq s$. For the special subclass of isotropic covariances, we show that optimally tuned Ridge regression performs best among spectral shrinkage estimators. We also study a federated approach where multiple data centers share spectral shrinkage estimators and a common server seeks to aggregate them to achieve optimal performance. In this case, we find that the best local rule again takes the form of self-distillation, though it differs from the optimal rule when data are hosted centrally on a single server. Together, our results elucidate why self-distillation improves predictive performance and provide a broader statistical framework connecting it with classical shrinkage-based methods.
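As a rough illustration of the classical notion of spectral shrinkage that the abstract refers to, ridge regression can be written as ordinary least squares in which each singular direction of the design matrix is damped by the factor d^2 / (d^2 + lambda). The sketch below is not the paper's definition of its estimator class, which the abstract does not spell out; the function names `spectral_shrinkage_fit` and `ridge_shrink`, the synthetic data, and the value of `lam` are all assumptions made for illustration.

```python
import numpy as np


def spectral_shrinkage_fit(X, y, shrink):
    """Least-squares fit with each singular direction rescaled by shrink(d).

    Hypothetical helper: shrink(d) == 1 recovers ordinary least squares.
    """
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    # OLS weights each direction by 1/d; a shrinkage rule damps that weight.
    coeffs = shrink(d) * (U.T @ y) / d
    return Vt.T @ coeffs


def ridge_shrink(lam):
    """Shrinkage factors that reproduce ridge regression with penalty lam."""
    return lambda d: d**2 / (d**2 + lam)


# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.ones(5) + 0.1 * rng.normal(size=50)

lam = 2.0
beta_spectral = spectral_shrinkage_fit(X, y, ridge_shrink(lam))
# Closed-form ridge solution for comparison.
beta_closed = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
```

The two coefficient vectors agree up to floating-point error, which is the sense in which ridge is one member of a family of rules acting only on the spectrum of the design matrix.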

artificial intelligence, machine learning, theorem 3, (19 more...)

arXiv.org Machine Learning

2605.17778

Country: North America > United States > New York (0.27)

Genre:

Research Report > New Finding (0.34)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Deep Unfolding for MIMO Signal Detection

Ge, Hangli, Koshizuka, Noboru

arXiv.org Artificial Intelligence, Jul-30-2025

In this paper, we propose a deep unfolding neural network-based MIMO detector that incorporates complex-valued computations using Wirtinger calculus. The method, referred to as Dynamic Partially Shrinkage Thresholding (DPST), enables efficient, interpretable, and low-complexity MIMO signal detection. Unlike prior approaches that rely on real-valued approximations, our method operates natively in the complex domain, aligning with the fundamental nature of signal processing tasks. The proposed algorithm requires only a small number of trainable parameters, allowing for simplified training. Numerical results demonstrate that the proposed method achieves superior detection performance with fewer iterations and lower computational complexity, making it a practical solution for next-generation massive MIMO systems.

algorithm, artificial intelligence, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2507.21152

Genre: Research Report > New Finding (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Translating Diffusion, Wavelets, and Regularisation into Residual Networks

Alt, Tobias, Weickert, Joachim, Peter, Pascal

arXiv.org Machine Learning, Feb-7-2020

Convolutional neural networks (CNNs) often perform well, but their stability is poorly understood. To address this problem, we consider the simple prototypical problem of signal denoising, where classical approaches such as nonlinear diffusion, wavelet-based methods and regularisation offer provable stability guarantees. To transfer such guarantees to CNNs, we interpret numerical approximations of these classical methods as a specific residual network (ResNet) architecture. This leads to a dictionary that allows one to translate diffusivities, shrinkage functions, and regularisers into activation functions, and enables direct communication between the four research communities. On the CNN side, it not only inspires new families of nonmonotone activation functions, but also introduces intrinsically stable architectures for an arbitrary number of layers.

activation function, architecture, neural network, (11 more...)

arXiv.org Machine Learning

2002.02753

Country:

North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
North America > United States > New York (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(13 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adversarial Margin Maximization Networks

Yan, Ziang, Guo, Yiwen, Zhang, Changshui

arXiv.org Machine Learning, Nov-13-2019

The tremendous recent success of deep neural networks (DNNs) has sparked a surge of interest in understanding their predictive ability. Unlike the human visual system, which is able to generalize robustly and learn with little supervision, DNNs normally require a massive amount of data to learn new concepts. In addition, research also shows that DNNs are vulnerable to adversarial examples: maliciously generated images that seem perceptually similar to natural ones but are actually crafted to fool learning models. This means the models have problems generalizing to unseen data with certain types of distortions. In this paper, we analyze the generalization ability of DNNs comprehensively and attempt to improve it from a geometric point of view. We propose adversarial margin maximization (AMM), a learning-based regularization that exploits an adversarial perturbation as a proxy. It encourages a large margin in the input space, just as support vector machines do. With a differentiable formulation of the perturbation, we train the regularized DNNs simply through back-propagation in an end-to-end manner. Experimental results on various datasets (including MNIST, CIFAR-10/100, SVHN and ImageNet) and different DNN architectures demonstrate the superiority of our method over previous state-of-the-art methods. Code and models for reproducing our results will be made publicly available.

architecture, experiment, generalization ability, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/TPAMI.2019.2948348

1911.05916

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Hubei Province > Wuhan (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)

Learning Convex Regularizers for Optimal Bayesian Denoising

Nguyen, Ha Q., Bostan, Emrah, Unser, Michael

arXiv.org Machine Learning, May-16-2017

We propose a data-driven algorithm for the maximum a posteriori (MAP) estimation of stochastic processes from noisy observations. The primary statistical properties of the sought signal are specified by the penalty function (i.e., the negative logarithm of the prior probability density function). Our alternating direction method of multipliers (ADMM)-based approach translates the estimation task into successive applications of the proximal mapping of the penalty function. Capitalizing on this direct link, we define the proximal operator as a parametric spline curve and optimize the spline coefficients by minimizing the average reconstruction error for a given training set. The key aspects of our learning method are that the associated penalty function is constrained to be convex and the convergence of the ADMM iterations is proven. As a result of these theoretical guarantees, adaptation of the proposed framework to different levels of measurement noise is extremely simple and does not require any retraining. We apply our method to the estimation of both sparse and non-sparse models of Lévy processes for which the minimum mean square error (MMSE) estimators are available. We carry out a single training session and perform comparisons at various signal-to-noise ratio (SNR) values. Simulations illustrate that the performance of our algorithm is practically identical to that of the MMSE estimator irrespective of the noise power.

artificial intelligence, machine learning, shrinkage function, (18 more...)

arXiv.org Machine Learning

doi: 10.1109/TSP.2017.2777407

1705.05591

Country: North America > United States (0.67)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)

Sparse Code Shrinkage: Denoising by Nonlinear Maximum Likelihood Estimation

Hyvärinen, Aapo, Hoyer, Patrik O., Oja, Erkki

Neural Information Processing Systems, Dec-31-1999

One of the simplest methods is to use linear transformations of the observed data.

artificial intelligence, machine learning, sparse, (14 more...)

Neural Information Processing Systems

Country: Europe > Finland > Uusimaa > Helsinki (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.90)

Sparse Code Shrinkage: Denoising by Nonlinear Maximum Likelihood Estimation

Hyvärinen, Aapo, Hoyer, Patrik O., Oja, Erkki

Neural Information Processing Systems, Dec-31-1999

Sparse coding is a method for finding a representation of data in which each of the components of the representation is only rarely significantly active. Such a representation is closely related to redundancy reduction and independent component analysis, and has some neurophysiological plausibility. In this paper, we show how sparse coding can be used for denoising. Using maximum likelihood estimation of nongaussian variables corrupted by gaussian noise, we show how to apply a shrinkage nonlinearity on the components of sparse coding so as to reduce noise. Furthermore, we show how to choose the optimal sparse coding basis for denoising.
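To make the idea of a shrinkage nonlinearity concrete, the sketch below works through one standard special case: if a component is assumed to follow a Laplace density with standard deviation `d` and is observed in Gaussian noise of variance `noise_var`, the maximum a posteriori estimate is a soft threshold at sqrt(2) * noise_var / d. The Laplace prior, the function name `laplace_shrinkage`, and the simulated data are assumptions for illustration; the abstract does not state which densities or parameters the paper uses.

```python
import numpy as np


def laplace_shrinkage(u, noise_var, d):
    """Soft-threshold shrinkage: MAP estimate under an assumed Laplace prior.

    u         : noisy component value(s)
    noise_var : variance of the additive Gaussian noise
    d         : standard deviation of the assumed Laplace prior
    """
    threshold = np.sqrt(2.0) * noise_var / d
    return np.sign(u) * np.maximum(np.abs(u) - threshold, 0.0)


# Simulated sparse components (assumed data, not from the paper).
rng = np.random.default_rng(0)
d, noise_var = 1.0, 1.0
# A Laplace variable with scale b has variance 2*b**2, so b = d / sqrt(2).
s = rng.laplace(scale=d / np.sqrt(2.0), size=20000)
y = s + rng.normal(scale=np.sqrt(noise_var), size=s.shape)

s_hat = laplace_shrinkage(y, noise_var, d)
mse_noisy = np.mean((y - s) ** 2)
mse_shrunk = np.mean((s_hat - s) ** 2)
```

Small observations are set exactly to zero and large ones are pulled toward zero by a constant amount, so in this setup `mse_shrunk` comes out below `mse_noisy`.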

artificial intelligence, machine learning, sparse, (15 more...)

Neural Information Processing Systems

Country: Europe > Finland (0.14)

Technology: