statistical structure
Learning Rate Should Scale Inversely with High-Order Data Moments in High-Dimensional Online Independent Component Analysis
Gultekin, M. Oguzhan, Demir, Samet, Dogan, Zafer
We investigate the impact of high-order moments on the learning dynamics of an online Independent Component Analysis (ICA) algorithm under a high-dimensional data model composed of a weighted sum of two non-Gaussian random variables. This model allows precise control of the input moment structure via a weighting parameter. Building on an existing ordinary differential equation (ODE)-based analysis in the high-dimensional limit, we demonstrate that as the high-order moments increase, the algorithm exhibits slower convergence and demands both a lower learning rate and greater initial alignment to achieve informative solutions. Our findings highlight the algorithm's sensitivity to the statistical structure of the input data, particularly its moment characteristics. Furthermore, the ODE framework reveals a critical learning rate threshold necessary for learning when moments approach their maximum. These insights motivate future directions in moment-aware initialization and adaptive learning rate strategies to counteract the degradation in learning speed caused by high non-Gaussianity, thereby enhancing the robustness and efficiency of ICA in complex, high-dimensional settings.
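The dynamics the abstract analyzes can be pictured in a few lines of code. Below is a minimal sketch, not the paper's setup or code: a one-unit online ICA iteration (stochastic ascent on the fourth-moment contrast) on a toy version of the weighted two-source data model, where `gamma` stands in for the paper's weighting parameter. The warm start reflects the abstract's point that informative solutions need sufficient initial alignment, and lowering `lr` is what the title's inverse scaling prescription amounts to. All distributions and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the paper's data model (an assumption, not the authors'
# code): each sample carries a planted direction u scaled by a weighted sum
# of two non-Gaussian sources; gamma tunes the high-order moment structure.
d, steps, gamma = 200, 20_000, 0.8
lr = 0.5 / d                        # high-dimensional regime: step size ~ 1/d
u = rng.standard_normal(d); u /= np.linalg.norm(u)

g = rng.standard_normal(d); g -= (g @ u) * u; g /= np.linalg.norm(g)
w = 0.3 * u + np.sqrt(1 - 0.3**2) * g   # warm start: initial overlap 0.3

for _ in range(steps):
    s = gamma * rng.laplace(scale=1 / np.sqrt(2)) \
        + (1 - gamma) * rng.choice([-1.0, 1.0])    # weighted non-Gaussian source
    x = s * u + rng.standard_normal(d)             # signal direction + isotropic noise
    y = w @ x
    w += lr * y**3 * x              # stochastic ascent on the kurtosis contrast E[y^4]/4
    w /= np.linalg.norm(w)          # retract to the unit sphere

print("overlap with planted direction:", abs(w @ u))  # approaches 1 for small enough lr
```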
Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies
Pinson, Hannah, Lenaerts, Joeri, Ginis, Vincent
We here present a stepping stone towards a deeper understanding of convolutional neural networks (CNNs) in the form of a theory of learning in linear CNNs. Through analyzing the gradient descent equations, we discover that the evolution of the network during training is determined by the interplay between the dataset structure and the convolutional network structure. We show that linear CNNs discover the statistical structure of the dataset with non-linear, ordered, stage-like transitions, and that the speed of discovery changes depending on the relationship between the dataset structure and the convolutional network structure.

A general theory on how the implicit structure in the network arises and how it depends on the structure of the dataset has yet to be developed. Here we derive such a theory for the specific case of two-layer, linear CNNs, and we provide experiments that show how our insights relate to the evolution of learning in deep, non-linear CNNs. Our approach is inspired by previous work on the learning dynamics in linear fully connected neural networks (FCNN) (Saxe et al., 2014; 2019), but we uncover the role of convolutions and show how they fundamentally alter the internal dynamics of learning. We start by discussing the two involved structures in terms of singular value decompositions (SVD): on the one hand the SVD of the input-output correlation matrix, representing the statistical dataset structure…
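To see the kind of stage-like transitions this entry describes, here is a minimal sketch of the fully connected linear baseline the paper builds on (Saxe et al., 2014), not the paper's convolutional derivation: gradient descent on a two-layer linear network with a planted input-output correlation matrix picks up the singular modes one at a time, in order of singular value. Sizes and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Plant an input-output correlation matrix with three well-separated
# singular values, then watch full-batch gradient descent on a two-layer
# linear network switch the modes on in order of singular value.
d_in, d_hid, d_out, lr, steps = 20, 20, 20, 0.02, 3000
U, _ = np.linalg.qr(rng.standard_normal((d_out, d_out)))
V, _ = np.linalg.qr(rng.standard_normal((d_in, d_in)))
s = np.array([4.0, 2.0, 1.0])                      # dominant modes
Sigma_yx = (U[:, :3] * s) @ V[:, :3].T             # planted correlation matrix

W1 = 1e-3 * rng.standard_normal((d_hid, d_in))     # small init => sharp stages
W2 = 1e-3 * rng.standard_normal((d_out, d_hid))
for t in range(steps):
    # With whitened inputs (Sigma_xx = I), the expected square loss reduces
    # to 0.5 * ||Sigma_yx - W2 W1||^2, giving these coupled updates.
    E = Sigma_yx - W2 @ W1
    W1 += lr * W2.T @ E
    W2 += lr * E @ W1.T
    if t % 500 == 0:
        sv = np.linalg.svd(W2 @ W1, compute_uv=False)[:3]
        print(t, np.round(sv, 3))   # modes turn on one at a time
```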
What Makes Neural Networks Fragile
What do the images below have in common? Most readers will quickly catch on that they are all seats, as in places to sit. It may have taken you less than a second to recognize this common characteristic. If I heed Andrew Ng's suggestion that anything a human can do in less than a second can be automated by a neural network, then I should be able to create an image classifier that recognizes seats, written as a standard classifier using off-the-shelf Python libraries (sketched below).
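For concreteness, such a classifier might look as follows: a standard transfer-learning recipe with torchvision, fine-tuning only the final layer of a pretrained ResNet on a folder of seat / not-seat images. The data layout, model choice, and hyperparameters are all assumptions for illustration; the post does not specify its actual setup.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Hypothetical data layout: data/train/seat/*.jpg, data/train/not_seat/*.jpg
tfm = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
data = datasets.ImageFolder("data/train", tfm)
loader = torch.utils.data.DataLoader(data, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad_(False)                      # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 2)    # new seat vs. not-seat head

opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):
    for x, y in loader:
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
```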
Bayesian entropy estimation for binary spike train data using parametric prior knowledge
Archer, Evan W., Park, Il Memming, Pillow, Jonathan W.
Shannon's entropy is a basic quantity in information theory, and a fundamental building block for the analysis of neural codes. Estimating the entropy of a discrete distribution from samples is an important and difficult problem that has received considerable attention in statistics and theoretical neuroscience. However, neural responses have characteristic statistical structure that generic entropy estimators fail to exploit. For example, existing Bayesian entropy estimators make the naive assumption that all spike words are equally likely a priori, which makes for an inefficient allocation of prior probability mass in cases where spikes are sparse. Here we develop Bayesian estimators for the entropy of binary spike trains using priors designed to flexibly exploit the statistical structure of simultaneously-recorded spike responses.
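To make the abstract's complaint concrete, here is the generic Dirichlet-prior baseline it criticizes, not the authors' structured estimator: the posterior mean of the entropy over spike words has a standard closed form in digamma functions, but a symmetric Dirichlet spreads prior mass uniformly over all 2^n words, which is wasteful when spiking is sparse. The function name and example counts are illustrative.

```python
import numpy as np
from scipy.special import psi  # digamma

def dirichlet_entropy_posterior_mean(counts, alpha=1.0):
    """Posterior-mean entropy (nats) under a symmetric Dirichlet(alpha) prior.

    Generic baseline only (a sketch, not the paper's estimator): this prior
    treats every spike word as equally likely a priori, the very assumption
    the abstract calls naive for sparse spike trains.
    """
    a = np.asarray(counts, dtype=float) + alpha   # posterior Dirichlet parameters
    A = a.sum()
    # Closed form: E[H | counts] = psi(A + 1) - sum_k (a_k / A) * psi(a_k + 1)
    return psi(A + 1.0) - np.sum((a / A) * psi(a + 1.0))

# Example: counts over the four words (00, 01, 10, 11) of two simultaneously
# recorded binary neurons; sparse firing concentrates mass on the silent word.
print(dirichlet_entropy_posterior_mean([970, 12, 15, 3]))
```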
On GANs and GMMs
Richardson, Eitan, Weiss, Yair
A longstanding problem in machine learning is to find unsupervised methods that can learn the statistical structure of high dimensional signals. In recent years, GANs have gained much attention as a possible solution to the problem, and in particular have shown the ability to generate remarkably realistic high resolution sampled images. At the same time, many authors have pointed out that GANs may fail to model the full distribution ("mode collapse") and that using the learned models for anything other than generating samples may be very difficult. In this paper, we examine the utility of GANs in learning statistical models of images by comparing them to perhaps the simplest statistical model, the Gaussian Mixture Model. First, we present a simple method to evaluate generative models based on relative proportions of samples that fall into predetermined bins.
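The binning evaluation the abstract mentions can be sketched in a few lines. The version below is one reading of the idea, not the authors' released code: cluster the training set into bins with K-means, assign model samples to the same bins, and count how many bins have statistically different proportions under a two-proportion z-test. Defaults are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def fraction_of_different_bins(train, generated, n_bins=100):
    """Sketch of a bin-proportion test for generative models.

    Bins are K-means cells fit on training data; a model that drops modes
    leaves some cells underpopulated, and the per-bin two-proportion
    z-test flags them.
    """
    km = KMeans(n_clusters=n_bins, n_init=10).fit(train)
    n, m = len(train), len(generated)
    p = np.bincount(km.labels_, minlength=n_bins) / n             # train proportions
    q = np.bincount(km.predict(generated), minlength=n_bins) / m  # model proportions
    pooled = (p * n + q * m) / (n + m)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n + 1 / m))
    z = np.abs(p - q) / np.maximum(se, 1e-12)
    return float(np.mean(z > 1.96))   # fraction of bins different at the 5% level
```

Applied to images, `train` and `generated` would be flattened pixel arrays or feature embeddings; a large fraction of flagged bins is the signature of mode collapse.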