AITopics | Luzi, Lorenzo

Collaborating Authors

Luzi, Lorenzo

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Removing Bias from Maximum Likelihood Estimation with Model Autophagy

Mayer, Paul, Luzi, Lorenzo, Siahkoohi, Ali, Johnson, Don H., Baraniuk, Richard G.

arXiv.org Machine LearningMay-22-2024

We propose autophagy penalized likelihood estimation (PLE), an unbiased alternative to maximum likelihood estimation (MLE) which is more fair and less susceptible to model autophagy disorder (madness). Model autophagy refers to models trained on their own output; PLE ensures the statistics of these outputs coincide with the data statistics. This enables PLE to be statistically unbiased in certain scenarios where MLE is biased. When biased, MLE unfairly penalizes minority classes in unbalanced datasets and exacerbates the recently discovered issue of self-consuming generative modeling. Theoretical and empirical results show that 1) PLE is more fair to minority classes and 2) PLE is more stable in a self-consumed setting. Furthermore, we provide a scalable and portable implementation of PLE with a hypernetwork framework, allowing existing deep learning architectures to be easily trained with PLE. Finally, we show PLE can bridge the gap between Bayesian and frequentist paradigms in statistics.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

2405.13977

Country:

North America > United States > Texas (0.15)
North America > United States > Utah (0.14)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Double Descent and Other Interpolation Phenomena in GANs

Luzi, Lorenzo, Dar, Yehuda, Baraniuk, Richard

arXiv.org Artificial IntelligenceApr-30-2024

We study overparameterization in generative adversarial networks (GANs) that can interpolate the training data. We show that overparameterization can improve generalization performance and accelerate the training process. We study the generalization error as a function of latent space dimension and identify two main behaviors, depending on the learning setting. First, we show that overparameterized generative models that learn distributions by minimizing a metric or $f$-divergence do not exhibit double descent in generalization errors; specifically, all the interpolating solutions achieve the same generalization error. Second, we develop a novel pseudo-supervised learning approach for GANs where the training utilizes pairs of fabricated (noise) inputs in conjunction with real output samples. Our pseudo-supervised setting exhibits double descent (and in some cases, triple descent) of generalization errors. We combine pseudo-supervision with overparameterization (i.e., overly large latent space dimension) to accelerate training while matching or even surpassing generalization performance without pseudo-supervision. While our analysis focuses mostly on linear models, we also apply important insights for improving generalization of nonlinear, multilayer GANs.

artificial intelligence, machine learning, unsup, (16 more...)

arXiv.org Artificial Intelligence

2106.04003

Country:

North America > United States > New York (0.14)
Europe > Denmark > Capital Region > Kongens Lyngby (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Add feedback

Using Higher-Order Moments to Assess the Quality of GAN-generated Image Features

Luzi, Lorenzo, Jenne, Helen, Murray, Ryan, Marrero, Carlos Ortiz

arXiv.org Artificial IntelligenceOct-31-2023

The rapid advancement of Generative Adversarial Networks (GANs) necessitates the need to robustly evaluate these models. Among the established evaluation criteria, the Fr\'{e}chet Inception Distance (FID) has been widely adopted due to its conceptual simplicity, fast computation time, and strong correlation with human perception. However, FID has inherent limitations, mainly stemming from its assumption that feature embeddings follow a Gaussian distribution, and therefore can be defined by their first two moments. As this does not hold in practice, in this paper we explore the importance of third-moments in image feature data and use this information to define a new measure, which we call the Skew Inception Distance (SID). We prove that SID is a pseudometric on probability distributions, show how it extends FID, and present a practical method for its computation. Our numerical experiments support that SID either tracks with FID or, in some cases, aligns more closely with human perception when evaluating image features of ImageNet data.

artificial intelligence, fid, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2310.20636

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

Add feedback

Self-Consuming Generative Models Go MAD

Alemohammad, Sina, Casco-Rodriguez, Josue, Luzi, Lorenzo, Humayun, Ahmed Imtiaz, Babaei, Hossein, LeJeune, Daniel, Siahkoohi, Ali, Baraniuk, Richard G.

arXiv.org Artificial IntelligenceJul-4-2023

Seismic advances in generative AI algorithms for imagery, text, and other data types has led to the temptation to use synthetic data to train next-generation models. Repeating this process creates an autophagous ("self-consuming") loop whose properties are poorly understood. We conduct a thorough analytical and empirical analysis using state-of-the-art generative image models of three families of autophagous loops that differ in how fixed or fresh real training data is available through the generations of training and in whether the samples from previousgeneration models have been biased to trade off data quality versus diversity. Our primary conclusion across all scenarios is that without enough fresh real data in each generation of an autophagous loop, future generative models are doomed to have their quality (precision) or diversity (recall) progressively decrease.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2307.0185

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.67)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.88)

Add feedback

Frozen Overparameterization: A Double Descent Perspective on Transfer Learning of Deep Neural Networks

Dar, Yehuda, Luzi, Lorenzo, Baraniuk, Richard G.

arXiv.org Artificial IntelligenceJun-12-2023

We study the generalization behavior of transfer learning of deep neural networks (DNNs). We adopt the overparameterization perspective -- featuring interpolation of the training data (i.e., approximately zero train error) and the double descent phenomenon -- to explain the delicate effect of the transfer learning setting on generalization performance. We study how the generalization behavior of transfer learning is affected by the dataset size in the source and target tasks, the number of transferred layers that are kept frozen in the target DNN training, and the similarity between the source and target tasks. We show that the test error evolution during the target DNN training has a more significant double descent effect when the target training dataset is sufficiently large. In addition, a larger source training dataset can yield a slower target DNN training. Moreover, we demonstrate that the number of frozen layers can determine whether the transfer learning is effectively underparameterized or overparameterized and, in turn, this may induce a freezing-wise double descent phenomenon that determines the relative success or failure of learning. Also, we show that the double descent phenomenon may make a transfer from a less related source task better than a transfer from a more related source task. We establish our results using image classification experiments with the ResNet, DenseNet and the vision transformer (ViT) architectures.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2211.11074

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evaluating generative networks using Gaussian mixtures of image features

Luzi, Lorenzo, Marrero, Carlos Ortiz, Wynar, Nile, Baraniuk, Richard G., Henry, Michael J.

arXiv.org Artificial IntelligenceJul-22-2022

We develop a measure for evaluating the performance of generative networks given two sets of images. A popular performance measure currently used to do this is the Fr\'echet Inception Distance (FID). FID assumes that images featurized using the penultimate layer of Inception-v3 follow a Gaussian distribution, an assumption which cannot be violated if we wish to use FID as a metric. However, we show that Inception-v3 features of the ImageNet dataset are not Gaussian; in particular, every single marginal is not Gaussian. To remedy this problem, we model the featurized images using Gaussian mixture models (GMMs) and compute the 2-Wasserstein distance restricted to GMMs. We define a performance measure, which we call WaM, on two sets of images by using Inception-v3 (or another classifier) to featurize the images, estimate two GMMs, and use the restricted $2$-Wasserstein distance to compare the GMMs. We experimentally show the advantages of WaM over FID, including how FID is more sensitive than WaM to imperceptible image perturbations. By modelling the non-Gaussian features obtained from Inception-v3 as GMMs and using a GMM metric, we can more accurately evaluate generative network performance.

artificial intelligence, machine learning, perturbation, (17 more...)

arXiv.org Artificial Intelligence

2110.0524

Country: North America > United States > New York (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Ensembles of Generative Adversarial Networks for Disconnected Data

Luzi, Lorenzo, Balestriero, Randall, Baraniuk, Richard G.

arXiv.org Machine LearningJun-25-2020

Most current computer vision datasets are composed of disconnected sets, such as images from different classes. We prove that distributions of this type of data cannot be represented with a continuous generative network without error. They can be represented in two ways: With an ensemble of networks or with a single network with truncated latent space. We show that ensembles are more desirable than truncated distributions in practice. We construct a regularized optimization problem that establishes the relationship between a single continuous GAN, an ensemble of GANs, conditional GANs, and Gaussian Mixture GANs. This regularization can be computed efficiently, and we show empirically that our framework has a performance sweet spot which can be found with hyperparameter tuning. This ensemble framework allows better performance than a single continuous GAN or cGAN while maintaining fewer total parameters.

artificial intelligence, ensemble, neural network, (14 more...)

arXiv.org Machine Learning

2006.146

Country:

North America > United States (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback