AITopics | cifar 10

Collaborating Authors

cifar 10

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

An Architecture for Deep, Hierarchical Generative Models

Philip Bachman

Neural Information Processing SystemsMar-23-2026, 08:17:46 GMT

Neural Information Processing Systems http://nips.cc/

latent variable, matnet, module, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

One-Line-of-Code Data Mollification Improves Optimization of Likelihood-based Generative Models

Neural Information Processing SystemsFeb-8-2026, 04:35:11 GMT

However, this is not true in practice.

artificial intelligence, international conference, machine learning, (12 more...)

Neural Information Processing Systems

Country:

Europe > France > Hauts-de-France > Nord > Lille (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > Monterey County > Monterey (0.04)
(3 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Dynamic Spectral Backpropagation for Efficient Neural Network Training

Muthuraman, Mannmohan

arXiv.org Artificial IntelligenceMay-30-2025

Dynamic Spectral Backpropagation (DSBP) enhances neural network training under resource constraints by projecting gradients onto principal eigenvectors, reducing complexity and promoting flat minima. Five extensions are proposed, dynamic spectral inference, spectral architecture optimization, spectral meta learning, spectral transfer regularization, and Lie algebra inspired dynamics, to address challenges in robustness, fewshot learning, and hardware efficiency. Supported by a third order stochastic differential equation (SDE) and a PAC Bayes limit, DSBP outperforms Sharpness Aware Minimization (SAM), Low Rank Adaptation (LoRA), and Model Agnostic Meta Learning (MAML) on CIFAR 10, Fashion MNIST, MedMNIST, and Tiny ImageNet, as demonstrated through extensive experiments and visualizations. Future work focuses on scalability, bias mitigation, and ethical considerations.

artificial intelligence, dsbp, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2505.23369

Country: North America > United States (0.68)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.62)

Add feedback

Optimizing Data Augmentation through Bayesian Model Selection

Matymov, Madi, Tran, Ba-Hien, Kampffmeyer, Michael, Heinonen, Markus, Filippone, Maurizio

arXiv.org Machine LearningMay-29-2025

Data Augmentation (DA) has become an essential tool to improve robustness and generalization of modern machine learning. However, when deciding on DA strategies it is critical to choose parameters carefully, and this can be a daunting task which is traditionally left to trial-and-error or expensive optimization based on validation performance. In this paper, we counter these limitations by proposing a novel framework for optimizing DA. In particular, we take a probabilistic view of DA, which leads to the interpretation of augmentation parameters as model (hyper)-parameters, and the optimization of the marginal likelihood with respect to these parameters as a Bayesian model selection problem. Due to its intractability, we derive a tractable Evidence Lower BOund (ELBO), which allows us to optimize augmentation parameters jointly with model parameters. We provide extensive theoretical results on variational approximation quality, generalization guarantees, invariance properties, and connections to empirical Bayes. Through experiments on computer vision tasks, we show that our approach improves calibration and yields robust performance over fixed or no augmentation. Our work provides a rigorous foundation for optimizing DA through Bayesian principles with significant potential for robust machine learning.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

2505.21813

Country:

Europe > Norway (0.04)
Asia > Middle East > Saudi Arabia (0.04)
Asia > Middle East > Jordan (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

A Mutual Information Perspective on Federated Contrastive Learning

Louizos, Christos, Reisser, Matthias, Korzhenkov, Denis

arXiv.org Artificial IntelligenceMay-3-2024

We investigate contrastive learning in the federated setting through the lens of SimCLR and multi-view mutual information maximization. In doing so, we uncover a connection between contrastive representation learning and user verification; by adding a user verification loss to each client's local SimCLR loss we recover a lower bound to the global multi-view mutual information. To accommodate for the case of when some labelled data are available at the clients, we extend our SimCLR variant to the federated semi-supervised setting. We see that a supervised SimCLR objective can be obtained with two changes: a) the contrastive loss is computed between datapoints that share the same label and b) we require an additional auxiliary head that predicts the correct labels from either of the two views. Along with the proposed SimCLR extensions, we also study how different sources of non-i.i.d.-ness can impact the performance of federated unsupervised learning through global mutual information maximization; we find that a global objective is beneficial for some sources of non-i.i.d.-ness but can be detrimental for others. We empirically evaluate our proposed extensions in various tasks to validate our claims and furthermore demonstrate that our proposed modifications generalize to other pretraining methods.

learning, representation, simclr, (15 more...)

arXiv.org Artificial Intelligence

2405.02081

Country:

Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Add feedback

An Architecture for Deep, Hierarchical Generative Models Philip Bachman

Neural Information Processing SystemsMar-12-2024, 10:59:49 GMT

We present an architecture which lets us train deep, directed generative models with many layers of latent variables. We include deterministic paths between all latent variables and the generated output, and provide a richer set of connections between computations for inference and generation, which enables more effective communication of information throughout the model during training. To improve performance on natural images, we incorporate a lightweight autoregressive model in the reconstruction distribution. These techniques permit end-to-end training of models with 10+ layers of latent variables. Experiments show that our approach achieves state-of-the-art performance on standard image modelling benchmarks, can expose latent class structure in the absence of label information, and can provide convincing imputations of occluded regions in natural images.

latent variable, matnet, module, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.62)

Add feedback

RADIN: Souping on a Budget

Menes, Thibaut, Risser-Maroix, Olivier

arXiv.org Artificial IntelligenceJan-31-2024

Model Soups, extending Stochastic Weights Averaging (SWA), combine models fine-tuned with different hyperparameters. Yet, their adoption is hindered by computational challenges due to subset selection issues. In this paper, we propose to speed up model soups by approximating soups performance using averaged ensemble logits performances. Theoretical insights validate the congruence between ensemble logits and weight averaging soups across any mixing ratios. Our Resource ADjusted soups craftINg (RADIN) procedure stands out by allowing flexible evaluation budgets, enabling users to adjust his budget of exploration adapted to his resources while increasing performance at lower budget compared to previous greedy approach (up to 4% on ImageNet).

budget, ensemble, init, (17 more...)

arXiv.org Artificial Intelligence

2401.1779

Country: Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

One-Line-of-Code Data Mollification Improves Optimization of Likelihood-based Generative Models

Tran, Ba-Hien, Franzese, Giulio, Michiardi, Pietro, Filippone, Maurizio

arXiv.org Artificial IntelligenceDec-21-2023

Generative Models (GMs) have attracted considerable attention due to their tremendous success in various domains, such as computer vision where they are capable to generate impressive realistic-looking images. Likelihood-based GMs are attractive due to the possibility to generate new data by a single model evaluation. However, they typically achieve lower sample quality compared to state-of-the-art score-based diffusion models (DMs). This paper provides a significant step in the direction of addressing this limitation. The idea is to borrow one of the strengths of score-based DMs, which is the ability to perform accurate density estimation in low-density regions and to address manifold overfitting by means of data mollification. We connect data mollification through the addition of Gaussian noise to Gaussian homotopy, which is a well-known technique to improve optimization. Data mollification can be implemented by adding one line of code in the optimization loop, and we demonstrate that this provides a boost in generation quality of likelihood-based GMs, without computational overheads. We report results on image data sets with popular likelihood-based GMs, including variants of variational autoencoders and normalizing flows, showing large improvements in FID score.

international conference, mollification, proceedings, (10 more...)

arXiv.org Artificial Intelligence

2305.189

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Can we infer the presence of Differential Privacy in Deep Learning models' weights? Towards more secure Deep Learning

Jiménez-López, null, Daniel, null, Rodríguez-Barroso, null, Nuria, null, Luzón, null, Victoria, M., Herrera, null, Francisco, null

arXiv.org Artificial IntelligenceNov-20-2023

Differential Privacy (DP) is a key property to protect data and models from integrity attacks. In the Deep Learning (DL) field, it is commonly implemented through the Differentially Private Stochastic Gradient Descent (DP-SGD). However, when a model is shared or released, there is no way to check whether it is differentially private, that is, it required to trust the model provider. This situation poses a problem when data privacy is mandatory, specially with current data regulations, as the presence of DP can not be certificated consistently by any third party. Thus, we face the challenge of determining whether a DL model has been trained with DP, according to the title question: Can we infer the presence of Differential Privacy in Deep Learning models' weights? Since the DP-SGD significantly changes the training process of a DL model, we hypothesize that DP leaves an imprint in the weights of a DL model, which can be used to predict whether a model has been trained with DP regardless of its architecture and the training dataset. In this paper, we propose to employ the imprint in model weights of using DP to infer the presence of DP training in a DL model. To substantiate our hypothesis, we developed an experimental methodology based on two datasets of weights of DL models, each with models with and without DP training and a meta-classifier to infer whether DP was used in the training process of a DL model, by accessing its weights. We accomplish both, the removal of the requirement of a trusted model provider and a strong foundation for this interesting line of research. Thus, our contribution is an additional layer of security on top of the strict private requirements of DP training in DL models, towards to DL models.

cifar 10, dataset, dl model, (14 more...)

arXiv.org Artificial Intelligence

2311.11717

Country: