Ajroldi, Niccolò
When, Where and Why to Average Weights?
Ajroldi, Niccolò, Orvieto, Antonio, Geiping, Jonas
Averaging checkpoints along the training trajectory is a simple yet powerful approach to improve the generalization performance of Machine Learning models and reduce training time. Motivated by these potential gains, and in an effort to fairly and thoroughly benchmark this technique, we present an extensive evaluation of averaging techniques in modern Deep Learning, which we perform using AlgoPerf \citep{dahl_benchmarking_2023}, a large-scale benchmark for optimization algorithms. We investigate whether weight averaging can reduce training time, improve generalization, and replace learning rate decay, as suggested by recent literature. Our evaluation across seven architectures and datasets reveals that averaging significantly accelerates training and yields considerable efficiency gains, at the price of a minimal implementation and memory cost, while mildly improving generalization across all considered workloads. Finally, we explore the relationship between averaging and learning rate annealing, and show how to optimally combine the two to achieve the best performance.
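As a concrete illustration, checkpoint averaging can be implemented in a few lines. The sketch below is a minimal, PyTorch-style uniform average over saved checkpoints; the function names and the uniform (SWA-style) window are illustrative assumptions, not the exact procedure benchmarked in the paper.

    import copy
    import torch

    def average_checkpoints(state_dicts):
        # Uniformly average a list of model state_dicts saved along the
        # training trajectory (a simple SWA-style average).
        avg = copy.deepcopy(state_dicts[0])
        for key in avg:
            if torch.is_floating_point(avg[key]):
                avg[key] = torch.stack(
                    [sd[key].float() for sd in state_dicts]
                ).mean(dim=0)
        return avg

    # Usage (illustrative): collect state_dicts during training, average them,
    # and load the result into the model before evaluation.
    # model.load_state_dict(average_checkpoints(saved_state_dicts))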
Loss Landscape Characterization of Neural Networks without Over-Parametrization
Islamov, Rustem, Ajroldi, Niccolò, Orvieto, Antonio, Lucchi, Aurelien
Optimization methods play a crucial role in modern machine learning, powering the remarkable empirical achievements of deep learning models. These successes are even more remarkable given the complex non-convex nature of the loss landscape of these models. Yet, ensuring the convergence of optimization methods requires specific structural conditions on the objective function that are rarely satisfied in practice. One prominent example is the widely recognized Polyak-Łojasiewicz (PL) inequality, which has gained considerable attention in recent years. However, validating such assumptions for deep neural networks entails substantial and often impractical levels of over-parametrization. In order to address this limitation, we propose a novel class of functions that can characterize the loss landscape of modern deep models without requiring extensive over-parametrization and can also include saddle points. Crucially, we prove that gradient-based optimizers possess theoretical guarantees of convergence under this assumption. Finally, we validate the soundness of our new function class through both theoretical analysis and empirical experimentation across a diverse range of deep learning models.
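For context, the PL inequality mentioned above asks that, for some constant $\mu > 0$, an objective $f$ with minimum value $f^*$ satisfies

\[
    \tfrac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr) \qquad \text{for all } x,
\]

a condition under which gradient descent converges linearly to the global minimum even without convexity. The function class proposed in the paper is designed to describe deep networks without such over-parametrization requirements, while still allowing saddle points and supporting convergence guarantees for gradient-based methods.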
Conformal Prediction Bands for Two-Dimensional Functional Time Series
Ajroldi, Niccolò, Diquigiovanni, Jacopo, Fontana, Matteo, Vantini, Simone
Functional data analysis (FDA) (Ramsay and Silverman 2005) is naturally suited to represent and model surface data, as it preserves their continuous nature and provides a rigorous mathematical framework. Among others, Zhou and Pan (2014) analyzed temperature surfaces, presenting two approaches for Functional Principal Component Analysis (FPCA) of functions defined on a non-rectangular domain; Porro-Muñoz et al. (2014) focus on image processing using FDA; and Rakêt (2010) proposed a novel regularization technique for Gaussian random fields on a rectangular domain, applied to 2D electrophoresis images. Another bivariate smoothing approach in a penalized regression framework was introduced by Ivanescu and Andrada (2013), allowing for the estimation of functional parameters of two-dimensional functional data. As shown by Gervini (2010), even mortality rates can be interpreted as two-dimensional functional data. Whereas all the reviewed works assume the observed functions to be realizations of i.i.d., or at least exchangeable, random objects, to the best of our knowledge there is no literature focusing on forecasting time-dependent two-dimensional functional data. In this work, we focus on time series of surfaces, representing them as two-dimensional Functional Time Series (FTS).
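To make the setting concrete, a schematic sketch of a split-conformal band for forecasts of discretized surfaces is given below; the variable names and the sup-norm nonconformity score are illustrative assumptions, not the specific band construction developed in the paper.

    import numpy as np

    def conformal_band_half_width(residuals, alpha=0.1):
        # residuals: array of shape (n_cal, n_x, n_y) with absolute forecast
        # errors on a calibration set of surfaces observed on a common grid.
        n_cal = residuals.shape[0]
        # Sup-norm nonconformity score of each calibration surface.
        scores = residuals.reshape(n_cal, -1).max(axis=1)
        # Conformal quantile with the usual (n_cal + 1) finite-sample correction.
        k = int(np.ceil((n_cal + 1) * (1 - alpha)))
        return np.sort(scores)[min(k, n_cal) - 1]

    # Usage (illustrative): band = forecast +/- conformal_band_half_width(
    #     np.abs(calibration_surfaces - calibration_forecasts), alpha=0.1)

Under exchangeability, bands of this kind enjoy finite-sample marginal coverage; the temporal dependence of functional time series is what makes the forecasting setting considered here more delicate.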