Collaborating Authors: Donini, Michele


Hyperparameter Optimization in Machine Learning

arXiv.org Machine Learning

Hyperparameters are configuration variables controlling the behavior of machine learning algorithms. They are ubiquitous in machine learning and artificial intelligence, and the choice of their values determines the effectiveness of systems based on these technologies. Manual hyperparameter search is often unsatisfactory and becomes infeasible when the number of hyperparameters is large. Automating the search is an important step towards automating machine learning, freeing researchers and practitioners alike from the burden of finding a good set of hyperparameters by trial and error. In this survey, we present a unified treatment of hyperparameter optimization, providing the reader with examples and insights into the state-of-the-art. We cover the main families of techniques to automate hyperparameter search, often referred to as hyperparameter optimization or tuning, including random and quasi-random search, bandit-, model- and gradient-based approaches. We further discuss extensions, including online, constrained, and multi-objective formulations, touch upon connections with other fields such as meta-learning and neural architecture search, and conclude with open questions and future research directions.
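
As a concrete illustration of the simplest automated approach covered in the survey, the sketch below runs random search over a small hyperparameter space with cross-validation; the model, search space, and budget are arbitrary choices for the example, not recommendations from the paper.

```python
# Minimal random-search sketch: sample hyperparameter configurations and keep
# the one with the best cross-validated score. All choices here are illustrative.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)

best_score, best_config = -np.inf, None
for _ in range(15):  # evaluation budget
    config = {
        "learning_rate": 10 ** rng.uniform(-3, 0),   # log-uniform sampling
        "max_depth": int(rng.integers(1, 6)),
        "n_estimators": int(rng.integers(50, 201)),
    }
    score = cross_val_score(
        GradientBoostingClassifier(**config, random_state=0), X, y, cv=3
    ).mean()
    if score > best_score:
        best_score, best_config = score, config

print(best_config, round(best_score, 4))
```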


Evaluating Large Language Models with fmeval

arXiv.org Artificial Intelligence

fmeval is an open-source library to evaluate large language models (LLMs) across a range of tasks. It helps practitioners evaluate their models both for task performance and along multiple responsible AI dimensions. This paper presents the library and describes its underlying design principles: simplicity, coverage, extensibility and performance. We then describe how these principles shaped the scientific and engineering choices made in developing fmeval. A case study demonstrates a typical use case for the library: picking a suitable model for a question-answering task. We close by discussing limitations and further work in the development of the library. fmeval can be found at https://github.com/aws/fmeval.
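
To give a flavour of the kind of evaluation the library automates, here is a generic sketch of scoring a question-answering model against reference answers; `answer_fn`, `exact_match` and the toy data are hypothetical stand-ins, and this is not fmeval's actual API (see the repository above for that).

```python
# Generic sketch of the kind of evaluation fmeval automates: score a
# question-answering model against references. `answer_fn` is a hypothetical
# stand-in for any LLM invocation; this is not the fmeval API itself.
from typing import Callable, List, Tuple

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalised prediction equals the reference, else 0.0."""
    norm = lambda s: " ".join(s.lower().strip().split())
    return float(norm(prediction) == norm(reference))

def evaluate_qa(answer_fn: Callable[[str], str],
                dataset: List[Tuple[str, str]]) -> float:
    """Average exact-match accuracy of `answer_fn` over (question, answer) pairs."""
    scores = [exact_match(answer_fn(q), a) for q, a in dataset]
    return sum(scores) / len(scores)

if __name__ == "__main__":
    toy_data = [("What is the capital of France?", "Paris"),
                ("How many legs does a spider have?", "8")]
    mock_model = lambda q: "Paris" if "France" in q else "8"  # trivial mock model
    print(evaluate_qa(mock_model, toy_data))  # 1.0
```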


Explaining Probabilistic Models with Distributional Values

arXiv.org Artificial Intelligence

A large branch of explainable machine learning is grounded in cooperative game theory. However, research indicates that game-theoretic explanations may mislead or be hard to interpret. We argue that often there is a critical mismatch between what one wishes to explain (e.g. the output of a classifier) and what current methods such as SHAP explain (e.g. the scalar probability of a class). This paper addresses this gap for probabilistic models by generalising cooperative games and value operators. We introduce distributional values, random variables that track changes in the model output (e.g. flipping of the predicted class) and derive their analytic expressions for games with Gaussian, Bernoulli and Categorical payoffs. We further establish several characterising properties, and show that our framework provides fine-grained and insightful explanations with case studies on vision and language models.
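
As a rough illustration of the core idea, the sketch below tracks whether the predicted class of a toy logistic model flips as features are revealed in random order, yielding a distribution over class changes per feature rather than a single scalar attribution. It is a simplified caricature under assumed toy choices (model, baseline, instance), not the paper's exact value operator.

```python
# Simplified illustration of the spirit of distributional values: instead of
# attributing changes in a scalar probability, track whether the *predicted
# class* flips as features are added in random order. The toy linear model,
# baseline and instance are assumptions made for the example.
import itertools
import numpy as np

w, b = np.array([2.0, -1.5, 0.5]), -0.2          # toy logistic model
predict = lambda x: (1 / (1 + np.exp(-(x @ w + b)))) > 0.5

x = np.array([0.8, 1.2, -0.4])                   # instance to explain
baseline = np.zeros(3)                           # reference input
d = len(x)

# flips[i] accumulates the distribution of class changes caused by feature i.
flips = {i: {"0->1": 0, "1->0": 0, "none": 0} for i in range(d)}
perms = list(itertools.permutations(range(d)))
for perm in perms:
    z = baseline.copy()
    prev = predict(z)
    for i in perm:
        z[i] = x[i]                              # reveal feature i
        cur = predict(z)
        key = "none" if cur == prev else ("0->1" if cur else "1->0")
        flips[i][key] += 1
        prev = cur

for i in range(d):
    dist = {k: v / len(perms) for k, v in flips[i].items()}
    print(f"feature {i}: {dist}")
```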


Geographical Erasure in Language Generation

arXiv.org Artificial Intelligence

Large language models (LLMs) encode vast amounts of world knowledge. However, since these models are trained on large swaths of internet data, they are at risk of inordinately capturing information about dominant groups. This imbalance can propagate into generated language. In this work, we study and operationalise a form of geographical erasure, wherein language models underpredict certain countries. We demonstrate consistent instances of erasure across a range of LLMs. We discover that erasure strongly correlates with low frequencies of country mentions in the training corpus. Lastly, we mitigate erasure by finetuning using a custom objective.
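
One simple way to operationalise such underprediction, sketched below under stated assumptions (a fixed country list, made-up probabilities standing in for a hypothetical LLM query, and population shares as the reference), is to flag countries whose model probability falls far below the reference and measure the discrepancy on that set.

```python
# Hedged sketch of measuring underprediction of countries: compare a model's
# probabilities for completions of a prompt such as "I live in ..." against a
# reference distribution (here, population shares). `model_probs` is a
# hypothetical stand-in for querying an LLM; all numbers are made up.
import numpy as np

countries = ["United States", "India", "Nigeria", "Germany"]
reference = np.array([0.30, 0.45, 0.15, 0.10])    # reference shares (toy)
model_probs = np.array([0.57, 0.25, 0.03, 0.15])  # hypothetical LLM probabilities
model_probs = model_probs / model_probs.sum()     # renormalise over this set

# A country is "underpredicted" when the model assigns it much less mass than
# the reference; one simple score is the KL contribution restricted to that set.
ratio = model_probs / reference
mask = ratio < 1 / 3
erased = [c for c, m in zip(countries, mask) if m]
score = float(np.sum(reference[mask] * np.log(reference[mask] / model_probs[mask])))
print("underpredicted countries:", erased)
print("erasure score on that set:", round(score, 4))
```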


Efficient fair PCA for fair representation learning

arXiv.org Artificial Intelligence

We revisit the problem of fair principal component analysis (PCA), where the goal is to learn the best low-rank linear approximation of the data that obfuscates demographic information. We propose a conceptually simple approach that allows for an analytic solution similar to standard PCA and can be kernelized. Our methods have the same complexity as standard PCA, or kernel PCA, and run much faster than existing methods for fair PCA based on semidefinite programming or manifold optimization, while achieving similar results.
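
A simplified sketch of the underlying idea: project out the direction along which the two demographic groups differ in mean and then run ordinary PCA in the remaining subspace, so the leading components carry no first-order linear group information. This illustrates the principle on synthetic data; it is not the paper's exact algorithm.

```python
# Simplified fair-PCA sketch: remove the group-mean-difference direction, then
# run standard PCA. Synthetic data and the two-group setting are assumptions.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n, d = 500, 10
s = rng.integers(0, 2, size=n)                   # binary demographic attribute
X = rng.normal(size=(n, d)) + np.outer(s, np.linspace(1.0, 0.0, d))

X = X - X.mean(axis=0)
u = X[s == 1].mean(axis=0) - X[s == 0].mean(axis=0)
u = u / np.linalg.norm(u)

X_fair = X - np.outer(X @ u, u)                  # project out the group direction
Z = PCA(n_components=2).fit_transform(X_fair)

# The group means of the resulting components should be (numerically) identical.
print(Z[s == 1].mean(axis=0) - Z[s == 0].mean(axis=0))
```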


Fortuna: A Library for Uncertainty Quantification in Deep Learning

arXiv.org Artificial Intelligence

We present Fortuna, an open-source library for uncertainty quantification in deep learning. Fortuna supports a range of calibration techniques, such as conformal prediction, which can be applied to any trained neural network to generate reliable uncertainty estimates, as well as scalable Bayesian inference methods that can be applied to Flax-based deep neural networks trained from scratch for improved uncertainty quantification and accuracy. By providing a coherent framework for advanced uncertainty quantification methods, Fortuna simplifies the process of benchmarking and helps practitioners build robust AI systems.
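
For intuition, the following generic split-conformal sketch shows the kind of post-hoc calibration such a library enables for any classifier exposing predicted probabilities; it deliberately uses plain scikit-learn rather than Fortuna's own API, and the model and coverage level are arbitrary choices for the example.

```python
# Generic split-conformal prediction for classification: calibrate a threshold
# on held-out nonconformity scores, then form prediction sets on test points.
# This is an illustration, not Fortuna's API.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

alpha = 0.1                                               # target miscoverage
cal_probs = model.predict_proba(X_cal)
scores = 1.0 - cal_probs[np.arange(len(y_cal)), y_cal]    # nonconformity scores
level = np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores)
q = np.quantile(scores, level, method="higher")

test_probs = model.predict_proba(X_te)
pred_sets = test_probs >= 1.0 - q                         # class membership per set
coverage = pred_sets[np.arange(len(y_te)), y_te].mean()
print("empirical coverage:", round(float(coverage), 3))
print("average set size:", round(float(pred_sets.sum(axis=1).mean()), 2))
```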


Amazon SageMaker Model Monitor: A System for Real-Time Insights into Deployed Machine Learning Models

arXiv.org Artificial Intelligence

With the increasing adoption of machine learning (ML) models and systems in high-stakes settings across different industries, guaranteeing a model's performance after deployment has become crucial. Monitoring models in production is a critical aspect of ensuring their continued performance and reliability. We present Amazon SageMaker Model Monitor, a fully managed service that continuously monitors the quality of machine learning models hosted on Amazon SageMaker. Our system automatically detects data, concept, bias, and feature attribution drift in models in real time and provides alerts so that model owners can take corrective actions and thereby maintain high-quality models. We describe the key requirements obtained from customers, the system design and architecture, and the methodology for detecting different types of drift. Further, we provide quantitative evaluations followed by use cases, insights, and lessons learned from more than 1.5 years of production deployment.
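
As a minimal illustration of data-drift detection of this kind, the sketch below compares a production window of a single feature against its training-time baseline with a two-sample Kolmogorov-Smirnov test; the data, threshold, and alerting logic are assumptions for the example, not the service's internals.

```python
# Toy data-drift check: compare a live window of one feature against the
# training baseline and alert when the shift is statistically significant.
# Data, threshold and alerting are illustrative only.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
baseline = rng.normal(loc=0.0, scale=1.0, size=5000)     # captured at training time
production = rng.normal(loc=0.4, scale=1.0, size=1000)   # live traffic, shifted

stat, p_value = ks_2samp(baseline, production)
threshold = 0.01
if p_value < threshold:
    print(f"drift detected: KS statistic={stat:.3f}, p={p_value:.1e}")
else:
    print("no significant drift in this window")
```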


Amazon SageMaker Automatic Model Tuning: Scalable Black-box Optimization

arXiv.org Machine Learning

Tuning complex machine learning systems is challenging. Machine learning models typically expose a set of hyperparameters, be it regularization, architecture, or optimization parameters, whose careful tuning is critical to achieve good performance. To democratize access to such systems, it is essential to automate this tuning process. This paper presents Amazon SageMaker Automatic Model Tuning (AMT), a fully managed system for black-box optimization at scale. AMT finds the best version of a machine learning model by repeatedly training it with different hyperparameter configurations. It leverages either random search or Bayesian optimization to choose the hyperparameter values resulting in the best-performing model, as measured by the metric chosen by the user. AMT can be used with built-in algorithms, custom algorithms, and Amazon SageMaker pre-built containers for machine learning frameworks. We discuss the core functionality, system architecture, and our design principles. We also describe more advanced features provided by AMT, such as automated early stopping and warm-starting, demonstrating their benefits in experiments.
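
To illustrate one such feature, here is a sketch of median-rule early stopping on synthetic learning curves: a running trial is stopped when its intermediate metric falls below the median of earlier trials at the same epoch. The rule and the curves are illustrative assumptions, not AMT's internal implementation.

```python
# Median-rule early stopping on synthetic learning curves: stop a trial whose
# intermediate metric is below the median of previous trials at the same epoch.
# Everything here is synthetic and illustrative.
import numpy as np

rng = np.random.default_rng(0)

def synthetic_curve(quality, epochs=10):
    """Noisy validation-metric curve that saturates at `quality`."""
    return quality * (1 - np.exp(-np.arange(1, epochs + 1) / 3)) + rng.normal(0, 0.01, epochs)

history = []                                   # per-epoch metrics of earlier trials
for trial in range(10):
    curve = synthetic_curve(quality=rng.uniform(0.6, 0.95))
    observed = []
    for epoch, metric in enumerate(curve):
        observed.append(metric)
        peers = [h[epoch] for h in history if len(h) > epoch]
        if len(peers) >= 3 and metric < np.median(peers):
            print(f"trial {trial}: stopped early at epoch {epoch}")
            break
    else:
        print(f"trial {trial}: completed, final metric {observed[-1]:.3f}")
    history.append(observed)
```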


Fair Bayesian Optimization

arXiv.org Machine Learning

Given the increasing importance of machine learning (ML) in our lives, algorithmic fairness techniques have been proposed to mitigate biases that can be amplified by ML. Commonly, these specialized techniques apply to a single family of ML models and a specific definition of fairness, limiting their effectiveness in practice. We introduce a general constrained Bayesian optimization (BO) framework to optimize the performance of any ML model while enforcing one or multiple fairness constraints. BO is a global optimization method that has been successfully applied to automatically tune the hyperparameters of ML models. We apply BO with fairness constraints to a range of popular models, including random forests, gradient boosting, and neural networks, showing that we can obtain accurate and fair solutions by acting solely on the hyperparameters. We also show empirically that our approach is competitive with specialized techniques that explicitly enforce fairness constraints during training, and outperforms preprocessing methods that learn unbiased representations of the input data. Moreover, our method can be used in synergy with such specialized fairness techniques to tune their hyperparameters. Finally, we study the relationship between hyperparameters and fairness of the generated model. We observe a correlation between regularization and unbiased models, explaining why acting on the hyperparameters leads to ML models that generalize well and are fair.
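
The sketch below conveys the setup on synthetic data, using plain random search as a simple stand-in for constrained Bayesian optimization: hyperparameter configurations are evaluated for accuracy and a demographic-parity gap, and the best configuration satisfying the fairness threshold is kept. The data, sensitive attribute, and threshold are assumptions made for illustration.

```python
# Hyperparameter search under a fairness constraint, with random search as a
# stand-in for constrained BO. Synthetic data; all choices are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
s = rng.integers(0, 2, size=n)                       # sensitive attribute
X = np.c_[rng.normal(size=(n, 4)), s + rng.normal(0, 0.5, size=n)]
y = (X[:, 0] + 0.8 * s + rng.normal(0, 1, size=n) > 0.5).astype(int)
X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(X, y, s, random_state=0)

def parity_gap(y_pred, s_attr):
    """Absolute difference in positive-prediction rates between the two groups."""
    return abs(y_pred[s_attr == 1].mean() - y_pred[s_attr == 0].mean())

epsilon, best = 0.05, None                           # fairness threshold
for _ in range(15):
    params = {"n_estimators": int(rng.integers(50, 200)),
              "max_depth": int(rng.integers(2, 10)),
              "min_samples_leaf": int(rng.integers(1, 50))}
    model = RandomForestClassifier(**params, random_state=0).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    acc, gap = (pred == y_te).mean(), parity_gap(pred, s_te)
    if gap <= epsilon and (best is None or acc > best[0]):
        best = (acc, gap, params)

print("best feasible config:", best if best else "none found within budget")
```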


Empirical Risk Minimization Under Fairness Constraints

Neural Information Processing Systems

We address the problem of algorithmic fairness: ensuring that sensitive information does not unfairly influence the outcome of a classifier. We present an approach based on empirical risk minimization, which incorporates a fairness constraint into the learning problem. It encourages the conditional risk of the learned classifier to be approximately constant with respect to the sensitive variable. We derive both risk and fairness bounds that support the statistical consistency of our methodology. We specify our approach to kernel methods and observe that the fairness requirement implies an orthogonality constraint which can be easily added to these methods.
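
A simplified linear-case sketch of that orthogonality constraint: keep the classifier's weight vector orthogonal to the difference between group-conditional feature means by projecting the features onto that direction's orthogonal complement before fitting. The synthetic data and the choice of logistic regression are illustrative; the paper's full treatment covers kernel methods and statistical guarantees.

```python
# Linear-case sketch of the orthogonality constraint: project features onto the
# orthogonal complement of the group-conditional mean difference, then fit a
# regularized linear classifier. Data and model choice are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d = 1000, 6
s = rng.integers(0, 2, size=n)                         # sensitive attribute
X = rng.normal(size=(n, d)) + np.outer(s, np.r_[1.0, np.zeros(d - 1)])
y = (X[:, 1] + X[:, 2] + rng.normal(0, 0.5, size=n) > 0).astype(int)

# Direction separating the two groups' conditional means (positive class here).
u = X[(s == 1) & (y == 1)].mean(axis=0) - X[(s == 0) & (y == 1)].mean(axis=0)
u /= np.linalg.norm(u)

X_proj = X - np.outer(X @ u, u)                        # enforce <w, u> = 0
clf = LogisticRegression(max_iter=1000).fit(X_proj, y)

# With L2 regularization, the fitted weights have (numerically) no component
# along u, i.e. the learned classifier satisfies the orthogonality constraint.
print("dot(w, u) =", float(clf.coef_.ravel() @ u))
```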