Stephan, John
Boosting Robustness by Clipping Gradients in Distributed Learning
Allouah, Youssef, Guerraoui, Rachid, Gupta, Nirupam, Jellouli, Ahmed, Rizk, Geovani, Stephan, John
Robust distributed learning aims to achieve good learning performance despite the presence of misbehaving workers. State-of-the-art (SOTA) robust distributed gradient descent (Robust-DGD) methods, relying on robust aggregation, have been proven optimal: their learning error matches the lower bound established under the standard heterogeneity model of $(G, B)$-gradient dissimilarity. The learning guarantee of SOTA Robust-DGD cannot be further improved when the model is initialized arbitrarily. However, we show that it is possible to circumvent this lower bound, and improve the learning performance, when the workers' gradients at model initialization are assumed to be bounded. We prove this by proposing pre-aggregation clipping of workers' gradients, using a novel scheme called adaptive robust clipping (ARC). Incorporating ARC in Robust-DGD provably improves the learning guarantee under the aforementioned assumption on model initialization. The improvement is most prominent when the tolerable fraction of misbehaving workers approaches the breakdown point. ARC induces this improvement by constricting the search space while preserving the robustness of the original aggregation scheme. We validate this theoretical finding through extensive experiments on benchmark image classification tasks.
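The abstract describes ARC only at a high level. The sketch below illustrates generic pre-aggregation clipping: compute a data-dependent norm threshold, clip every worker's gradient to it, then hand the clipped gradients to an existing robust aggregator. The quantile-based threshold is an illustrative assumption, not the ARC rule itself.

```python
import numpy as np

def clip_to_norm(g, threshold):
    """Scale gradient g down so that its Euclidean norm is at most `threshold`."""
    norm = np.linalg.norm(g)
    return g if norm <= threshold else g * (threshold / norm)

def pre_aggregation_clipping(gradients, aggregator, quantile=0.5):
    """Clip all gradients to a data-dependent threshold, then robustly aggregate.

    `aggregator` is any existing robust aggregation rule (e.g. coordinate-wise
    median). The quantile-of-norms threshold is a stand-in for ARC's adaptive rule.
    """
    norms = [np.linalg.norm(g) for g in gradients]
    threshold = np.quantile(norms, quantile)  # hypothetical adaptive threshold
    clipped = [clip_to_norm(g, threshold) for g in gradients]
    return aggregator(clipped)

# Usage: honest gradients plus one outlier, aggregated with coordinate-wise median.
rng = np.random.default_rng(0)
grads = [rng.normal(size=10) for _ in range(9)] + [1e6 * np.ones(10)]
robust_avg = pre_aggregation_clipping(grads, lambda gs: np.median(np.stack(gs), axis=0))
```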
Overcoming the Challenges of Batch Normalization in Federated Learning
Guerraoui, Rachid, Pinot, Rafael, Rizk, Geovani, Stephan, John, Taiani, François
Batch normalization has proven to be a highly beneficial mechanism for accelerating the training and improving the accuracy of deep neural networks in centralized environments. Yet, the scheme faces significant challenges in federated learning, especially under high data heterogeneity; the main difficulties arise from external covariate shifts and inconsistent statistics across clients. In this paper, we introduce Federated BatchNorm (FBN), a novel scheme that restores the benefits of batch normalization in federated learning. Essentially, FBN ensures that batch normalization during training is consistent with what would be achieved in a centralized execution, hence preserving the distribution of the data and providing running statistics that accurately approximate the global statistics. FBN thereby reduces the external covariate shift and matches the evaluation performance of the centralized setting. We also show that, with a slight increase in complexity, FBN can be robustified to mitigate erroneous statistics and potentially adversarial attacks.
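The abstract does not detail how FBN reconciles local and global statistics. The sketch below only illustrates the general idea of combining per-client batch statistics into statistics matching a centralized batch, via the law of total variance; this is an illustrative assumption, not the FBN protocol.

```python
import numpy as np

def aggregate_batch_stats(client_means, client_vars, client_sizes):
    """Combine per-client batch statistics into global mean/variance,
    as if the union of all local batches were normalized centrally."""
    sizes = np.asarray(client_sizes, dtype=float)
    means = np.stack(client_means)       # shape (num_clients, num_features)
    variances = np.stack(client_vars)
    total = sizes.sum()
    global_mean = (sizes[:, None] * means).sum(axis=0) / total
    # law of total variance: within-client variance + between-client spread
    global_var = (sizes[:, None] * (variances + (means - global_mean) ** 2)).sum(axis=0) / total
    return global_mean, global_var

def normalize(batch, mean, var, eps=1e-5):
    """Standard batch-norm transform using the aggregated global statistics."""
    return (batch - mean) / np.sqrt(var + eps)
```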
Can Machines Learn Robustly, Privately, and Efficiently?
Allouah, Youssef, Guerraoui, Rachid, Stephan, John
The success of machine learning (ML) applications relies on vast datasets and distributed architectures which, as they grow, present new challenges for ML. In real-world scenarios, where data often contains sensitive information, issues like data poisoning and hardware failures are common. Ensuring privacy and robustness is vital for the broad adoption of ML in public life. This paper examines the costs associated with achieving these objectives in distributed architectures. We review what privacy and robustness mean in distributed ML, and clarify how each can be achieved efficiently in isolation. However, we contend that the integration of these objectives entails a notable compromise in computational efficiency. We delve into this intricate balance, exploring the challenges and solutions for privacy, robustness, and computational efficiency in ML applications.
SABLE: Secure And Byzantine robust LEarning
Choffrut, Antoine, Guerraoui, Rachid, Pinot, Rafael, Sirdey, Renaud, Stephan, John, Zuber, Martin
Due to the widespread availability of data, machine learning (ML) algorithms are increasingly being implemented in distributed topologies, wherein various nodes collaborate to train ML models via the coordination of a central server. However, distributed learning approaches face significant vulnerabilities, primarily stemming from two potential threats. Firstly, the presence of Byzantine nodes poses a risk of corrupting the learning process by transmitting inaccurate information to the server. Secondly, a curious server may compromise the privacy of individual nodes, sometimes reconstructing the entirety of the nodes' data. Homomorphic encryption (HE) has emerged as a leading security measure to preserve privacy in distributed learning under non-Byzantine scenarios. However, the extensive computational demands of HE, particularly for high-dimensional ML models, have deterred attempts to design purely homomorphic operators for non-linear robust aggregators. This paper introduces SABLE, the first homomorphic and Byzantine robust distributed learning algorithm. SABLE leverages HTS, a novel and efficient homomorphic operator implementing the prominent coordinate-wise trimmed mean robust aggregator. Designing HTS enables us to implement HMED, a novel homomorphic median aggregator. Extensive experiments on standard ML tasks demonstrate that SABLE achieves practical execution times while maintaining an ML accuracy comparable to its non-private counterpart.
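HTS and HMED operate under homomorphic encryption, which a short sketch cannot reproduce. The code below only shows, on clear data, the coordinate-wise trimmed mean and median rules that these operators implement; tying the trimming amount to the number of Byzantine workers is an illustrative assumption.

```python
import numpy as np

def coordinate_wise_trimmed_mean(gradients, num_byzantine):
    """Plaintext coordinate-wise trimmed mean: for every coordinate, drop the
    `num_byzantine` smallest and largest values across workers and average the
    rest. SABLE's HTS operator evaluates this rule under homomorphic encryption;
    here we only illustrate the underlying aggregation on clear data."""
    stacked = np.sort(np.stack(gradients), axis=0)  # sort each coordinate across workers
    trimmed = stacked[num_byzantine:len(gradients) - num_byzantine]
    return trimmed.mean(axis=0)

def coordinate_wise_median(gradients):
    """Coordinate-wise median, the rule computed homomorphically by HMED."""
    return np.median(np.stack(gradients), axis=0)
```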
Combining Differential Privacy and Byzantine Resilience in Distributed SGD
Guerraoui, Rachid, Gupta, Nirupam, Pinot, Rafael, Rouault, Sebastien, Stephan, John
Privacy and Byzantine resilience (BR) are two crucial requirements of modern-day distributed machine learning. The two concepts have been extensively studied individually but the question of how to combine them effectively remains unanswered. This paper contributes to addressing this question by studying the extent to which the distributed SGD algorithm, in the standard parameter-server architecture, can learn an accurate model despite (a) a fraction of the workers being malicious (Byzantine), and (b) the other fraction, whilst being honest, providing noisy information to the server to ensure differential privacy (DP). We first observe that the integration of standard practices in DP and BR is not straightforward. In fact, we show that many existing results on the convergence of distributed SGD under Byzantine faults, especially those relying on $(\alpha,f)$-Byzantine resilience, are rendered invalid when honest workers enforce DP. To circumvent this shortcoming, we revisit the theory of $(\alpha,f)$-BR to obtain an approximate convergence guarantee. Our analysis provides key insights on how to improve this guarantee through hyperparameter optimization. Essentially, our theoretical and empirical results show that (1) an imprudent combination of standard approaches to DP and BR might be fruitless, but (2) by carefully re-tuning the learning algorithm, we can obtain reasonable learning accuracy while simultaneously guaranteeing DP and BR.
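The precise noise calibration and the revisited $(\alpha,f)$-BR analysis are beyond a short sketch. The code below only illustrates the two ingredients the abstract combines: an honest worker's Gaussian-mechanism gradient (clipping plus noise, with the scale left as a free parameter) and a server-side robustly aggregated SGD step. Function names and parameters are illustrative assumptions.

```python
import numpy as np

def dp_noisy_gradient(gradient, clip_norm, noise_multiplier, rng):
    """Honest worker: clip the gradient and add Gaussian noise (Gaussian mechanism).
    The scale needed for a target (epsilon, delta) is not derived here."""
    norm = np.linalg.norm(gradient)
    clipped = gradient * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(scale=noise_multiplier * clip_norm, size=gradient.shape)

def robust_sgd_step(model, per_worker_grads, aggregator, lr):
    """Server: robustly aggregate the (noisy, possibly Byzantine) gradients and update."""
    return model - lr * aggregator(per_worker_grads)
```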
Robust Collaborative Learning with Linear Gradient Overhead
Farhadkhani, Sadegh, Guerraoui, Rachid, Gupta, Nirupam, Hoang, Lê Nguyên, Pinot, Rafael, Stephan, John
Collaborative learning algorithms, such as distributed SGD (or D-SGD), are prone to faulty machines that may deviate from their prescribed algorithm because of software or hardware bugs, poisoned data or malicious behaviors. While many solutions have been proposed to enhance the robustness of D-SGD to such machines, previous works either resort to strong assumptions (trusted server, homogeneous data, specific noise model) or impose a gradient computational cost that is several orders of magnitude higher than that of D-SGD. We present MoNNA, a new algorithm that (a) is provably robust under standard assumptions and (b) has a gradient computation overhead that is linear in the fraction of faulty machines, which is conjectured to be tight. Essentially, MoNNA uses Polyak's momentum of local gradients for local updates and nearest-neighbor averaging (NNA) for global mixing. While MoNNA is rather simple to implement, its analysis has been more challenging and relies on two key elements that may be of independent interest. Specifically, we introduce the mixing criterion of $(\alpha, \lambda)$-reduction to analyze the non-linear mixing of non-faulty machines, and present a way to control the tension between the momentum and the model drifts. We validate our theory by experiments on image classification and make our code available at https://github.com/LPD-EPFL/robust-collaborative-learning.
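As a rough illustration of the two ingredients named above, the sketch below implements a nearest-neighbor-averaging mixing step and a Polyak momentum update. How MoNNA interleaves them, and the momentum coefficient, are assumptions for illustration; the authors' implementation is in the repository linked above.

```python
import numpy as np

def nearest_neighbor_average(own_params, received_params, num_faulty):
    """NNA-style mixing: keep the n - f received parameter vectors closest
    (in Euclidean distance) to the machine's own parameters and average them."""
    vectors = np.stack(received_params)
    distances = np.linalg.norm(vectors - own_params, axis=1)
    keep = np.argsort(distances)[:len(received_params) - num_faulty]
    return vectors[keep].mean(axis=0)

def momentum_update(momentum, gradient, beta=0.9):
    """Polyak momentum on the local gradient (beta chosen arbitrarily here)."""
    return beta * momentum + (1.0 - beta) * gradient
```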
On the Privacy-Robustness-Utility Trilemma in Distributed Learning
Allouah, Youssef, Guerraoui, Rachid, Gupta, Nirupam, Pinot, Rafael, Stephan, John
The ubiquity of distributed machine learning (ML) in sensitive public domain applications calls for algorithms that protect data privacy, while being robust to faults and adversarial behaviors. Although privacy and robustness have been extensively studied independently in distributed ML, their synthesis remains poorly understood. We present the first tight analysis of the error incurred by any algorithm ensuring robustness against a fraction of adversarial machines, as well as differential privacy (DP) for honest machines' data against any other curious entity. Our analysis exhibits a fundamental trade-off between privacy, robustness, and utility. To prove our lower bound, we consider the case of mean estimation, subject to distributed DP and robustness constraints, and devise reductions to centralized estimation of one-way marginals. We prove our matching upper bound by presenting a new distributed ML algorithm using a high-dimensional robust aggregation rule. The latter amortizes the dependence on the dimension in the error (caused by adversarial workers and DP), while being agnostic to the statistical properties of the data.
On the Impossible Safety of Large AI Models
El-Mhamdi, El-Mahdi, Farhadkhani, Sadegh, Guerraoui, Rachid, Gupta, Nirupam, Hoang, Lê-Nguyên, Pinot, Rafael, Rouault, Sébastien, Stephan, John
Large AI Models (LAIMs), of which large language models are the most prominent recent example, showcase impressive performance. However, they have been empirically found to pose serious security issues. This paper systematizes our knowledge about the fundamental impossibility of building arbitrarily accurate and secure machine learning models. More precisely, we identify key challenging features of many of today's machine learning settings. Namely, high accuracy seems to require memorizing large training datasets, which are often user-generated and highly heterogeneous, with both sensitive information and fake users. We then survey statistical lower bounds that, we argue, constitute a compelling case against the possibility of designing high-accuracy LAIMs with strong security guarantees.
Fixing by Mixing: A Recipe for Optimal Byzantine ML under Heterogeneity
Allouah, Youssef, Farhadkhani, Sadegh, Guerraoui, Rachid, Gupta, Nirupam, Pinot, Rafael, Stephan, John
Byzantine machine learning (ML) aims to ensure the resilience of distributed learning algorithms to misbehaving (or Byzantine) machines. Although this problem has received significant attention, prior works often assume the data held by the machines to be homogeneous, which is seldom true in practical settings. Data heterogeneity makes Byzantine ML considerably more challenging, since a Byzantine machine can hardly be distinguished from a non-Byzantine outlier. A few solutions have been proposed to tackle this issue, but these provide suboptimal probabilistic guarantees and fare poorly in practice. This paper closes the theoretical gap, achieving optimality and inducing good empirical results. In fact, we show how to automatically adapt existing solutions for (homogeneous) Byzantine ML to the heterogeneous setting through a powerful mechanism we call nearest neighbor mixing (NNM), which boosts any standard robust distributed gradient descent variant to yield optimal Byzantine resilience under heterogeneity. We obtain similar guarantees (in expectation) by plugging NNM into the distributed stochastic heavy ball method, a practical substitute for distributed gradient descent. We obtain empirical results that significantly outperform state-of-the-art Byzantine ML solutions.
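The mixing mechanism lends itself to a short sketch: each gradient is replaced by the average of the n - f gradients closest to it before any standard robust aggregator is applied. The code below is a plain numpy illustration of that idea, not the authors' implementation.

```python
import numpy as np

def nearest_neighbor_mixing(gradients, num_byzantine):
    """NNM-style pre-processing: replace every gradient by the average of the
    n - f gradients closest to it (in Euclidean distance, including itself),
    then hand the mixed vectors to any standard robust aggregator."""
    vectors = np.stack(gradients)
    n = len(gradients)
    mixed = np.empty_like(vectors)
    for i in range(n):
        distances = np.linalg.norm(vectors - vectors[i], axis=1)
        neighbors = np.argsort(distances)[: n - num_byzantine]
        mixed[i] = vectors[neighbors].mean(axis=0)
    return [mixed[i] for i in range(n)]

# Usage: mix, then apply an off-the-shelf robust rule such as coordinate-wise median.
# robust_grad = np.median(np.stack(nearest_neighbor_mixing(grads, f)), axis=0)
```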