Rini, Stefano
PAUSE: Low-Latency and Privacy-Aware Active User Selection for Federated Learning
Peleg, Ori, Lang, Natalie, Rini, Stefano, Shlezinger, Nir, Cohen, Kobi
Federated learning (FL) enables multiple edge devices to collaboratively train a machine learning model without the need to share potentially private data. FL proceeds through iterative exchanges of model updates, which pose two key challenges: first, the accumulation of privacy leakage over time and, second, communication latency. These two limitations are typically addressed separately: the former via perturbed updates to enhance privacy, and the latter via user selection to mitigate latency, both at the expense of accuracy. In this work, we propose a method that jointly addresses the accumulation of privacy leakage and communication latency via active user selection, aiming to improve the trade-off among privacy, latency, and model performance. To achieve this, we construct a reward function that accounts for these three objectives. Building on this reward, we propose a multi-armed bandit (MAB)-based algorithm, termed Privacy-aware Active User SElection (PAUSE), which dynamically selects a subset of users each round while ensuring bounded overall privacy leakage. Our theoretical analysis shows that the reward growth rate of PAUSE matches the best-known rate in the MAB literature. To address the complexity overhead of active user selection, we propose a simulated annealing-based relaxation of PAUSE and analyze its ability to approximate the reward-maximizing policy at reduced complexity. We numerically validate the privacy leakage, latency improvements, and accuracy gains of our methods for federated training in various scenarios.
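To make the selection mechanism concrete, the following is a minimal sketch of a UCB-style bandit user selection with a per-user privacy-leakage budget; the reward model, the unit-leakage accounting, and all parameters are illustrative assumptions, not the PAUSE algorithm itself.

```python
# A minimal sketch, not the authors' PAUSE implementation: UCB-style active
# user selection where users exceeding a privacy-leakage budget are excluded.
import numpy as np

rng = np.random.default_rng(0)
N, m, T, budget = 10, 3, 200, 40           # users, picks/round, rounds, max selections
counts = np.zeros(N)                        # times each user was selected
means = np.zeros(N)                         # empirical per-user reward
leakage = np.zeros(N)                       # accumulated privacy leakage (proxy)

for t in range(1, T + 1):
    ucb = means + np.sqrt(2 * np.log(t) / np.maximum(counts, 1))
    ucb[counts == 0] = np.inf               # force initial exploration
    ucb[leakage >= budget] = -np.inf        # exclude users over their budget
    chosen = np.argsort(ucb)[-m:]           # select the m highest-UCB users
    for i in chosen:
        r = rng.normal(0.5 + 0.05 * i, 0.1) # stand-in reward: utility minus latency
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]
        leakage[i] += 1.0                   # assumed: each participation leaks one unit
```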
The Query/Hit Model for Sequential Hypothesis Testing
Shariatnasab, Mahshad, Rini, Stefano, Shirani, Farhad, Iyengar, S. Sitharama
This work introduces the Query/Hit (Q/H) learning model. The setup consists of two agents. One agent, Alice, has access to a streaming source, while the other, Bob, does not have direct access to the source. Communication occurs through sequential Q/H pairs: Bob sends a sequence of source symbols (queries), and Alice responds with the waiting time until each query appears in the source stream (hits). This model is motivated by scenarios with communication, computation, and privacy constraints that limit real-time access to the source. The error exponent for sequential hypothesis testing under the Q/H model is characterized, and a querying strategy, the Dynamic Scout-Sentinel Algorithm (DSSA), is proposed. The strategy employs a mutual information neural estimator to compute the error exponent associated with each query and to select the query with the highest efficiency. Extensive empirical evaluations on both synthetic and real-world datasets -- including mouse movement trajectories, typesetting patterns, and touch-based user interactions -- are provided to evaluate the performance of the proposed strategy in comparison with baselines, in terms of probability of error, query choice, and time-to-detection.
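As a toy illustration of the Q/H interaction, the sketch below simulates geometric hit times for an i.i.d. source and scores each candidate query by how well its waiting times separate two hypotheses; the separability proxy is a stand-in assumption, not the DSSA's neural error-exponent estimator.

```python
# A toy sketch of the Query/Hit interaction (not the paper's DSSA): Bob probes
# a symbol, Alice returns the waiting time until it appears in the stream, and
# queries are scored by how well their hit times separate two hypotheses.
import numpy as np

rng = np.random.default_rng(1)
P0 = np.array([0.7, 0.2, 0.1])             # symbol distribution under H0
P1 = np.array([0.2, 0.3, 0.5])             # symbol distribution under H1

def waiting_time(p_symbol):
    """Hit time of a symbol in an i.i.d. stream: geometric with success prob p."""
    return rng.geometric(p_symbol)

def score_query(q, n=2000):
    """Crude separability proxy: gap between mean log hit times under H0/H1."""
    t0 = np.log([waiting_time(P0[q]) for _ in range(n)])
    t1 = np.log([waiting_time(P1[q]) for _ in range(n)])
    return abs(t0.mean() - t1.mean())

best = max(range(3), key=score_query)       # most discriminative single query
print("query symbol:", best)
```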
Age-of-Gradient Updates for Federated Learning over Random Access Channels
Wu, Yu Heng, Asgari, Houman, Rini, Stefano, Munari, Andrea
This paper studies the problem of federated training of a deep neural network (DNN) over a random access channel (RACH), as found in computer networks, wireless networks, and cellular systems. More precisely, a set of remote users participate in training a centralized DNN model using SGD under the coordination of a parameter server (PS). The local model updates are transmitted from the remote users to the PS over a RACH using a slotted ALOHA protocol. The PS collects the updates from the remote users, accumulates them, and sends central model updates to the users at regular time intervals. We refer to this setting as the RACH-FL setting. The RACH-FL setting crucially poses the problem of jointly designing (i) a client selection and (ii) a gradient compression strategy that together address the communication constraints between the remote users and the PS when transmission occurs over a RACH. For the RACH-FL setting, we propose a policy, which we term the "age-of-gradient" (AoG) policy, in which (i) gradient sparsification is performed using top-K sparsification, (ii) error correction is performed using memory accumulation, and (iii) the slot transmission decision is made by comparing the difference between the current local memory magnitude and the magnitude of the gradient update to a threshold. Intuitively, the AoG measure of the "freshness" of the memory state is reminiscent of the concept of age-of-information (AoI) in communication theory and provides a rather natural interpretation of this policy. Numerical simulations show the superior performance of the AoG policy as compared to other RACH-FL policies.
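A minimal sketch of one client step under an AoG-style rule follows; the sizes, the threshold, and the exact memory bookkeeping are illustrative guesses rather than the paper's specification.

```python
# A sketch (assumptions marked) of an AoG-style client update: top-K
# sparsification with error memory, and a threshold test on the residual
# memory magnitude for the slotted-ALOHA transmission decision.
import numpy as np

def aog_step(grad, memory, k, tau):
    """Return (sparse_update, new_memory, transmit) for one RACH slot."""
    acc = memory + grad                      # error accumulation
    idx = np.argsort(np.abs(acc))[-k:]       # top-K coordinates by magnitude
    update = np.zeros_like(acc)
    update[idx] = acc[idx]                   # sparsified update to send
    residual = acc - update                  # what remains in local memory
    # transmit when the accumulated memory mass exceeds the fresh-gradient
    # mass by a threshold (simplified form of the paper's rule)
    transmit = (np.linalg.norm(acc) - np.linalg.norm(grad)) > tau
    return update, (residual if transmit else acc), transmit

g = np.random.default_rng(2).normal(size=100)
u, mem, tx = aog_step(g, np.zeros(100), k=10, tau=0.5)
print("transmit this slot:", bool(tx))
```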
HAAQI-Net: A non-intrusive neural music quality assessment model for hearing aids
Wisnu, Dyah A. M. G., Pratiwi, Epri W., Rini, Stefano, Zezario, Ryandhimas E., Wang, Hsin-Min, Tsao, Yu
This paper introduces HAAQI-Net, a non-intrusive deep learning model for music quality assessment tailored to hearing aid users. In contrast to traditional methods like the Hearing Aid Audio Quality Index (HAAQI), HAAQI-Net utilizes a Bidirectional Long Short-Term Memory (BLSTM) network with attention. It takes an assessed music sample and a hearing loss pattern as input and generates a predicted HAAQI score. The model employs the pre-trained Bidirectional Encoder representation from Audio Transformers (BEATs) model for acoustic feature extraction. When comparing predicted scores with ground truth, HAAQI-Net achieves a Linear Correlation Coefficient (LCC) of 0.9368, a Spearman's Rank Correlation Coefficient (SRCC) of 0.9486, and a Mean Squared Error (MSE) of 0.0064. Notably, this high performance comes with a substantial reduction in inference time, from 62.52 seconds (HAAQI) to 2.54 seconds (HAAQI-Net), making HAAQI-Net an efficient music quality assessment model for hearing aid users.
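For orientation, here is a minimal PyTorch sketch of a BLSTM-with-attention score regressor in the spirit of HAAQI-Net; the feature dimension (BEATs-like frame features), the layer sizes, and the omission of the hearing-loss-pattern input are all simplifying assumptions.

```python
# A minimal sketch (not the authors' code) of a BLSTM-with-attention quality
# score predictor; the hearing-loss-pattern input is omitted for brevity.
import torch
import torch.nn as nn

class BLSTMAttentionRegressor(nn.Module):
    def __init__(self, feat_dim=768, hidden=128):
        super().__init__()
        self.blstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                             bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)    # additive attention scores
        self.head = nn.Linear(2 * hidden, 1)    # scalar quality score

    def forward(self, feats):                   # feats: (batch, time, feat_dim)
        h, _ = self.blstm(feats)                # (batch, time, 2*hidden)
        w = torch.softmax(self.attn(h), dim=1)  # attention over time frames
        ctx = (w * h).sum(dim=1)                # weighted temporal pooling
        return torch.sigmoid(self.head(ctx)).squeeze(-1)  # score in [0, 1]

model = BLSTMAttentionRegressor()
x = torch.randn(4, 200, 768)                    # e.g., BEATs-like frame features
print(model(x).shape)                           # torch.Size([4])
```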
Empirical Risk Minimization with Relative Entropy Regularization
Perlaza, Samir M., Bisson, Gaetan, Esnaola, Iñaki, Jean-Marie, Alain, Rini, Stefano
The empirical risk minimization (ERM) problem with relative entropy regularization (ERM-RER) is investigated under the assumption that the reference measure is a σ-finite measure, and not necessarily a probability measure. Under this assumption, which leads to a generalization of the ERM-RER problem allowing a larger degree of flexibility for incorporating prior knowledge, numerous relevant properties are established. Among these properties, the solution to this problem, if it exists, is shown to be a unique probability measure, often mutually absolutely continuous with the reference measure. Such a solution exhibits a probably-approximately-correct guarantee for the ERM problem, independently of whether the latter possesses a solution. For a fixed dataset, the empirical risk is shown to be a sub-Gaussian random variable when the models are sampled from the solution to the ERM-RER problem. The generalization capabilities of the solution to the ERM-RER problem (the Gibbs algorithm) are studied via the sensitivity of the expected empirical risk to deviations from such a solution towards alternative probability measures. Finally, an interesting connection among sensitivity, generalization error, and lautum information is established.
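On a finite model set the ERM-RER solution takes the Gibbs form dP(θ) ∝ exp(−L(θ)/λ) dQ(θ), for reference measure Q, empirical risk L, and regularization parameter λ; the toy numbers below illustrate this closed form (the risks, Q, and λ are made up).

```python
# A minimal numeric sketch of the ERM-RER solution on a finite model set:
# the Gibbs measure dP/dQ(theta) proportional to exp(-L(theta)/lambda).
import numpy as np

risks = np.array([0.9, 0.4, 0.1, 0.3])     # empirical risk of each model (toy)
q = np.array([0.25, 0.25, 0.25, 0.25])     # reference measure (here uniform)
lam = 0.2                                   # relative-entropy regularization

logw = np.log(q) - risks / lam              # log unnormalized Gibbs weights
gibbs = np.exp(logw - logw.max())           # stabilize before normalizing
gibbs /= gibbs.sum()                        # the unique ERM-RER solution
print(gibbs.round(3))                       # mass concentrates on low risk
```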
How to Attain Communication-Efficient DNN Training? Convert, Compress, Correct
Chen, Zhong-Jing, Hernandez, Eduin E., Huang, Yu-Chih, Rini, Stefano
This paper introduces CO3 -- an algorithm for communication-efficient federated Deep Neural Network (DNN) training. CO3 takes its name from the three processing steps applied to reduce the communication load when transmitting the local DNN gradients from the remote users to the Parameter Server, namely: (i) gradient quantization through floating-point conversion, (ii) lossless compression of the quantized gradient, and (iii) quantization error correction. We carefully design each of the steps above to ensure good training performance under a constraint on the communication rate. In particular, in steps (i) and (ii), we adopt the assumption that DNN gradients are distributed according to a generalized normal distribution, which is validated numerically in the paper. For step (iii), we utilize an error-feedback mechanism with memory decay to correct the quantization error introduced in step (i). We argue that the memory decay coefficient, similarly to the learning rate, can be optimally tuned to improve convergence. A rigorous convergence analysis of the proposed CO3 with SGD is provided. Moreover, with extensive simulations, we show that CO3 offers improved performance when compared with existing gradient compression schemes in the literature that employ sketching and non-uniform quantization of the local gradients.
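A compact sketch of the three stages under simplifying assumptions is given below: float conversion to fp16 stands in for step (i), zlib for step (ii), and a decayed error memory for step (iii); this is not the paper's tuned pipeline.

```python
# A sketch of the three CO3 stages under simplifying assumptions: (i) float
# conversion to fp16, (ii) lossless zlib compression, (iii) error feedback
# with a memory-decay coefficient beta.
import zlib
import numpy as np

def co3_step(grad, memory, beta=0.9):
    carried = grad + beta * memory               # (iii) decayed error feedback
    q = carried.astype(np.float16)               # (i) floating-point conversion
    payload = zlib.compress(q.tobytes())         # (ii) lossless compression
    new_memory = carried - q.astype(np.float32)  # quantization error kept locally
    return payload, new_memory

rng = np.random.default_rng(3)
g = rng.standard_normal(10_000).astype(np.float32) * 1e-3
payload, mem = co3_step(g, np.zeros_like(g))
print(len(payload), "bytes sent for", g.nbytes, "raw bytes")
```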
M22: A Communication-Efficient Algorithm for Federated Learning Inspired by Rate-Distortion
Liu, Yangyi, Rini, Stefano, Salehkalaibar, Sadaf, Chen, Jun
In federated learning (FL), the communication constraint between the remote learners and the Parameter Server (PS) is a crucial bottleneck. For this reason, model updates must be compressed so as to minimize the loss in accuracy resulting from the communication constraint. This paper proposes the "M-magnitude weighted L2 distortion + 2 degrees of freedom" (M22) algorithm, a rate-distortion-inspired approach to gradient compression for federated training of deep neural networks (DNNs). In particular, we propose a family of distortion measures between the original gradient and its reconstruction, which we refer to as the "M-magnitude weighted L2" distortion, and we assume that gradient updates follow an i.i.d. distribution -- generalized normal or Weibull, which have two degrees of freedom. Both the distortion measure and the gradient distribution thus have one free parameter, which can be fitted as a function of the iteration number. Given a choice of gradient distribution and distortion measure, we design the quantizer that minimizes the expected distortion in the gradient reconstruction. To measure gradient compression performance under a communication constraint, we define the per-bit accuracy as the optimal improvement in accuracy that one bit of communication brings to the centralized model over the training period. Using this performance measure, we systematically benchmark the choices of gradient distribution and distortion measure. We provide substantial insights on the role of these choices and argue that significant performance improvements can be attained using such a rate-distortion-inspired compressor.
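To illustrate the design principle, the following is a toy Lloyd-style quantizer for a magnitude-weighted squared distortion w(x)(x − c)^2 with w(x) = |x|^M, for which the optimal centroid of each cell is the w-weighted mean; the Gaussian samples, M value, and codebook size are illustrative assumptions, not the paper's fitted design.

```python
# A toy weighted-Lloyd quantizer sketch for an M-magnitude weighted L2
# distortion; since w(x) does not depend on the codeword, nearest-codeword
# assignment remains optimal while centroids become w-weighted means.
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(50_000)             # stand-in for gradient samples
M, levels = 2.0, 8
w = np.abs(x) ** M                           # magnitude weighting
c = np.quantile(x, np.linspace(0.05, 0.95, levels))  # initial codebook

for _ in range(50):
    cell = np.argmin(np.abs(x[:, None] - c[None, :]), axis=1)
    for j in range(levels):                  # weighted centroid update
        sel = cell == j
        if sel.any():
            c[j] = np.average(x[sel], weights=w[sel])
print(c.round(3))                            # codewords pushed toward large |x|
```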
Hierarchical Causal Bandit
Song, Ruiyang, Rini, Stefano, Xu, Kuang
The causal bandit is a nascent learning model in which an agent sequentially experiments in a causal network of variables in order to identify the reward-maximizing intervention. Despite the model's wide applicability, existing analytical results are largely restricted to a parallel bandit version where all variables are mutually independent. In this work, we introduce the hierarchical causal bandit model as a viable path towards understanding general causal bandits with dependent variables. The core idea is to incorporate a contextual variable that captures the interaction among all variables with direct effects. Using this hierarchical framework, we derive sharp insights on algorithmic design in causal bandits with dependent arms and obtain nearly matching regret bounds in the case of a binary context.
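As a rough illustration of the binary-context regime, the sketch below keeps one UCB index per (intervention, context) pair; the reward model and context dynamics are made-up assumptions and do not reproduce the paper's algorithm or bounds.

```python
# A toy sketch in the spirit of the hierarchical model: a binary context
# mediates the arms' effect on the reward, and a UCB index is maintained
# per (intervention, context) pair.
import numpy as np

rng = np.random.default_rng(5)
K, T = 4, 5000
counts = np.zeros((K, 2))                   # pulls per (arm, context)
means = np.zeros((K, 2))                    # empirical reward per pair

for t in range(1, T + 1):
    z = rng.integers(2)                     # observed binary context
    ucb = means[:, z] + np.sqrt(2 * np.log(t) / np.maximum(counts[:, z], 1))
    ucb[counts[:, z] == 0] = np.inf         # explore unseen pairs first
    a = int(np.argmax(ucb))
    r = rng.normal(0.2 * a + 0.5 * z, 1.0)  # context-dependent mean reward (toy)
    counts[a, z] += 1
    means[a, z] += (r - means[a, z]) / counts[a, z]
```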
Efficient Federated Learning over Multiple Access Channel with Differential Privacy Constraints
Sonee, Amir, Rini, Stefano
In this paper, the problem of federated learning (FL) over a multiple access channel (MAC) is considered. More precisely, we consider the FL setting in which clients are prompted to train a machine learning model through simultaneous communications with a parameter server (PS), with the aim of better utilizing the computational resources available in the network. We also consider the additional constraint in which the communication between the users and the PS is subject to a privacy constraint. To minimize the training loss while also satisfying the privacy rate constraint over the MAC, each client transmits a digital variant of distributed stochastic gradient descent (D-DSGD) updates. Additionally, binomial noise is added at each user to preserve the privacy of the transmission. The quantization levels in D-DSGD and the binomial noise parameters that are optimal for convergence efficiency are investigated, subject to the privacy constraint and the capacity limit of the MAC.
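A minimal sketch of one D-DSGD client encoding step follows, under assumed parameters: stochastic uniform quantization of the clipped gradient to q levels, plus mean-shifted binomial noise before transmission; the clipping, level count, and noise parameters are illustrative, not the paper's optimized values.

```python
# A sketch of a D-DSGD-style client encoder with binomial privacy noise:
# quantize the gradient to integer levels, then add zero-mean binomial noise.
import numpy as np

rng = np.random.default_rng(6)

def ddsgd_encode(grad, q=16, n=8, clip=1.0):
    g = np.clip(grad, -clip, clip)
    scaled = (g + clip) / (2 * clip) * (q - 1)           # map to [0, q-1]
    low = np.floor(scaled)
    levels = low + (rng.random(g.shape) < scaled - low)  # stochastic rounding
    noise = rng.binomial(n, 0.5, g.shape) - n // 2       # zero-mean binomial noise
    return levels + noise                                # integers sent over the MAC

print(ddsgd_encode(rng.normal(scale=0.3, size=5)))
```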