Colin, Igor
Differentially Private Policy Gradient
Rio, Alexandre, Barlier, Merwan, Colin, Igor
Motivated by the increasing deployment of reinforcement learning in the real world, which involves the consumption of large amounts of personal data, we introduce a differentially private (DP) policy gradient algorithm. We show that, in this setting, introducing differential privacy can be reduced to the computation of appropriate trust regions, thereby preserving the theoretical properties of the non-private methods. It is thus possible to find the right trade-off between privacy noise and trust-region size to obtain a performant differentially private policy gradient algorithm. We then demonstrate its performance empirically on various benchmarks. Our results, and the complexity of the tasks addressed, represent a significant improvement over existing DP algorithms in online RL.
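The abstract gives no implementation details, so the following is a minimal sketch of the generic mechanism it alludes to: per-trajectory gradient clipping plus Gaussian noise (the standard DP-SGD recipe), with the trust region realized as a cap on the update norm. All names and hyper-parameters are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def dp_policy_gradient_step(theta, per_traj_grads, clip=1.0,
                            noise_mult=1.0, trust_radius=0.1, lr=0.05,
                            rng=np.random.default_rng(0)):
    """Hedged sketch of one DP policy-gradient update.

    per_traj_grads: (B, d) array, one gradient per trajectory.
    Clipping + Gaussian noise gives per-step DP via the Gaussian
    mechanism; the trust region caps the resulting step size.
    """
    B, d = per_traj_grads.shape
    # Clip each trajectory's gradient to bound its sensitivity.
    norms = np.linalg.norm(per_traj_grads, axis=1, keepdims=True)
    clipped = per_traj_grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    # Average and privatize with calibrated Gaussian noise.
    noisy_grad = clipped.mean(axis=0) + \
        rng.normal(0.0, noise_mult * clip / B, size=d)
    # Trust region: shrink the step if it would leave the ball.
    step = -lr * noisy_grad
    step_norm = np.linalg.norm(step)
    if step_norm > trust_radius:
        step *= trust_radius / step_norm
    return theta + step
```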
Differentially Private Model-Based Offline Reinforcement Learning
Rio, Alexandre, Barlier, Merwan, Colin, Igor, Thomas, Albert
We address offline reinforcement learning with privacy guarantees, where the goal is to train a policy that is differentially private with respect to individual trajectories in the dataset. To achieve this, we introduce DP-MORL, a model-based reinforcement learning (MBRL) algorithm with differential privacy guarantees. A private model of the environment is first learned from offline data using DP-FedAvg, a training method for neural networks that provides differential privacy guarantees at the trajectory level. We then use model-based policy optimization to derive a policy from the (penalized) private model, without any further interaction with the system or access to the input data. We empirically show that DP-MORL enables the training of private RL agents from offline data, and we further outline the price of privacy in this setting.
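DP-FedAvg is an existing training method; the sketch below only illustrates its core clip-and-noise step applied at the trajectory level, as the abstract describes. Shapes, names, and noise calibration are assumptions rather than DP-MORL's actual implementation.

```python
import numpy as np

def dp_fedavg_round(model_params, per_traj_updates, clip=1.0,
                    noise_mult=1.0, rng=np.random.default_rng(0)):
    """Aggregate per-trajectory model updates with trajectory-level DP.

    per_traj_updates: (T, d) array, one dynamics-model update per
    trajectory (the privacy "unit" here). Clipping bounds each
    trajectory's influence; Gaussian noise privatizes the average.
    """
    T, d = per_traj_updates.shape
    norms = np.linalg.norm(per_traj_updates, axis=1, keepdims=True)
    clipped = per_traj_updates * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    noisy_mean = clipped.mean(axis=0) + \
        rng.normal(0.0, noise_mult * clip / T, size=d)
    return model_params + noisy_mean
```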
Clustered Multi-Agent Linear Bandits
Cherkaoui, Hamza, Barlier, Merwan, Colin, Igor
We address in this paper a particular instance of the multi-agent linear stochastic bandit problem, called clustered multi-agent linear bandits. In this setting, we propose a novel algorithm that leverages efficient collaboration between agents to accelerate the overall optimization. A network controller is responsible for estimating the underlying cluster structure of the network and for optimizing experience sharing among agents within the same group. We provide a theoretical analysis of both the regret minimization problem and the clustering quality. Through empirical evaluation against state-of-the-art algorithms on both synthetic and real data, we demonstrate the effectiveness of our approach: our algorithm significantly improves regret minimization while managing to recover the true underlying cluster partitioning.
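As a rough illustration of the controller's two roles, the hypothetical sketch below clusters agents whose ridge-regression estimates are close and then pools statistics within each cluster; the threshold-based clustering rule is an assumption, not the paper's.

```python
import numpy as np

def cluster_and_pool(V, b, threshold=0.5, lam=1.0):
    """Hedged sketch: cluster agents with close ridge estimates,
    then pool (share) their statistics within each cluster.

    V: (N, d, d) per-agent design matrices; b: (N, d) per-agent
    response vectors.
    """
    N, d = b.shape
    thetas = np.stack([np.linalg.solve(V[i] + lam * np.eye(d), b[i])
                       for i in range(N)])
    # Greedy clustering: an agent joins the first cluster whose
    # representative estimate is within `threshold`.
    clusters = []
    for i in range(N):
        for c in clusters:
            if np.linalg.norm(thetas[i] - thetas[c[0]]) < threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    # Pooled (shared-experience) estimate per cluster.
    pooled = {}
    for c in clusters:
        Vc = V[c].sum(axis=0) + lam * np.eye(d)
        bc = b[c].sum(axis=0)
        pooled[tuple(c)] = np.linalg.solve(Vc, bc)
    return pooled
```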
Price of Safety in Linear Best Arm Identification
Shang, Xuedong, Colin, Igor, Barlier, Merwan, Cherkaoui, Hamza
We introduce the safe best-arm identification framework with linear feedback, where the agent is subject to a stage-wise safety constraint that depends linearly on an unknown parameter vector. The agent must act conservatively so as to ensure that the safety constraint is not violated with high probability at each round. Leveraging the linear structure to ensure safety has been studied for regret minimization but, to the best of our knowledge, not for best-arm identification. We propose a gap-based algorithm that achieves meaningful sample complexity while ensuring stage-wise safety. We show that we pay an extra term in the sample complexity due to the forced-exploration phase incurred by the additional safety constraint. Experimental illustrations are provided to justify the design of our algorithm.
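The stage-wise safety filter can be sketched as follows: an arm is played only if an upper confidence bound on its unknown safety cost stays below the budget, and is otherwise shrunk toward a known safe arm, which is what forces the extra exploration. All names (A, c_hat, beta, budget) are illustrative, not the paper's notation.

```python
import numpy as np

def pick_safe_arm(candidate, safe_arm, A, c_hat, beta, budget):
    """Hedged sketch of a stage-wise safety filter.

    A: (d, d) regularized design matrix of past safety observations;
    c_hat: (d,) estimate of the unknown safety parameter;
    beta: confidence radius. Play `candidate` only if the UCB of
    its safety cost stays under `budget` with high probability.
    """
    A_inv = np.linalg.inv(A)

    def safety_ucb(a):
        width = beta * np.sqrt(a @ A_inv @ a)
        return a @ c_hat + width

    if safety_ucb(candidate) <= budget:
        return candidate
    # Otherwise mix toward the known safe arm until the bound holds.
    for alpha in np.linspace(0.9, 0.0, 10):
        a = alpha * candidate + (1 - alpha) * safe_arm
        if safety_ucb(a) <= budget:
            return a
    return safe_arm
```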
Theoretical Limits of Pipeline Parallel Optimization and Application to Distributed Deep Learning
Colin, Igor, Santos, Ludovic Dos, Scaman, Kevin
We investigate the theoretical limits of pipeline parallel learning of deep learning architectures, a distributed setup in which the computation is distributed per layer instead of per example. For smooth convex and non-convex objective functions, we provide matching lower and upper complexity bounds and show that a naive pipeline parallelization of Nesterov's accelerated gradient descent is optimal. For non-smooth convex functions, we provide a novel algorithm, coined Pipeline Parallel Random Smoothing (PPRS), that is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension. While convergence still obeys the slow $\varepsilon^{-2}$ rate of non-smooth optimization, the depth-dependent part is accelerated, resulting in a near-linear speed-up and a convergence time that only slightly depends on the depth of the deep learning architecture. Finally, we perform an empirical analysis of the non-smooth non-convex case and show that, for difficult and highly non-smooth problems, PPRS outperforms more traditional optimization algorithms, such as gradient descent and Nesterov's accelerated gradient descent, when the sample size is limited, as in few-shot or adversarial learning.
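The randomized-smoothing estimator at the heart of such a method can be sketched as below: an unbiased gradient estimate of the Gaussian-smoothed objective built from independent perturbations, whose function evaluations are independent and can therefore be streamed through a layer pipeline back-to-back. This is a generic sketch, not the paper's exact estimator.

```python
import numpy as np

def smoothed_gradient(f, x, sigma=0.1, n_samples=16,
                      rng=np.random.default_rng(0)):
    """Hedged sketch of a randomized-smoothing gradient estimator
    for f_sigma(x) = E[f(x + sigma * u)], u ~ N(0, I), usable even
    when f itself is non-smooth.
    """
    d = x.shape[0]
    grad = np.zeros(d)
    for _ in range(n_samples):
        u = rng.normal(size=d)
        # Two-point estimator of the directional derivative along u.
        grad += (f(x + sigma * u) - f(x - sigma * u)) / (2 * sigma) * u
    return grad / n_samples

# Usage on a non-smooth function:
f = lambda x: np.abs(x).sum()
g = smoothed_gradient(f, np.array([1.0, -2.0, 0.5]))
```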
Parallel Contextual Bandits in Wireless Handover Optimization
Colin, Igor, Thomas, Albert, Draief, Moez
As cellular networks become denser, scalable and dynamic tuning of wireless base station parameters can only be achieved through automated optimization. Although the contextual bandit framework arises as a natural candidate for such a task, its extension to a parallel setting is not straightforward: one needs to carefully adapt existing methods to fully leverage the multi-agent structure of this problem. We propose two approaches: one derived from a deterministic UCB-like method and the other relying on Thompson sampling. Thanks to its Bayesian nature, the latter is expected to better preserve the exploration-exploitation balance within the bandit batch. This is verified on toy experiments, where Thompson sampling shows robustness to the variability of the contexts. Finally, we apply both methods to a real base station network dataset and show that Thompson sampling outperforms both manual tuning and contextual UCB.
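A minimal sketch of the batched linear Thompson sampling rule described above: each agent in the batch draws its own parameter sample from a shared Gaussian posterior, which keeps exploration diverse across the batch without coordination. The Gaussian posterior form and all names are assumptions, not the paper's exact model.

```python
import numpy as np

def batch_linear_ts(contexts, V, b, sigma2=1.0,
                    rng=np.random.default_rng(0)):
    """Hedged sketch of parallel linear Thompson sampling.

    contexts: (B, K, d) candidate actions per agent (base station);
    V, b: shared ridge-regression statistics. One posterior draw
    per agent naturally diversifies the batch's actions.
    """
    V_inv = np.linalg.inv(V)
    mu = V_inv @ b
    cov = sigma2 * V_inv
    actions = []
    for agent_contexts in contexts:          # one draw per agent
        theta = rng.multivariate_normal(mu, cov)
        actions.append(int(np.argmax(agent_contexts @ theta)))
    return actions
```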
Decentralized Topic Modelling with Latent Dirichlet Allocation
Colin, Igor, Dupuy, Christophe
Privacy-preserving networks can be modelled as decentralized networks (e.g., of sensors, connected objects, smartphones), where communication between nodes is not controlled by an all-knowing central node. For this type of network, the main issue is to gather/learn global information on the network (e.g., by optimizing a global cost function) while keeping the (sensitive) information at each node. In this work, we focus on text information that agents do not want to share (e.g., text messages, emails, confidential reports). We use recent advances in decentralized optimization and topic models to infer topics from a graph with limited communication. We propose a method to adapt the latent Dirichlet allocation (LDA) model to decentralized optimization and show on synthetic data that we recover parameters and per-node performance similar to those of stochastic methods that access the whole information in the graph.
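The decentralized primitive underlying such a scheme can be sketched as pairwise gossip averaging of local sufficient statistics (e.g., topic-word counts), so that raw documents never leave their node; this is a generic illustration, not the paper's exact update.

```python
import numpy as np

def gossip_average(stats, edges, n_rounds=100,
                   rng=np.random.default_rng(0)):
    """Hedged sketch: nodes repeatedly average local sufficient
    statistics (e.g. topic-word count matrices) with a random
    neighbour, so every node converges to the network-wide average
    without any raw document leaving its node.

    stats: (N, K, W) local statistics; edges: list of (i, j) pairs.
    """
    stats = stats.copy()
    for _ in range(n_rounds):
        i, j = edges[rng.integers(len(edges))]
        mean = 0.5 * (stats[i] + stats[j])
        stats[i] = stats[j] = mean
    return stats
```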
Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions
Colin, Igor, Bellet, Aurélien, Salmon, Joseph, Clémençon, Stéphan
In decentralized networks (of sensors, connected objects, etc.), there is an important need for efficient algorithms to optimize a global cost function, for instance to learn a global model from the local data collected by each computing unit. In this paper, we address the problem of decentralized minimization of pairwise functions of the data points, where these points are distributed over the nodes of a graph defining the communication topology of the network. This general problem finds applications in ranking, distance metric learning and graph inference, among others. We propose new gossip algorithms based on dual averaging which aim at solving such problems in both synchronous and asynchronous settings. The proposed framework is flexible enough to deal with constrained and regularized variants of the optimization problem. Our theoretical analysis reveals that the proposed algorithms preserve the convergence rate of centralized dual averaging up to an additive bias term. We present numerical simulations on Area Under the ROC Curve (AUC) maximization and metric learning problems which illustrate the practical interest of our approach.
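A heavily simplified, scalar-per-node sketch of the flavor of such algorithms: each node gossips its dual variable while an auxiliary observation travels through the network, so pairwise gradients are evaluated on well-mixed pairs. The update details here are assumptions, not the paper's algorithm.

```python
import numpy as np

def gossip_dual_averaging(x_local, edges, grad_pair, n_rounds=200,
                          rng=np.random.default_rng(0)):
    """Hedged sketch for pairwise objectives sum_{i<j} f(theta; x_i, x_j).

    grad_pair(theta, xi, yk) returns the gradient of f at theta on
    the pair (xi, yk). Each node keeps a dual variable z and an
    auxiliary observation y that travels during gossip.
    """
    N = len(x_local)
    theta = np.zeros(N)            # primal iterate at each node (scalar)
    z = np.zeros(N)                # dual (accumulated-gradient) variables
    y = x_local.copy()             # auxiliary observations that travel
    for t in range(1, n_rounds + 1):
        i, j = edges[rng.integers(len(edges))]
        y[i], y[j] = y[j], y[i]            # data propagation step
        z[[i, j]] = 0.5 * (z[i] + z[j])    # gossip-average the duals
        for k in (i, j):
            z[k] += grad_pair(theta[k], x_local[k], y[k])
            theta[k] = -z[k] / np.sqrt(t)  # dual averaging update
    return theta
```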
Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics
Clémençon, Stéphan, Bellet, Aurélien, Colin, Igor
In a wide range of statistical learning problems such as ranking, clustering or metric learning among others, the risk is accurately estimated by $U$-statistics of degree $d\geq 1$, i.e. functionals of the training data with low variance that take the form of averages over $d$-tuples. From a computational perspective, the calculation of such statistics is highly expensive even for a moderate sample size $n$, as it requires averaging $O(n^d)$ terms. This makes learning procedures relying on the optimization of such data functionals hardly feasible in practice. The major goal of this paper is to show that, strikingly, such empirical risks can be replaced by computationally far simpler Monte-Carlo estimates based on $O(n)$ terms only, usually referred to as incomplete $U$-statistics, without damaging the $O_{\mathbb{P}}(1/\sqrt{n})$ learning rate of Empirical Risk Minimization (ERM) procedures. For this purpose, we establish uniform deviation results describing the error made when approximating a $U$-process by its incomplete version under appropriate complexity assumptions. Extensions to model selection, fast rate situations and various sampling techniques are also considered, as well as an application to stochastic gradient descent for ERM. Finally, numerical examples are displayed in order to provide strong empirical evidence that the approach we promote largely surpasses more naive subsampling techniques.
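The central idea is easy to demonstrate for degree $d = 2$: replace the full average over all $O(n^2)$ pairs with a Monte-Carlo average over $B = O(n)$ sampled pairs. A minimal sketch, with the empirical variance as a worked example:

```python
import numpy as np

def complete_U(x, h):
    """Full degree-2 U-statistic: average of h over all n(n-1)/2 pairs."""
    n = len(x)
    return np.mean([h(x[i], x[j]) for i in range(n) for j in range(i + 1, n)])

def incomplete_U(x, h, B, rng=np.random.default_rng(0)):
    """Incomplete U-statistic: average of h over B pairs drawn with
    replacement; with B = O(n) the O_P(1/sqrt(n)) rate is preserved
    at a fraction of the cost."""
    n = len(x)
    idx = rng.integers(0, n, size=(B, 2))
    idx = idx[idx[:, 0] != idx[:, 1]]          # drop degenerate pairs
    return np.mean([h(x[i], x[j]) for i, j in idx])

# Example: empirical variance via the kernel h(a, b) = (a - b)^2 / 2.
x = np.random.default_rng(1).normal(size=500)
h = lambda a, b: 0.5 * (a - b) ** 2
print(complete_U(x, h), incomplete_U(x, h, B=500))
```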
Extending Gossip Algorithms to Distributed Estimation of U-statistics
Colin, Igor, Bellet, Aurélien, Salmon, Joseph, Clémençon, Stéphan
Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems. Whereas distributed estimation of sample mean statistics has been the subject of a good deal of attention, computation of $U$-statistics, relying on more expensive averaging over pairs of observations, is a less investigated area. Yet, such data functionals are essential to describe global properties of a statistical population, with important examples including Area Under the Curve, empirical variance, Gini mean difference and within-cluster point scatter. This paper proposes new synchronous and asynchronous randomized gossip algorithms which simultaneously propagate data across the network and maintain local estimates of the $U$-statistic of interest. We establish convergence rate bounds of $O(1/t)$ and $O(\log t / t)$ for the synchronous and asynchronous cases respectively, where $t$ is the number of iterations, with explicit data- and network-dependent terms. Beyond favorable comparisons in terms of rate analysis, numerical experiments provide empirical evidence that the proposed algorithms surpass the previously introduced approach.
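A hedged sketch of the "propagate and update" flavor of these algorithms for a degree-2 kernel: nodes swap travelling auxiliary observations along random edges and fold the resulting pair values into a running mean. The paper's exact propagation scheme differs; this only conveys the idea.

```python
import numpy as np

def gossip_u_statistic(x, edges, h, n_rounds=500,
                       rng=np.random.default_rng(0)):
    """Hedged sketch of gossip estimation of a degree-2 U-statistic
    (e.g. Gini mean difference with h = |a - b|): each node k keeps
    its own observation x[k], a travelling auxiliary y[k], and a
    running estimate u[k]. Swapping auxiliaries mixes pairs; the
    running mean at every node tracks the network-wide U-statistic.
    """
    N = len(x)
    y = x.copy()                     # auxiliary observations that travel
    u = np.zeros(N)                  # local running estimates
    for t in range(1, n_rounds + 1):
        i, j = edges[rng.integers(len(edges))]
        y[i], y[j] = y[j], y[i]      # propagate data across the network
        # Every node folds its current pair into its running mean.
        u += (np.array([h(x[k], y[k]) for k in range(N)]) - u) / t
    return u

# Example on a ring network:
rng = np.random.default_rng(1)
x = rng.normal(size=20)
edges = [(k, (k + 1) % 20) for k in range(20)]
print(gossip_u_statistic(x, edges, h=lambda a, b: abs(a - b)).mean())
```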