AITopics | Patrini, Giorgio

Collaborating Authors

Patrini, Giorgio


SEALion: a Framework for Neural Network Inference on Encrypted Data

van Elsloo, Tim, Patrini, Giorgio, Ivey-Law, Hamish

arXiv.org Machine Learning | Apr-29-2019

We present SEALion: an extensible framework for privacy-preserving machine learning with homomorphic encryption. It allows one to learn deep neural networks that can be seamlessly utilized for prediction on encrypted data. The framework consists of two layers: the first is built upon TensorFlow and SEAL and exposes standard algebra and deep learning primitives; the second implements a Keras-like syntax for training and inference with neural networks. Given a required level of security, a user is abstracted from the details of the encoding and the encryption scheme, allowing quick prototyping. We present two applications that exemplify the extensibility of our proposal, which are also of independent interest: i) improving efficiency of neural network inference by an activity sparsifier and ii) transfer learning by querying a server-side Variational AutoEncoder that can handle encrypted data.

deep learning, encrypted data, neural network, (22 more...)

arXiv.org Machine Learning

1904.1284

Country: North America > United States > California (0.14)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Three Tools for Practical Differential Privacy

van der Veen, Koen Lennart, Seggers, Ruben, Bloem, Peter, Patrini, Giorgio

arXiv.org Machine Learning | Dec-6-2018

Differentially private learning on real-world data poses challenges for standard machine learning practice: privacy guarantees are difficult to interpret, hyperparameter tuning on private data reduces the privacy budget, and ad-hoc privacy attacks are often required to test model privacy. We introduce three tools to make differentially private machine learning more practical: (1) simple sanity checks which can be carried out in a centralized manner before training, (2) an adaptive clipping bound to reduce the effective number of tuneable privacy parameters, and (3) large-batch training, which we show improves model performance.
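
For context on what a clipping bound controls, here is a minimal sketch of the standard clip-and-noise gradient step used in differentially private SGD. It is not the adaptive rule proposed in the paper, which the abstract does not specify: the bound is fixed here, and the names `private_gradient`, `clip_bound`, and `noise_multiplier` are illustrative assumptions.

```python
import numpy as np

def private_gradient(per_example_grads, clip_bound, noise_multiplier, rng):
    """Average per-example gradients after norm clipping and Gaussian noise.

    Assumption: a fixed clip_bound; an adaptive scheme would update it
    between steps, which is not shown here.
    """
    # L2 norm of each example's gradient, shape (n, 1).
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Shrink only the gradients whose norm exceeds the bound.
    scale = np.minimum(1.0, clip_bound / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Gaussian noise with standard deviation proportional to the bound.
    noise = rng.normal(0.0, noise_multiplier * clip_bound, size=clipped.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_example_grads)

rng = np.random.default_rng(0)
grads = rng.normal(size=(8, 3))  # toy per-example gradients
print(private_gradient(grads, clip_bound=1.0, noise_multiplier=1.1, rng=rng))
```

Because the noise scale is tied to the clipping bound, the bound and the noise multiplier together determine how much signal survives each step, which is why reducing the number of such parameters to tune matters.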

artificial intelligence, differential privacy, neural network, (17 more...)

arXiv.org Machine Learning

1812.0289

Country:

Europe > Netherlands (0.17)
North America > Canada (0.14)
Europe > Italy (0.14)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)


Sinkhorn AutoEncoders

Patrini, Giorgio, Carioni, Marcello, Forré, Patrick, Bhargav, Samarth, Welling, Max, Berg, Rianne van den, Genewein, Tim, Nielsen, Frank

arXiv.org Machine Learning | Oct-3-2018

Optimal Transport offers an alternative to maximum likelihood for learning generative autoencoding models. We show how this principle dictates the minimization of the Wasserstein distance between the encoder aggregated posterior and the prior, plus a reconstruction error. We prove that in the non-parametric limit the autoencoder generates the data distribution if and only if the two distributions match exactly, and that the optimum can be obtained by deterministic autoencoders. We then introduce the Sinkhorn AutoEncoder (SAE), which casts the problem into Optimal Transport on the latent space. The resulting Wasserstein distance is minimized by backpropagating through the Sinkhorn algorithm. SAE models the aggregated posterior as an implicit distribution and therefore does not need a reparameterization trick for gradient estimation. Moreover, it requires virtually no adaptation to different prior distributions. We demonstrate its flexibility by considering models with hyperspherical and Dirichlet priors, as well as a simple case of probabilistic programming. SAE matches or outperforms other autoencoding models in visual quality and FID scores.
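
To illustrate the Sinkhorn algorithm the abstract refers to, the following is a minimal NumPy sketch of entropy-regularized optimal transport between two equally weighted point clouds, for example a batch of latent codes and a batch of prior samples. It is not the authors' implementation: the function name `sinkhorn_cost`, the squared Euclidean ground cost, and the values of `eps` and `n_iters` are assumptions, and no backpropagation is shown.

```python
import numpy as np

def sinkhorn_cost(x, y, eps=1.0, n_iters=200):
    """Entropy-regularized transport cost between point clouds x and y.

    Assumptions: uniform weights on both clouds and a squared Euclidean
    ground cost. A log-domain variant would be more stable for small eps.
    """
    n, m = len(x), len(y)
    # Pairwise squared distances, shape (n, m).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-cost / eps)
    a = np.full(n, 1.0 / n)  # weights on x
    b = np.full(m, 1.0 / m)  # weights on y
    u = np.ones(n)
    v = np.ones(m)
    # Alternately rescale rows and columns to match the two marginals.
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    plan = u[:, None] * kernel * v[None, :]
    return (plan * cost).sum(), plan

rng = np.random.default_rng(0)
codes = rng.normal(size=(64, 2))          # stand-in for encoder outputs
prior_samples = rng.normal(size=(64, 2))  # stand-in for samples from the prior
value, _ = sinkhorn_cost(codes, prior_samples)
print(value)
```

Every step of the loop is a differentiable array operation, which is what makes it possible to backpropagate through the iterations in an automatic differentiation framework.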

bayesian inference, neural network, wasserstein distance, (19 more...)

arXiv.org Machine Learning

1810.01118

Country: Europe (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)


Making Deep Neural Networks Robust to Label Noise: a Loss Correction Approach

Patrini, Giorgio, Rozza, Alessandro, Menon, Aditya, Nock, Richard, Qu, Lizhen

arXiv.org Machine Learning | Mar-22-2017

We present a theoretically grounded approach to train deep neural networks, including recurrent networks, subject to class-dependent label noise. We propose two procedures for loss correction that are agnostic to both application domain and network architecture. They simply amount to at most a matrix inversion and multiplication, provided that we know the probability of each class being corrupted into another. We further show how one can estimate these probabilities, adapting a recent technique for noise estimation to the multi-class setting, and thus providing an end-to-end framework. Extensive experiments on MNIST, IMDB, CIFAR-10, CIFAR-100 and a large scale dataset of clothing images employing a diversity of architectures (stacking dense, convolutional, pooling, dropout, batch normalization, word embedding, LSTM and residual layers) demonstrate the noise robustness of our proposals. Incidentally, we also prove that, when ReLU is the only non-linearity, the loss curvature is immune to class-dependent label noise.
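
The two corrections can be sketched for cross-entropy as follows, assuming a known noise transition matrix `T` with `T[i, j]` the probability that clean class `i` is observed as class `j`. This is a minimal NumPy illustration rather than the authors' code; the function names are hypothetical, and estimating `T` is not shown.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward_corrected_loss(logits, noisy_labels, T):
    """Cross-entropy on predictions pushed through the noise matrix."""
    probs = softmax(logits)  # estimated clean-class probabilities
    noisy_probs = probs @ T  # implied distribution over noisy labels
    idx = np.arange(len(noisy_labels))
    return -np.log(noisy_probs[idx, noisy_labels]).mean()

def backward_corrected_loss(logits, noisy_labels, T):
    """Per-class losses multiplied by the inverse noise matrix."""
    probs = softmax(logits)
    per_class_loss = -np.log(probs)  # loss for every candidate label
    corrected = per_class_loss @ np.linalg.inv(T).T
    idx = np.arange(len(noisy_labels))
    return corrected[idx, noisy_labels].mean()

# Toy 3-class example with an assumed transition matrix.
T = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.3, 0.7]])
logits = np.array([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]])
noisy_labels = np.array([0, 2])
print(forward_corrected_loss(logits, noisy_labels, T))
print(backward_corrected_loss(logits, noisy_labels, T))
```

The backward version needs the matrix inverse, while the forward version needs only a multiplication. Averaged over the noise in the labels, the backward-corrected loss equals the loss on the clean label.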

correction, deep learning, neural network, (19 more...)

arXiv.org Machine Learning

1609.03683

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


The Crossover Process: Learnability and Data Protection from Inference Attacks

Nock, Richard, Patrini, Giorgio, Lattimore, Finnian, Caetano, Tiberio

arXiv.org Machine Learning | Mar-7-2017

It is usual to consider data protection and learnability as conflicting objectives. This is not always the case: we show how to jointly control inference (seen as the attack) and learnability by a noise-free process that mixes training examples, the Crossover Process (cp). One key point is that the cp is typically able to alter joint distributions without touching on marginals, nor altering the sufficient statistic for the class. In other words, it saves (and sometimes improves) generalization for supervised learning, but can alter the relationship between covariates, and therefore fool measures of nonlinear independence and causal inference into misleading ad-hoc conclusions. For example, a cp can increase or decrease odds ratios, bring fairness or break fairness, tamper with disparate impact, strengthen, weaken or reverse causal directions, and change observed statistical measures of dependence. For each of these, we quantify changes brought by a cp, as well as its statistical impact on generalization abilities via a new complexity measure that we call the Rademacher cp complexity. Experiments on a dozen readily available domains validate the theory.

health & medicine, inductive learning, permutation, (18 more...)

arXiv.org Machine Learning

1606.0416

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)


Tsallis Regularized Optimal Transport and Ecological Inference

Muzellec, Boris (Ecole Polytechnique) | Nock, Richard (Data61, The Australian National University, and The University of Sydney) | Patrini, Giorgio (The Australian National University and Data61) | Nielsen, Frank (Ecole Polytechnique and Sony CS Labs, Inc.)

AAAI Conferences | Feb-14-2017

Optimal transport is a powerful framework for computing distances between probability distributions. We unify the two main approaches to optimal transport, namely Monge-Kantorovitch and Sinkhorn-Cuturi, into what we define as Tsallis regularized optimal transport (TROT). TROT interpolates a rich family of distortions from Wasserstein to Kullback-Leibler, encompassing as well Pearson, Neyman and Hellinger divergences, to name a few. We show that metric properties known for Sinkhorn-Cuturi generalize to TROT, and provide efficient algorithms for finding the optimal transportation plan with formal convergence proofs. We also present the first application of optimal transport to the problem of ecological inference, that is, the reconstruction of joint distributions from their marginals, a problem of large interest in the social sciences. TROT provides a convenient framework for ecological inference by making it possible to compute the joint distribution (that is, the optimal transportation plan itself) when side information is available, which is typically what census data provides in political science. Experiments on data from the 2012 US presidential elections display the potential of TROT in delivering a faithful reconstruction of the joint distribution of ethnic groups and voter preferences.

artificial intelligence, machine learning, optimal transport, (19 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country: North America > United States (0.34)

Industry: Government > Voting & Elections (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)


Loss factorization, weakly supervised learning and label noise robustness

Patrini, Giorgio, Nielsen, Frank, Nock, Richard, Carioni, Marcello

arXiv.org Machine Learning | Feb-9-2016

We prove that the empirical risk of most well-known loss functions factors into a linear term aggregating all labels with a term that is label free, and can further be expressed by sums of the loss. This holds true even for non-smooth, non-convex losses and in any RKHS. The first term is a (kernel) mean operator, the focal quantity of this work, which we characterize as the sufficient statistic for the labels. The result tightens known generalization bounds and sheds new light on their interpretation. Factorization has a direct application to weakly supervised learning. In particular, we demonstrate that algorithms like SGD and proximal methods can be adapted with minimal effort to handle weak supervision, once the mean operator has been estimated. We apply this idea to learning with asymmetric noisy labels, connecting and extending prior work. Furthermore, we show that most losses enjoy a data-dependent (by the mean operator) form of noise robustness, in contrast with known negative results.
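
The factorization can be checked numerically for one concrete case. The sketch below assumes the logistic loss with a linear model and labels in {-1, +1}, where log(1 + exp(-y z)) = (1/2)[log(1 + exp(-z)) + log(1 + exp(z))] - (1/2) y z; the function names are illustrative, and the general result in the paper covers more losses than this single instance.

```python
import numpy as np

def logistic_loss(z):
    # log(1 + exp(-z)), computed stably.
    return np.logaddexp(0.0, -z)

def empirical_risk(theta, X, y):
    """Usual empirical logistic risk; labels y take values in {-1, +1}."""
    return logistic_loss(y * (X @ theta)).mean()

def factored_risk(theta, X, y):
    """Same risk, split into a label-free term and a linear label term."""
    # The mean operator is the only place where labels enter.
    mean_operator = (y[:, None] * X).mean(axis=0)
    scores = X @ theta
    # Label-free term: the loss summed over both possible labels.
    label_free = 0.5 * (logistic_loss(scores) + logistic_loss(-scores)).mean()
    return label_free - 0.5 * theta @ mean_operator

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = rng.choice([-1.0, 1.0], size=100)
theta = rng.normal(size=5)
print(empirical_risk(theta, X, y), factored_risk(theta, X, y))
```

Since the labels appear only through `mean_operator`, any estimate of that single vector is enough to evaluate and minimize the risk, which is the route to weak supervision described in the abstract.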

inductive learning, mean operator, neural network, (18 more...)

arXiv.org Machine Learning

1602.0245

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)


(Almost) No Label No Cry

Patrini, Giorgio, Nock, Richard, Rivera, Paul, Caetano, Tiberio

Neural Information Processing Systems | Dec-31-2014

In Learning with Label Proportions (LLP), the objective is to learn a supervised classifier when, instead of labels, only label proportions for bags of observations are known. This setting has broad practical relevance, in particular for privacy preserving data processing. We first show that the mean operator, a statistic which aggregates all labels, is minimally sufficient for the minimization of many proper scoring losses with linear (or kernelized) classifiers without using labels. We provide a fast learning algorithm that estimates the mean operator via a manifold regularizer with guaranteed approximation bounds. Then, we present an iterative learning algorithm that uses this as initialization. We ground this algorithm in Rademacher-style generalization bounds that fit the LLP setting, introducing a generalization of Rademacher complexity and a Label Proportion Complexity measure. This latter algorithm optimizes tractable bounds for the corresponding bag-empirical risk. Experiments are provided on fourteen domains, whose size ranges up to 300K observations. They show that our algorithms are scalable and tend to consistently outperform the state of the art in LLP. Moreover, in many cases, our algorithms compete with, or are within a few percentage points of AUC of, the Oracle that learns knowing all labels. On the largest domains, half a dozen proportions can suffice, i.e. roughly 40K times fewer than the total number of labels.

artificial intelligence, classifier, machine learning, (14 more...)

Neural Information Processing Systems

Country: Oceania > Australia > New South Wales (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
