AITopics | Sautière, Guillaume

Collaborating Authors

Sautière, Guillaume

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Region-of-Interest Based Neural Video Compression

Perugachi-Diaz, Yura, Sautière, Guillaume, Abati, Davide, Yang, Yang, Habibian, Amirhossein, Cohen, Taco S

arXiv.org Artificial IntelligenceNov-2-2022

Humans do not perceive all parts of a scene with the same resolution, but rather focus on few regions of interest (ROIs). Traditional Object-Based codecs take advantage of this biological intuition, and are capable of non-uniform allocation of bits in favor of salient regions, at the expense of increased distortion the remaining areas: such a strategy allows a boost in perceptual quality under low rate constraints. Recently, several neural codecs have been introduced for video compression, yet they operate uniformly over all spatial locations, lacking the capability of ROI-based processing. In this paper, we introduce two models for ROI-based neural video coding. First, we propose an implicit model that is fed with a binary ROI mask and it is trained by de-emphasizing the distortion of the background. Secondly, we design an explicit latent scaling method, that allows control over the quantization binwidth for different spatial regions of latent variables, conditioned on the ROI mask. By extensive experiments, we show that our methods outperform all our baselines in terms of Rate-Distortion (R-D) performance in the ROI. Moreover, they can generalize to different datasets and to any arbitrary ROI at inference time. Finally, they do not require expensive pixel-level annotations during training, as synthetic ROI masks can be used with little to no degradation in performance. To the best of our knowledge, our proposals are the first solutions that integrate ROI-based capabilities into neural video compression models.

artificial intelligence, machine learning, roi ssf, (15 more...)

arXiv.org Artificial Intelligence

2203.01978

Country: Europe (0.67)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Lossy Compression with Distortion Constrained Optimization

van Rozendaal, Ties, Sautière, Guillaume, Cohen, Taco S.

arXiv.org Machine LearningMay-8-2020

When training end-to-end learned models for lossy compression, one has to balance the rate and distortion losses. This is typically done by manually setting a tradeoff parameter $\beta$, an approach called $\beta$-VAE. Using this approach it is difficult to target a specific rate or distortion value, because the result can be very sensitive to $\beta$, and the appropriate value for $\beta$ depends on the model and problem setup. As a result, model comparison requires extensive per-model $\beta$-tuning, and producing a whole rate-distortion curve (by varying $\beta$) for each model to be compared. We argue that the constrained optimization method of Rezende and Viola, 2018 is a lot more appropriate for training lossy compression models because it allows us to obtain the best possible rate subject to a distortion constraint. This enables pointwise model comparisons, by training two models with the same distortion target and comparing their rate. We show that the method does manage to satisfy the constraint on a realistic image compression task, outperforms a constrained optimization method based on a hinge-loss, and is more practical to use for model selection than a $\beta$-VAE.

compression, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

2005.04064

Country: Europe > Germany (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback