Robustness Tokens: Towards Adversarial Robustness of Transformers
Pulfer, Brian, Belousov, Yury, Voloshynovskiy, Slava
Recently, large pre-trained foundation models have become widely adopted by machine learning practitioners for a multitude of tasks. Since such models are publicly available, relying on them as backbones for downstream tasks can result in high vulnerability to adversarial attacks crafted with the same public model. In this work, we propose Robustness Tokens, a novel approach specific to the transformer architecture that, instead of tuning model parameters as in traditional adversarial training, fine-tunes a few additional private tokens with low computational requirements. We show that Robustness Tokens make Vision Transformer models significantly more robust to white-box adversarial attacks while retaining the original downstream performance.
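To make the mechanism concrete, the following is a minimal sketch of the general idea, assuming a ViT-style backbone that accepts a sequence of patch embeddings; the wrapper class, token count, and initialisation are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal sketch: a few learnable "robustness tokens" appended to a ViT input
# sequence, trained while the pre-trained backbone stays frozen.
import torch
import torch.nn as nn

class RobustTokenWrapper(nn.Module):  # hypothetical wrapper, for illustration
    def __init__(self, backbone, embed_dim, num_tokens=5):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # the public model itself is never tuned
        # The only trainable parameters: a handful of private tokens.
        self.robust_tokens = nn.Parameter(0.02 * torch.randn(1, num_tokens, embed_dim))

    def forward(self, patch_embeddings):
        tokens = self.robust_tokens.expand(patch_embeddings.shape[0], -1, -1)
        # Append the private tokens to the patch sequence before the backbone.
        return self.backbone(torch.cat([patch_embeddings, tokens], dim=1))
```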
TwinTURBO: Semi-Supervised Fine-Tuning of Foundation Models via Mutual Information Decompositions for Downstream Task and Latent Spaces
Quétant, Guillaume, Molchanov, Pavlo, Voloshynovskiy, Slava
Foundation models are large-scale neural networks pre-trained on diverse data to learn general-purpose representations that can be fine-tuned for specific downstream tasks. While foundation models are pre-trained on large datasets in a self-supervised manner, their deployment often requires fine-tuning on new datasets with limited labelled samples and potential distribution shifts. This poses significant challenges, especially in the low-labelled-data case, a semi-supervised learning setting where only a small fraction of the data samples are labelled while the majority remain unlabelled. Furthermore, the downstream tasks frequently differ from the pre-training objectives, complicating the adaptation process. Existing semi-supervised approaches, such as pseudo-labelling, rely heavily on assumptions about data distributions or task-specific tuning, limiting their generalisability. Addressing these challenges is essential to fully exploit the potential of foundation models and ensure their adaptability and scalability in diverse applications. The main contribution of this study is a new framework for fine-tuning foundation models: a fine-tuning strategy based on mutual information decomposition.
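As a concrete illustration only, one plausible instantiation of such a semi-supervised objective (not the paper's exact mutual information decomposition) combines a supervised loss on the few labelled samples with an InfoNCE-style mutual information lower bound on unlabelled ones:

```python
# Hypothetical sketch: supervised cross-entropy on labelled data plus an
# InfoNCE mutual-information lower bound between two views of unlabelled data.
import torch
import torch.nn.functional as F

def infonce(z1, z2, temperature=0.1):
    # MI lower bound: matching views of the same sample form the positives.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    targets = torch.arange(z1.shape[0], device=z1.device)
    return F.cross_entropy(logits, targets)

def semi_supervised_loss(logits_lab, y_lab, z_unlab_a, z_unlab_b, lam=1.0):
    return F.cross_entropy(logits_lab, y_lab) + lam * infonce(z_unlab_a, z_unlab_b)
```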
Task-Agnostic Attacks Against Vision Foundation Models
Pulfer, Brian, Belousov, Yury, Kinakh, Vitaliy, Furon, Teddy, Voloshynovskiy, Slava
The study of security in machine learning mainly focuses on downstream task-specific attacks, where the adversarial example is obtained by optimizing a loss function specific to the downstream task. At the same time, it has become standard practice for machine learning practitioners to adopt publicly available pre-trained vision foundation models, effectively sharing a common backbone architecture across a multitude of applications such as classification, segmentation, depth estimation, retrieval, question-answering and more. The study of attacks on such foundation models and their impact on multiple downstream tasks remains largely unexplored. This work proposes a general framework that forges task-agnostic adversarial examples by maximally disrupting the feature representation obtained with foundation models. We extensively evaluate the security of the feature representations obtained by popular vision foundation models by measuring the impact of this attack on multiple downstream tasks and its transferability between models.
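A minimal sketch of such a feature-space attack, assuming a frozen foundation model that maps images in [0, 1] to feature tensors; the PGD step sizes and the squared-error disruption objective are illustrative choices:

```python
# Task-agnostic PGD sketch: instead of a downstream task loss, maximise the
# distance between the clean and adversarial feature representations.
import torch

def feature_space_pgd(model, x, eps=8/255, alpha=2/255, steps=10):
    model.eval()
    with torch.no_grad():
        clean_feat = model(x)  # reference embedding of the clean input
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        disruption = (model(x_adv) - clean_feat).pow(2).sum()
        disruption.backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()  # ascend on disruption
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # project to L_inf ball
            x_adv = x_adv.clamp(0, 1)                  # keep a valid image
    return x_adv.detach()
```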
Tabular Data Generation using Binary Diffusion
Kinakh, Vitaliy, Voloshynovskiy, Slava
Generating synthetic tabular data is critical in machine learning, especially when real data is limited or sensitive. Traditional generative models often face challenges due to the unique characteristics of tabular data, such as mixed data types and varied distributions, and require complex preprocessing or large pretrained models. In this paper, we introduce a novel, lossless binary transformation method that converts any tabular data into fixed-size binary representations, and a corresponding new generative model called Binary Diffusion, specifically designed for binary data. Binary Diffusion leverages the simplicity of XOR operations for noise addition and removal and employs binary cross-entropy loss for training. Our approach eliminates the need for extensive preprocessing, complex noise parameter tuning, and pretraining on large datasets. We evaluate our model on several popular tabular benchmark datasets, demonstrating that Binary Diffusion outperforms existing state-of-the-art models on Travel, Adult Income, and Diabetes datasets while being significantly smaller in size.
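The corruption process lends itself to a compact sketch. The following illustrates the XOR noising of fixed-size binary rows and the BCE training loss described above, assuming a denoiser that predicts the clean bits; the flip-probability schedule and the model interface are assumptions:

```python
# Sketch of XOR corruption and binary cross-entropy denoising loss.
import torch
import torch.nn.functional as F

def xor_noise(x_bits, flip_prob):
    # XOR with a Bernoulli mask: each bit is flipped with probability flip_prob.
    mask = (torch.rand_like(x_bits) < flip_prob).float()
    return (x_bits + mask) % 2  # XOR for {0, 1}-valued tensors

def denoising_loss(model, x_bits, flip_prob):
    logits = model(xor_noise(x_bits, flip_prob))  # predict the clean bits
    return F.binary_cross_entropy_with_logits(logits, x_bits)
```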
Semi-Supervised Fine-Tuning of Vision Foundation Models with Content-Style Decomposition
Drozdova, Mariia, Kinakh, Vitaliy, Belousov, Yury, Lastufka, Erica, Voloshynovskiy, Slava
In this paper, we present a semi-supervised fine-tuning approach designed to improve the performance of pre-trained foundation models on downstream tasks with limited labeled data. By leveraging content-style decomposition within an information-theoretic framework, our method enhances the latent representations of pre-trained vision foundation models, aligning them more effectively with specific task objectives and addressing the problem of distribution shift. We evaluate our approach on multiple datasets, including MNIST, its augmented variations (with yellow and white stripes), CIFAR-10, SVHN, and GalaxyMNIST. The experiments show improvements over the supervised fine-tuning baseline of pre-trained models, particularly in low-labeled data regimes, across both frozen and trainable backbones for the majority of the tested datasets.
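As a loose illustration of how a content-style split can sit on top of a frozen backbone (the paper's actual information-theoretic objectives are richer than this), two heads can map the embedding to a task-relevant content part and a residual style part, with the classifier reading only the content:

```python
# Hypothetical content-style head over a frozen foundation-model embedding.
import torch.nn as nn

class ContentStyleHead(nn.Module):
    def __init__(self, embed_dim, content_dim, style_dim, num_classes):
        super().__init__()
        self.to_content = nn.Linear(embed_dim, content_dim)
        self.to_style = nn.Linear(embed_dim, style_dim)
        self.classifier = nn.Linear(content_dim, num_classes)  # uses content only

    def forward(self, feat):
        content, style = self.to_content(feat), self.to_style(feat)
        return self.classifier(content), content, style
```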
Radio-astronomical Image Reconstruction with Conditional Denoising Diffusion Model
Drozdova, Mariia, Kinakh, Vitaliy, Bait, Omkar, Taran, Olga, Lastufka, Erica, Dessauges-Zavadsky, Miroslava, Holotyak, Taras, Schaerer, Daniel, Voloshynovskiy, Slava
Reconstructing sky models from dirty radio images for accurate source localization and flux estimation is crucial for studying galaxy evolution at high redshift, especially in deep fields using instruments like the Atacama Large Millimetre Array (ALMA). With new projects like the Square Kilometre Array (SKA), there is a growing need for better source extraction methods. Current techniques, such as CLEAN and PyBDSF, often fail to detect faint sources, highlighting the need for more accurate methods. This study proposes using stochastic neural networks to rebuild sky models directly from dirty images. This method can pinpoint radio sources and measure their fluxes with associated uncertainties, marking a potential improvement in radio source characterization. We tested this approach on 10164 images simulated with the CASA tool simalma, based on ALMA's Cycle 5.3 antenna setup. We applied conditional Denoising Diffusion Probabilistic Models (DDPMs) for sky model reconstruction, then used Photutils to determine source coordinates and fluxes, assessing the model's performance across different water vapor levels. Our method showed excellent source localization, achieving more than 90% completeness at a signal-to-noise ratio (SNR) as low as 2. It also surpassed PyBDSF in flux estimation, accurately identifying fluxes for 96% of sources in the test set, a significant improvement over the 57% achieved by CLEAN + PyBDSF. Conditional DDPMs are a powerful tool for image-to-image translation, yielding accurate and robust characterisation of radio sources and outperforming existing methodologies. While this study underscores their significant potential for applications in radio astronomy, we also acknowledge certain limitations that accompany their usage, suggesting directions for further refinement and research.
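A minimal sketch of the conditional DDPM training step for this image-to-image setting, assuming an epsilon-predicting U-Net that takes the noisy sky model concatenated channel-wise with the dirty image; the noise schedule and network are illustrative, not the paper's exact configuration:

```python
# Conditional DDPM sketch: the denoiser is conditioned on the dirty image.
import torch
import torch.nn.functional as F

def ddpm_training_step(unet, sky, dirty, alphas_cumprod):
    b = sky.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,), device=sky.device)
    a = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(sky)
    noisy_sky = a.sqrt() * sky + (1 - a).sqrt() * noise    # forward diffusion
    pred = unet(torch.cat([noisy_sky, dirty], dim=1), t)   # condition on dirty image
    return F.mse_loss(pred, noise)                         # epsilon-prediction loss
```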
TURBO: The Swiss Knife of Auto-Encoders
Quétant, Guillaume, Belousov, Yury, Kinakh, Vitaliy, Voloshynovskiy, Slava
We present a novel information-theoretic framework, termed TURBO, designed to systematically analyse and generalise auto-encoding methods. We start by examining the principles of information bottleneck and bottleneck-based networks in the auto-encoding setting and identifying their inherent limitations, which become more prominent for data with multiple relevant, physics-related representations. The TURBO framework is then introduced, providing a comprehensive derivation of its core concept consisting of the maximisation of mutual information between various data representations expressed in two directions reflecting the information flows. We illustrate that numerous prevalent neural network models are encompassed within this framework. The paper underscores the insufficiency of the information bottleneck concept in elucidating all such models, thereby establishing TURBO as a preferable theoretical reference. The introduction of TURBO contributes to a richer understanding of data representation and the structure of neural network models, enabling more efficient and versatile applications.
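In schematic form (the notation and weights here are illustrative rather than the paper's exact formulation), the two-direction objective can be written as a weighted sum of mutual information terms,

$$\max_{\theta,\phi}\;\underbrace{\alpha\, I(x;\tilde{z}) + \beta\, I(x;\hat{x})}_{\text{direct direction}} \;+\; \underbrace{\gamma\, I(z;\tilde{x}) + \delta\, I(z;\hat{z})}_{\text{reverse direction}},$$

where $\tilde{z}=f_\theta(x)$ and $\tilde{x}=g_\phi(z)$ denote the encoder and decoder mappings, $\hat{x}=g_\phi(\tilde{z})$ and $\hat{z}=f_\theta(\tilde{x})$ the corresponding reconstructions, and particular weight settings recover particular auto-encoding models.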
Turbo-Sim: a generalised generative model with a physical latent space
Quétant, Guillaume, Drozdova, Mariia, Kinakh, Vitaliy, Golling, Tobias, Voloshynovskiy, Slava
We present Turbo-Sim, a generalised autoencoder framework derived from principles of information theory that can be used as a generative model. By maximising the mutual information between the input and the output of both the encoder and the decoder, we are able to rediscover the loss terms usually found in adversarial autoencoders and generative adversarial networks, as well as various more sophisticated related models. Our generalised framework makes these models mathematically interpretable and allows for a diversity of new ones by setting the weight of each loss term separately. The framework is also independent of the intrinsic architecture of the encoder and the decoder, thus leaving a wide choice for the building blocks of the whole network. We apply Turbo-Sim to a collider physics generation problem: the transformation of the properties of several particles from a theory space, right after the collision, to an observation space, right after the detection in an experiment.
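Schematically (the symbols and the particular grouping of terms are illustrative), the resulting objective is a freely weighted sum of reconstruction and adversarial terms in both the observation and latent spaces,

$$\mathcal{L} = w_1\,\mathcal{L}_{\mathrm{rec}}(x,\hat{x}) + w_2\,\mathcal{L}_{\mathrm{rec}}(z,\hat{z}) + w_3\,\mathcal{L}_{\mathrm{adv}}(x,\tilde{x}) + w_4\,\mathcal{L}_{\mathrm{adv}}(z,\tilde{z}),$$

so that specific choices of the weights $w_i$ recover, for instance, an adversarial autoencoder or a GAN-style model as special cases.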
Funnels: Exact maximum likelihood with dimensionality reduction
Klein, Samuel, Raine, John A., Pina-Otey, Sebastian, Voloshynovskiy, Slava, Golling, Tobias
Normalizing flows are diffeomorphic, typically dimension-preserving models trained via the model likelihood. We use the SurVAE framework to construct dimension-reducing surjective flows via a new layer, known as the funnel. We demonstrate its efficacy on a variety of datasets, and show it improves upon or matches the performance of existing flows while having a reduced latent-space size. The funnel layer can be constructed from a wide range of transformations, including restricted convolution and feed-forward layers.
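A minimal sketch of a funnel-style surjective layer in the SurVAE spirit: keep the first $k$ dimensions and replace the bijective log-Jacobian term with the log-likelihood of the dropped dimensions under a conditional model. The Gaussian conditional below is an illustrative choice, not the paper's exact construction:

```python
# Hypothetical funnel layer: dimension-reducing, with a likelihood correction.
import math
import torch
import torch.nn as nn

class FunnelLayer(nn.Module):
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.dim_out = dim_out
        # Conditional model q(x_dropped | x_kept) for the likelihood term.
        self.cond = nn.Linear(dim_out, 2 * (dim_in - dim_out))

    def forward(self, x):
        x_kept, x_drop = x[:, :self.dim_out], x[:, self.dim_out:]
        mu, log_sigma = self.cond(x_kept).chunk(2, dim=1)
        # log q(x_drop | x_kept) replaces the log-Jacobian of a bijective flow.
        log_q = (-0.5 * ((x_drop - mu) / log_sigma.exp()) ** 2
                 - log_sigma - 0.5 * math.log(2 * math.pi)).sum(dim=1)
        return x_kept, log_q
```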
$\rho$-VAE: Autoregressive parametrization of the VAE encoder
Ferdowsi, Sohrab, Diephuis, Maurits, Rezaeifar, Shideh, Voloshynovskiy, Slava
We make a minimal but very effective alteration to the VAE model: a drop-in replacement for the (sample-dependent) approximate posterior that changes it from the standard white Gaussian with diagonal covariance to a first-order autoregressive Gaussian. We argue that this is a more reasonable choice for natural signals like images, as it does not force the correlation present in the data to disappear in the posterior. Moreover, it allows more freedom for the approximate posterior to match the true posterior. The reparametrization trick, as well as the KL-divergence term, still have closed-form expressions, obviating the need for sample-based estimation. Although providing more freedom to adapt to correlated distributions, our parametrization has even fewer parameters than the diagonal covariance, as it requires only two scalars, $\rho$ and $s$, to characterize correlation and scaling, respectively. As validated by the experiments, our proposition noticeably and consistently improves the quality of image generation in a plug-and-play manner, needing no further parameter tuning, across all setups. The code to reproduce our experiments is available at \url{https://github.com/sssohrab/rho_VAE/}.
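The AR(1) structure admits a simple reparametrised sampler: since $\Sigma_{ij} = s^2\rho^{|i-j|}$, a draw can be built recursively from white noise. A sketch (loop form for clarity; variable names are illustrative):

```python
# Reparametrisation for the AR(1) posterior N(mu, Sigma), Sigma_ij = s^2 rho^|i-j|.
import torch

def rho_vae_sample(mu, rho, s):
    eps = torch.randn_like(mu)
    z = torch.empty_like(mu)
    z[:, 0] = mu[:, 0] + s * eps[:, 0]
    scale = s * (1.0 - rho ** 2) ** 0.5
    for i in range(1, mu.shape[1]):
        # First-order recursion reproduces the geometric correlation structure.
        z[:, i] = mu[:, i] + rho * (z[:, i - 1] - mu[:, i - 1]) + scale * eps[:, i]
    return z
```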