 Sendera, Marcin


Solving Bayesian inverse problems with diffusion priors and off-policy RL

arXiv.org Machine Learning

This paper presents a practical application of Relative Trajectory Balance (RTB), a recently introduced off-policy reinforcement learning (RL) objective that can asymptotically solve Bayesian inverse problems optimally. We extend the original work by using RTB to train conditional diffusion model posteriors from pretrained unconditional priors for challenging linear and non-linear inverse problems in vision and science. We use the objective alongside techniques such as off-policy backtracking exploration to improve training. Importantly, our results show that existing training-free diffusion posterior methods struggle to perform effective posterior inference in latent space due to inherent biases.
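
For context, the RTB objective this work builds on can be stated as follows (our transcription; notation may differ from the papers'). For a diffusion trajectory $\tau = (x_T, \dots, x_0)$ with frozen prior policy $p_\theta$, trainable posterior policy $q_\phi$, and a learned scalar normalizer $Z_\phi$,

$$ \mathcal{L}_{\mathrm{RTB}}(\tau) \;=\; \left( \log \frac{Z_\phi \prod_{t=1}^{T} q_\phi(x_{t-1} \mid x_t)}{\prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t)\, r(x_0, \mathbf{y})} \right)^{2}. $$

At the optimum, the posterior policy samples $x_0 \propto p_\theta(x_0)\, r(x_0, \mathbf{y})$; because the loss can be evaluated on trajectories drawn from any full-support sampling policy, it admits off-policy techniques such as the backtracking exploration mentioned above.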


SEMU: Singular Value Decomposition for Efficient Machine Unlearning

arXiv.org Artificial Intelligence

While the capabilities of generative foundational models have advanced rapidly in recent years, methods to prevent harmful and unsafe behaviors remain underdeveloped. Among the pressing challenges in AI safety, machine unlearning (MU) has become increasingly critical to meet upcoming safety regulations. Most existing MU approaches focus on altering the most significant parameters of the model. However, these methods often require fine-tuning substantial portions of the model, resulting in high computational costs and training instabilities, which are typically mitigated by access to the original training dataset. In this work, we address these limitations by leveraging Singular Value Decomposition (SVD) to create a compact, low-dimensional projection that enables the selective forgetting of specific data points. We propose Singular Value Decomposition for Efficient Machine Unlearning (SEMU), a novel approach designed to optimize MU in two key aspects. First, SEMU minimizes the number of model parameters that need to be modified, effectively removing unwanted knowledge while making only minimal changes to the model's weights. Second, SEMU eliminates the dependency on the original training dataset, preserving the model's previously acquired knowledge without additional data requirements. Extensive experiments demonstrate that SEMU achieves competitive performance while significantly improving efficiency in terms of both data usage and the number of modified parameters.
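
The abstract does not spell out the mechanics, so the following is only a rough sketch of the general idea it describes: use an SVD-derived low-rank projection so that a forgetting update touches only a compact subspace of a weight matrix. The rank `k`, the choice of SVD on the forget-set gradient, and the ascent-style update are all our illustrative assumptions, not SEMU's actual procedure.

```python
import torch

def unlearning_step(weight, forget_grad, k=4, lr=1e-3):
    """One illustrative unlearning update: project the gradient computed on
    the forget set onto its top-k singular directions, so only a small
    subspace of the weight matrix is modified (assumption, not SEMU's rule)."""
    # SVD of the forget-set gradient; its leading singular vectors span the
    # directions most relevant to the data we want to remove.
    U, S, Vh = torch.linalg.svd(forget_grad, full_matrices=False)
    # Low-rank projection of the gradient: keep only the top-k components.
    low_rank_grad = U[:, :k] @ torch.diag(S[:k]) @ Vh[:k, :]
    # Gradient *ascent* on the forget loss within that subspace erases the
    # targeted knowledge while leaving the orthogonal complement untouched.
    return weight + lr * low_rank_grad

W = torch.randn(256, 128)
g = torch.randn(256, 128)   # gradient of a forget-set loss w.r.t. W (placeholder)
W_new = unlearning_step(W, g)
print((W_new - W).norm())   # the change is confined to a rank-k subspace
```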


Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models

arXiv.org Artificial Intelligence

Any well-behaved generative model over a variable $\mathbf{x}$ can be expressed as a deterministic transformation of an exogenous ('outsourced') Gaussian noise variable $\mathbf{z}$: $\mathbf{x}=f_\theta(\mathbf{z})$. In such a model (e.g., a VAE, GAN, or continuous-time flow-based model), sampling of the target variable $\mathbf{x} \sim p_\theta(\mathbf{x})$ is straightforward, but sampling from a posterior distribution of the form $p(\mathbf{x}\mid\mathbf{y}) \propto p_\theta(\mathbf{x})r(\mathbf{x},\mathbf{y})$, where $r$ is a constraint function depending on an auxiliary variable $\mathbf{y}$, is generally intractable. We propose to amortize the cost of sampling from such posterior distributions with diffusion models that sample a distribution in the noise space ($\mathbf{z}$). These diffusion samplers are trained by reinforcement learning algorithms to enforce that the transformed samples $f_\theta(\mathbf{z})$ are distributed according to the posterior in the data space ($\mathbf{x}$). For many models and constraints of interest, the posterior in the noise space is smoother than the posterior in the data space, making it more amenable to such amortized inference. Our method enables conditional sampling under unconditional GAN, (H)VAE, and flow-based priors, comparing favorably with both current amortized and non-amortized inference methods. We demonstrate the proposed outsourced diffusion sampling in several experiments with large pretrained prior models: conditional image generation, reinforcement learning with human feedback, and protein structure generation.
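
To make the setup concrete, here is a minimal sketch of the training structure the abstract describes: a frozen decoder $f_\theta$, a trainable sampler over the noise space $\mathbf{z}$, and a reward defined by $r(f_\theta(\mathbf{z}), \mathbf{y})$. The toy Gaussian sampler and the reverse-KL surrogate loss below are stand-ins for the paper's diffusion sampler and RL objectives, chosen purely for compactness; `decoder` and `log_r` are placeholders.

```python
import torch

# Placeholders (assumptions): a frozen pretrained decoder f_theta (e.g. a GAN
# generator) and a black-box constraint log r(x, y).
decoder = torch.nn.Linear(64, 784).requires_grad_(False)   # stands in for f_theta
log_r = lambda x, y: -((x - y) ** 2).sum(-1)               # stands in for log r(x, y)

class NoiseSpaceSampler(torch.nn.Module):
    """Toy stand-in for a diffusion sampler over z: a Gaussian with learned
    mean and scale. A real implementation would run a full reverse SDE."""
    def __init__(self, dim=64):
        super().__init__()
        self.mean = torch.nn.Parameter(torch.zeros(dim))
        self.log_std = torch.nn.Parameter(torch.zeros(dim))
    def sample_with_logprob(self, n):
        dist = torch.distributions.Normal(self.mean, self.log_std.exp())
        z = dist.rsample((n,))
        return z, dist.log_prob(z).sum(-1)

sampler, y = NoiseSpaceSampler(), torch.zeros(784)
prior = torch.distributions.Normal(torch.zeros(64), torch.ones(64))
opt = torch.optim.Adam(sampler.parameters(), lr=1e-3)
for _ in range(100):
    z, log_q = sampler.sample_with_logprob(128)
    # Reverse-KL-style surrogate: match q(z) to p(z) r(f(z), y) in noise space.
    loss = (log_q - prior.log_prob(z).sum(-1) - log_r(decoder(z), y)).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```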


From discrete-time policies to continuous-time diffusion samplers: Asymptotic equivalences and faster training

arXiv.org Machine Learning

We study the problem of training neural stochastic differential equations, or diffusion models, to sample from a Boltzmann distribution without access to target samples. Existing methods for training such models enforce time-reversal of the generative and noising processes, using either differentiable simulation or off-policy reinforcement learning (RL). We prove equivalences between families of objectives in the limit of infinitesimal discretization steps, linking entropic RL methods (GFlowNets) with continuous-time objects (partial differential equations and path space measures). We further show that an appropriate choice of coarse time discretization during training allows greatly improved sample efficiency and the use of time-local objectives, achieving competitive performance on standard sampling benchmarks with reduced computational cost.
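
For reference, the time-reversal relation such objectives enforce is the standard one (textbook material rather than a contribution of this paper): a noising SDE and its generative reverse must share marginals,

$$ \mathrm{d}x_t = f(x_t,t)\,\mathrm{d}t + g(t)\,\mathrm{d}W_t \quad\Longleftrightarrow\quad \mathrm{d}x_t = \big[f(x_t,t) - g(t)^2 \nabla_x \log p_t(x_t)\big]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar{W}_t, $$

where $p_t$ denotes the marginal density at time $t$ and the second SDE runs backward in time. The discrete-time objectives studied here can be read as enforcing this relation between finite-step transition kernels, with the continuous-time objects recovered as the discretization step vanishes.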


High-Fidelity Transfer of Functional Priors for Wide Bayesian Neural Networks by Learning Activations

arXiv.org Machine Learning

Function-space priors in Bayesian Neural Networks (BNNs) provide a more intuitive approach to embedding beliefs directly into the model's output, thereby enhancing regularization, uncertainty quantification, and risk-aware decision-making. However, imposing function-space priors on BNNs is challenging. We address this task through optimization techniques that explore how trainable activations can accommodate complex priors and match intricate target function distributions. We discuss critical learning challenges, including identifiability, loss construction, and symmetries that arise in this context. Furthermore, we enable evidence maximization to facilitate model selection by conditioning the functional priors on additional hyperparameters. Our empirical findings demonstrate that even BNNs with a single wide hidden layer, when equipped with these adaptive trainable activations and conditioning strategies, can effectively achieve high-fidelity function-space priors, providing a robust and flexible framework for enhancing Bayesian neural network performance.
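
One minimal way to picture "learning activations to accommodate a function-space prior" (our illustration, not the authors' pipeline): draw functions from a single-hidden-layer BNN whose activation is a small trainable network, draw target functions from a GP prior, and minimize a sample-based discrepancy between the two on a grid of inputs. The elementwise-MLP activation, the squared-exponential GP, and the crude moment-matching loss are all assumptions.

```python
import torch

# Trainable activation: a tiny MLP applied elementwise (an illustrative choice).
act = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))

def bnn_function_draws(xs, width=512, n_draws=64):
    """Draw functions from a 1-hidden-layer BNN with Gaussian weights."""
    w1 = torch.randn(n_draws, 1, width)
    b1 = torch.randn(n_draws, 1, width)
    w2 = torch.randn(n_draws, width, 1) / width ** 0.5
    h = xs[None, :, None] * w1 + b1                 # (n_draws, n_points, width)
    h = act(h.reshape(-1, 1)).reshape(h.shape)      # elementwise trainable activation
    return (h @ w2).squeeze(-1)                     # (n_draws, n_points)

xs = torch.linspace(-3, 3, 32)
# Target: samples from a squared-exponential GP prior (the functional prior).
K = torch.exp(-0.5 * (xs[:, None] - xs[None, :]) ** 2) + 1e-4 * torch.eye(32)
gp = torch.distributions.MultivariateNormal(torch.zeros(32), K)

opt = torch.optim.Adam(act.parameters(), lr=1e-3)
for _ in range(200):
    f_bnn, f_gp = bnn_function_draws(xs), gp.sample((64,))
    # Crude moment-matching discrepancy (a stand-in for e.g. a Wasserstein loss).
    loss = ((f_bnn.mean(0) - f_gp.mean(0)) ** 2).mean() + \
           ((f_bnn.var(0) - f_gp.var(0)) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```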


Amortizing intractable inference in diffusion models for vision, language, and control

arXiv.org Artificial Intelligence

Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or likelihood function $r(\mathbf{x})$. We state and prove the asymptotic correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior, a problem that existing methods solve only approximately or in restricted cases. Relative trajectory balance arises from the generative flow network perspective on diffusion models, which allows the use of deep reinforcement learning techniques to improve mode coverage. Experiments illustrate the broad potential of unbiased inference of arbitrary posteriors under diffusion priors: in vision (classifier guidance), language (infilling under a discrete diffusion LLM), and multimodal data (text-to-image generation). Beyond generative modeling, we apply relative trajectory balance to the problem of continuous control with a score-based behavior prior, achieving state-of-the-art results on benchmarks in offline reinforcement learning.
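
A compact sketch of how the relative trajectory balance loss can be computed on one discretized diffusion trajectory, assuming per-step Gaussian policies; the variable names and toy numbers are ours, not the paper's code.

```python
import torch

def rtb_loss(log_Z, post_logprobs, prior_logprobs, log_r):
    """Relative trajectory balance on one trajectory x_T -> ... -> x_0:
    squared log-ratio between Z * q(tau) and p(tau) * r(x_0)."""
    return (log_Z + post_logprobs.sum() - prior_logprobs.sum() - log_r) ** 2

# Toy usage with placeholder numbers (T = 10 diffusion steps):
log_Z = torch.nn.Parameter(torch.tensor(0.0))          # learned log-normalizer
post_logprobs = torch.randn(10, requires_grad=True)    # log q(x_{t-1} | x_t) per step
prior_logprobs = torch.randn(10)                       # log p(x_{t-1} | x_t) per step
log_r = torch.tensor(-2.0)                             # log r(x_0) from the constraint
loss = rtb_loss(log_Z, post_logprobs, prior_logprobs, log_r)
loss.backward()
```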


Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

arXiv.org Artificial Intelligence

Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and no data samples -- to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions, as the inner matching objective is simulation-free and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape, enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant $n$-body particle systems. We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2-5\times$ faster, which allows it to be the first method to train using energy on the challenging $55$-particle Lennard-Jones system.
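
A sketch of the kind of Monte Carlo score target that, on our reading, underlies step (II): estimate the Gaussian-smoothed energy by a log-sum-exp over perturbations of $x_t$ and differentiate through it. The double-well energy, the fixed `sigma_t`, and `K` are placeholders; the actual iDEM estimator and noise schedule are specified in the paper.

```python
import torch

def energy(x):
    # Placeholder energy: a double well in each coordinate (assumption).
    return ((x ** 2 - 1.0) ** 2).sum(-1)

def mc_score_target(x_t, sigma_t, K=128):
    """Monte Carlo estimate of the score of the Gaussian-smoothed Boltzmann
    density at x_t: grad_x log E_eps[exp(-E(x_t + sigma_t * eps))]."""
    x = x_t.detach().requires_grad_(True)
    eps = torch.randn(K, *x.shape)
    smoothed_log_density = torch.logsumexp(-energy(x + sigma_t * eps), dim=0)
    (grad,) = torch.autograd.grad(smoothed_log_density.sum(), x)
    return grad

x_t = torch.randn(16, 2)                  # batch of noised samples from the sampler
target = mc_score_target(x_t, sigma_t=0.5)
# A score network would then regress onto `target` at (x_t, t).
```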


On diffusion models for amortized inference: Benchmarking and improving stochastic control and sampling

arXiv.org Artificial Intelligence

We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at https://github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.
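
A sketch of the exploration strategy as we read it: refine sampler outputs with a few steps of local search on the target log-density and keep the results in a replay buffer for off-policy updates. The unadjusted Langevin steps and the Gaussian placeholder target below are our assumptions; the paper's local search may differ.

```python
import torch
from collections import deque

def log_density(x):
    # Placeholder unnormalized target log-density (a standard Gaussian here).
    return -0.5 * (x ** 2).sum(-1)

def local_search(x, n_steps=10, step_size=0.01):
    """Refine samples by gradient ascent on the target log-density with
    Gaussian noise (unadjusted Langevin; the paper may use another scheme)."""
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        (grad,) = torch.autograd.grad(log_density(x).sum(), x)
        x = x + step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(x)
    return x.detach()

buffer = deque(maxlen=10_000)
samples = torch.randn(64, 2)          # stand-in for samples from the sampler
buffer.extend(local_search(samples))  # high-density samples for off-policy updates
```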


Flow-based anomaly detection

arXiv.org Machine Learning

We propose OneFlow, a flow-based one-class classifier for anomaly (outlier) detection that finds a minimal-volume bounding region. Contrary to density-based methods, OneFlow is constructed in such a way that its result typically does not depend on the structure of the outliers. This is because, during training, the gradient of the cost function is propagated only through the points located near the decision boundary (a behavior similar to that of support vectors in an SVM). The combination of flow models and the Bernstein quantile estimator allows OneFlow to find a parametric form of the bounding region, which can be useful in various applications, including describing shapes from 3D point clouds. Experiments show that the proposed model outperforms related methods on real-world anomaly detection problems.
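
To illustrate the boundary-only gradient property described above, here is a schematic one-class objective in the same spirit: a hinge on latent norms against a radius set by an empirical quantile, so that only points near or outside the boundary receive gradient. The linear "flow", the plain empirical quantile (standing in for the Bernstein quantile estimator), and the SVDD-flavored loss are our simplifications, not the exact OneFlow objective.

```python
import torch

flow = torch.nn.Linear(8, 8)   # placeholder for an invertible flow f
nu = 0.05                      # target fraction of points outside the region

def oneflow_style_loss(x):
    """Hinge-style one-class objective: the radius r is the empirical
    (1 - nu)-quantile of latent norms, and only points with ||f(x)||
    near or above r receive gradient, echoing support vectors in an SVM."""
    norms = flow(x).norm(dim=-1)
    r = torch.quantile(norms.detach(), 1 - nu)
    return r + torch.relu(norms - r).mean() / nu

x = torch.randn(512, 8)
loss = oneflow_style_loss(x)
loss.backward()   # gradient flows only through points with ||f(x)|| >= r
```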


Data adaptation in HANDY economy-ideology model

arXiv.org Machine Learning

The concept of mathematical modeling is widespread across almost all fields of contemporary science and engineering. Driven by the need to predict the behavior of natural phenomena, researchers develop ever more complex models. However, despite the improved forecasting ability of such models, the problem of appropriately fitting these high-dimensional, nonlinear models to ground-truth data seems inevitable, and the entire discipline of data assimilation has been developed to address it. Based on the Human and Nature Dynamics (HANDY) model, we present a detailed and comprehensive comparison of Approximate Bayesian Computation (a classic data assimilation method) and the novel approach of Supermodeling. Furthermore, using Sensitivity Analysis, we propose a methodology to reduce the number of coupling coefficients between submodels and, as a consequence, to speed up the convergence of the Supermodel. In addition, we demonstrate that the Approximate Bayesian Computation method, combined with knowledge of parameter sensitivities, can yield a satisfactory estimation of the initial parameters; however, we also find that this methodology alone is unable to achieve predictions comparable to those of Approximate Bayesian Computation. Finally, we demonstrate that Supermodeling with synchronization via the most sensitive variable can produce better forecasts than Approximate Bayesian Computation for chaotic as well as more stable systems, and we propose adequate methodologies for this purpose.
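
For readers unfamiliar with the baseline, rejection ABC (the "classic data assimilation method" compared against here) is simple to state; below is a generic sketch in which `simulate` is a placeholder for integrating the HANDY ODEs, and the uniform prior and tolerance `eps` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
observed = rng.normal(loc=0.5, size=50)   # stand-in for the ground-truth trajectory

def simulate(theta, n=50):
    """Placeholder for integrating the HANDY ODEs with parameter theta."""
    return rng.normal(loc=theta, size=n)

def abc_rejection(n_draws=10_000, eps=2.2):
    """Rejection ABC: keep parameter draws whose simulated trajectory lies
    within distance eps of the observations."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-2, 2)                        # prior over the parameter
        dist = np.mean((simulate(theta) - observed) ** 2)  # discrepancy statistic
        if dist < eps:
            accepted.append(theta)
    return np.array(accepted)

posterior_samples = abc_rejection()   # concentrates near the true parameter
```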