Gaussian Processes for Shuffled Regression

Neural Information Processing Systems

Shuffled regression is the problem of learning regression functions from shuffled data where the correspondence between the input features and target response is unknown. This paper proposes a probabilistic model for shuffled regression called Gaussian Process Shuffled Regression (GPSR). By introducing Gaussian processes as a prior of regression functions in function space via the kernel function, GPSR can express a wide variety of functions in a nonparametric manner while quantifying the uncertainty of the prediction. By adopting the Bayesian evidence maximization framework and a theoretical analysis of the connection between the marginal likelihood/predictive distribution of GPSR and that of standard Gaussian process regression (GPR), we derive an easy-to-implement inference algorithm for GPSR that iteratively applies GPR and updates the input-output correspondence. To reduce computation costs and obtain closed-form solutions for correspondence updates, we also develop a sparse approximate variant of GPSR using its weight space formulation, which can be seen as Bayesian shuffled linear regression with random Fourier features. Experiments on benchmark datasets confirm the effectiveness of our GPSR proposal.
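The weight-space view mentioned in the abstract relies on random Fourier features, which approximate a kernel by an explicit finite-dimensional feature map. The following is a minimal sketch of that generic technique for an RBF kernel, not the authors' GPSR implementation; the function names, the lengthscale, and the feature count are illustrative assumptions.

```python
import numpy as np

def rff_features(X, num_features, lengthscale, rng):
    """Random Fourier features approximating an RBF kernel (illustrative helper)."""
    d = X.shape[1]
    # Frequencies are drawn from the spectral density of the RBF kernel.
    W = rng.normal(0.0, 1.0 / lengthscale, size=(d, num_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

def rbf_kernel(X, lengthscale):
    """Exact RBF kernel matrix, used here only for comparison."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * lengthscale ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
Z = rff_features(X, num_features=4000, lengthscale=1.0, rng=rng)

# Inner products of the features approximate the kernel matrix.
max_err = np.max(np.abs(Z @ Z.T - rbf_kernel(X, 1.0)))
```

With features of this kind, regression on `Z` becomes Bayesian linear regression in the weights, which is the setting the abstract describes for the sparse variant.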


Discrete Spatial Diffusion: Intensity-Preserving Diffusion Modeling

Neural Information Processing Systems

Generative diffusion models have achieved remarkable success in producing high-quality images. However, these models typically operate in continuous intensity spaces, diffusing independently across pixels and color channels. As a result, they are fundamentally ill-suited for applications involving inherently discrete quantities, such as particle counts or material units, that are constrained by strict conservation laws like mass conservation, which limits their applicability in scientific workflows. To address this limitation, we propose Discrete Spatial Diffusion (DSD), a framework based on a continuous-time, discrete-state jump stochastic process that operates directly in discrete spatial domains while strictly preserving particle counts in both forward and reverse diffusion processes. By using spatial diffusion to achieve particle conservation, we introduce stochasticity naturally through a discrete formulation. We demonstrate the expressive flexibility of DSD by performing image synthesis, class conditioning, and image inpainting across standard image benchmarks, while exactly conditioning total image intensity. We validate DSD on two challenging scientific applications: porous rock microstructures and lithium-ion battery electrodes, demonstrating its ability to generate structurally realistic samples under strict mass conservation constraints, with quantitative evaluation using state-of-the-art metrics for transport and electrochemical performance.
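The idea of a count-preserving forward process can be illustrated with a toy simulation in which each unit of intensity is a particle that hops between neighboring pixels. This is a sketch under stated assumptions, not the paper's method: the unit jump rate, the four-neighbor moves, and the periodic boundaries are all choices made here for illustration, and the learned reverse process is not shown.

```python
import numpy as np

def forward_spatial_diffusion(image, t, rate=1.0, seed=0):
    """Move every particle of an integer-valued image by a continuous-time random walk.

    Assumptions (not from the source): constant jump rate, 4-neighbor hops,
    periodic boundary conditions.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape
    rows, cols = np.nonzero(image)
    counts = image[rows, cols]
    # One (row, col) entry per particle.
    r = np.repeat(rows, counts)
    c = np.repeat(cols, counts)
    # In continuous time, the number of jumps of each particle up to t is Poisson.
    n_jumps = rng.poisson(rate * t, size=r.size)
    steps = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])
    for k in range(n_jumps.max(initial=0)):
        active = n_jumps > k
        d = steps[rng.integers(0, 4, size=active.sum())]
        r[active] = (r[active] + d[:, 0]) % h
        c[active] = (c[active] + d[:, 1]) % w
    out = np.zeros_like(image)
    np.add.at(out, (r, c), 1)  # accumulate particles that land on the same pixel
    return out

image = np.random.default_rng(1).poisson(3.0, size=(8, 8))
noised = forward_spatial_diffusion(image, t=2.0)
```

Because particles are only moved and never created or destroyed, `noised.sum()` equals `image.sum()` for any `t`.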


Deep Self-Dissimilarities as Powerful Visual Fingerprints

Neural Information Processing Systems

Features extracted from deep layers of classification networks are widely used as image descriptors. Here, we exploit an unexplored property of these features: their internal dissimilarity. While small image patches are known to have similar statistics across image scales, it turns out that the internal distribution of deep features varies distinctively between scales. We show how this deep self-dissimilarity (DSD) property can be used as a powerful visual fingerprint. In particular, we illustrate that full-reference and no-reference image quality measures derived from DSD are highly correlated with human preference. In addition, incorporating DSD as a loss function in the training of image restoration networks leads to results that are at least as photo-realistic as those obtained by GAN-based methods, while not requiring adversarial training.





Machine-Precision Prediction of Low-Dimensional Chaotic Systems

arXiv.org Artificial Intelligence

Low-dimensional chaotic systems such as the Lorenz-63 model are commonly used to benchmark system-agnostic methods for learning dynamics from data. Here we show that learning from noise-free observations in such systems can be achieved up to machine precision: using ordinary least squares regression on high-degree polynomial features with 512-bit arithmetic, our method exceeds the accuracy of standard 64-bit numerical ODE solvers of the true underlying dynamical systems. Depending on the configuration, we obtain valid prediction times of 32 to 105 Lyapunov times for the Lorenz-63 system, dramatically outperforming prior work that reaches 13 Lyapunov times at most. We further validate our results on Thomas' Cyclically Symmetric Attractor, a non-polynomial chaotic system that is considerably more complex than the Lorenz-63 model, and show that similar results extend also to higher dimensions using the spatiotemporally chaotic Lorenz-96 model. Our findings suggest that learning low-dimensional chaotic systems from noise-free data is a solved problem.
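Ordinary least squares on polynomial features can be illustrated on the Lorenz-63 system in a few lines. The sketch below is a simplification and not the paper's setup: it fits the vector field rather than a time-stepping map, uses degree-2 features, samples states uniformly at random, and runs in standard 64-bit floats instead of 512-bit arithmetic. The function names are illustrative.

```python
import numpy as np

def lorenz63(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Lorenz-63 vector field evaluated row-wise on an (n, 3) array of states."""
    x, y, z = s[:, 0], s[:, 1], s[:, 2]
    return np.stack([sigma * (y - x), x * (rho - z) - y, x * y - beta * z], axis=1)

def poly2_features(s):
    """All monomials of degree <= 2 in (x, y, z)."""
    x, y, z = s[:, 0], s[:, 1], s[:, 2]
    return np.stack(
        [np.ones_like(x), x, y, z, x * x, x * y, x * z, y * y, y * z, z * z], axis=1
    )

rng = np.random.default_rng(0)
states = rng.uniform(-20.0, 20.0, size=(200, 3))  # noise-free training inputs

# Ordinary least squares: one coefficient column per output dimension.
coef, *_ = np.linalg.lstsq(poly2_features(states), lorenz63(states), rcond=None)
```

Since the Lorenz-63 vector field is itself a degree-2 polynomial, the fit recovers its coefficients up to floating-point error; for example, `coef[1, 1]` is the coefficient of `x` in `dy/dt`, which is `rho`.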


Improving Label Assignments Learning by Dynamic Sample Dropout Combined with Layer-wise Optimization in Speech Separation

arXiv.org Artificial Intelligence

In supervised speech separation, permutation invariant training (PIT) is widely used to handle label ambiguity by selecting the best permutation to update the model. Despite its success, previous studies showed that PIT is plagued by excessive label assignment switching in adjacent epochs, which prevents the model from learning better label assignments. To address this issue, we propose a novel training strategy, dynamic sample dropout (DSD), which considers previous best label assignments and evaluation metrics to exclude the samples that may negatively impact the learned label assignments during training. Additionally, we include layer-wise optimization (LO) to improve performance by solving layer-decoupling. Our experiments showed that combining DSD and LO outperforms the baseline and solves the excessive label assignment switching and layer-decoupling issues. The proposed DSD and LO approach is easy to implement, requires no extra training sets or steps, and generalizes to various speech separation tasks.
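The permutation selection at the heart of PIT can be sketched as a search over all orderings of the estimated sources. This is a minimal illustration that assumes a mean-squared-error criterion and a hypothetical function name `pit_mse`; it does not implement the proposed dynamic sample dropout or layer-wise optimization.

```python
import itertools
import numpy as np

def pit_mse(estimates, targets):
    """Return the lowest MSE over all source permutations and the permutation itself.

    estimates, targets: arrays of shape (n_sources, n_samples).
    perm[i] is the index of the estimate paired with targets[i].
    """
    n_src = targets.shape[0]
    best_loss, best_perm = np.inf, None
    for perm in itertools.permutations(range(n_src)):
        loss = np.mean((estimates[list(perm)] - targets) ** 2)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm

rng = np.random.default_rng(0)
targets = rng.normal(size=(2, 100))
estimates = targets[::-1]  # perfect estimates, but in swapped order
loss, perm = pit_mse(estimates, targets)
```

The label assignment switching discussed in the abstract corresponds to `perm` changing for the same training sample from one epoch to the next.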


Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds

arXiv.org Artificial Intelligence

Thorough analysis of local droplet-level interactions is crucial to better understand the microphysical processes in clouds and their effect on the global climate. High-accuracy simulations of relevant droplet size distributions from Large Eddy Simulations (LES) of bin microphysics challenge current analysis techniques due to their high dimensionality involving three spatial dimensions, time, and a continuous range of droplet sizes. Utilizing the compact latent representations from Variational Autoencoders (VAEs), we produce novel and intuitive visualizations for the organization of droplet sizes and their evolution over time beyond what is possible with clustering techniques. This greatly improves interpretation and allows us to examine aerosol-cloud interactions by contrasting simulations with different aerosol concentrations. We find that the evolution of the droplet spectrum is similar across aerosol levels but occurs at different paces. This similarity suggests that precipitation initiation processes are alike despite variations in onset times.


STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning

arXiv.org Machine Learning

Directed exploration is a crucial challenge in reinforcement learning (RL), especially when rewards are sparse. Information-directed sampling (IDS), which optimizes the information ratio, seeks to do so by augmenting regret with information gain. However, estimating information gain is computationally intractable or relies on restrictive assumptions that prohibit its use in many practical instances. In this work, we posit an alternative exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal, which, under suitable conditions, can be computed in closed form with the kernelized Stein discrepancy (KSD). Based on KSD, we develop a novel algorithm, STEERING: STEin information dirEcted exploration for model-based Reinforcement LearnING. To enable its derivation, we develop fundamentally new variants of KSD for discrete conditional distributions. We further establish that STEERING achieves sublinear Bayesian regret, improving upon prior learning rates of information-augmented MBRL. Experimentally, we show that the proposed algorithm is computationally affordable and outperforms several prior approaches.
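For readers unfamiliar with KSD, the sketch below computes the standard squared KSD (V-statistic form) for one-dimensional continuous samples with an RBF kernel. It is not the discrete conditional variant developed in the paper; the function name, the bandwidth, and the Gaussian target are assumptions chosen for illustration.

```python
import numpy as np

def ksd_vstat(samples, score_fn, bandwidth=1.0):
    """Squared kernelized Stein discrepancy (V-statistic) for 1-D samples.

    score_fn(x) must return d/dx log p(x) for the target density p.
    """
    x = np.asarray(samples, dtype=float)
    s = score_fn(x)
    diff = x[:, None] - x[None, :]                # x_i - x_j
    k = np.exp(-diff ** 2 / (2.0 * bandwidth ** 2))
    dk_dx = -diff / bandwidth ** 2 * k            # derivative in the first argument
    dk_dy = diff / bandwidth ** 2 * k             # derivative in the second argument
    d2k = (1.0 / bandwidth ** 2 - diff ** 2 / bandwidth ** 4) * k
    # Stein kernel u_p(x_i, x_j), averaged over all pairs.
    u = s[:, None] * s[None, :] * k + s[:, None] * dk_dy + s[None, :] * dk_dx + d2k
    return u.mean()

def standard_normal_score(x):
    """Score function of N(0, 1)."""
    return -x

rng = np.random.default_rng(0)
matched = rng.normal(size=500)   # drawn from the target N(0, 1)
shifted = matched + 2.0          # drawn from N(2, 1)
ksd_matched = ksd_vstat(matched, standard_normal_score)
ksd_shifted = ksd_vstat(shifted, standard_normal_score)
```

The discrepancy needs only the score of the target and samples from the other distribution, so no normalizing constant is required; in this example `ksd_shifted` comes out larger than `ksd_matched`.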