Scalable Simulation-Based Model Inference with Test-Time Complexity Control
Gloeckler, Manuel, Manzano-Patrón, J. P., Sotiropoulos, Stamatios N., Schröder, Cornelius, Macke, Jakob H.
Simulation plays a central role in scientific discovery. In many applications, the bottleneck is no longer running a simulator; it is choosing among large families of plausible simulators, each corresponding to a different forward model or hypothesis consistent with observations. For large model families, classical Bayesian workflows for model selection are impractical. Furthermore, amortized model selection methods typically hard-code a fixed model prior or complexity penalty at training time, requiring users to commit to a particular parsimony assumption before seeing the data. We introduce PRISM, a simulation-based encoder-decoder that infers a joint posterior over both discrete model structures and the associated continuous parameters, while enabling test-time control of model complexity via a tunable model prior that the network is conditioned on. We show that PRISM scales to families with combinatorially many model instantiations (up to billions) on a synthetic symbolic regression task. As a scientific application, we evaluate PRISM on biophysical modeling of diffusion MRI data, demonstrating model selection across several multi-compartment models on both synthetic and in vivo neuroimaging data.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- Asia > Middle East > Jordan (0.04)
- (2 more...)
- Health & Medicine > Therapeutic Area > Neurology (0.86)
- Health & Medicine > Diagnostic Medicine > Imaging (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
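Below is a minimal, hypothetical sketch of the test-time prior-conditioning idea described in the PRISM abstract above: a network that predicts a posterior over model indices additionally receives the model prior as an input, so one trained network can be queried under different parsimony assumptions without retraining. This is not the authors' implementation; the class name, layer sizes, and example priors are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' code) of conditioning a model-selection
# network on a user-chosen model prior, so complexity preferences can be
# changed at test time without retraining.
import math
import torch
import torch.nn as nn

class PriorConditionedSelector(nn.Module):
    def __init__(self, x_dim: int, n_models: int, hidden: int = 64):
        super().__init__()
        # The network sees both the data summary and the log model prior,
        # so it can learn to modulate its posterior by the prior it is given.
        self.net = nn.Sequential(
            nn.Linear(x_dim + n_models, hidden), nn.ReLU(),
            nn.Linear(hidden, n_models),
        )

    def forward(self, x: torch.Tensor, log_prior: torch.Tensor) -> torch.Tensor:
        # Concatenate observation features with the log prior over models;
        # return log posterior probabilities over model indices.
        logits = self.net(torch.cat([x, log_prior], dim=-1))
        return torch.log_softmax(logits, dim=-1)

# At test time, sweeping the prior trades off model complexity on the fly:
x = torch.randn(1, 10)
flat = torch.full((1, 3), -math.log(3.0))              # uniform over 3 models
sparse = torch.log(torch.tensor([[0.8, 0.15, 0.05]]))  # favors simpler models
selector = PriorConditionedSelector(x_dim=10, n_models=3)
print(selector(x, flat).exp(), selector(x, sparse).exp())  # untrained, so arbitrary
```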
A theory of learning data statistics in diffusion models, from easy to hard
Bardone, Lorenzo, Merger, Claudia, Goldt, Sebastian
While diffusion models have emerged as a powerful class of generative models, their learning dynamics remain poorly understood. We address this issue first by empirically showing that standard diffusion models trained on natural images exhibit a distributional simplicity bias, learning simple, pair-wise input statistics before specializing to higher-order correlations. We reproduce this behaviour in simple denoisers trained on a minimal data model, the mixed cumulant model, where we precisely control both pair-wise and higher-order correlations of the inputs. We identify a scalar invariant of the model, which we call the diffusion information exponent in analogy to related invariants in other learning paradigms, that governs the sample complexity of learning pair-wise and higher-order correlations. Using this invariant, we prove that the denoiser learns simple, pair-wise statistics of the inputs at linear sample complexity, while more complex higher-order statistics, such as the fourth cumulant, require at least cubic sample complexity. We also prove that the sample complexity of learning the fourth cumulant becomes linear if pair-wise and higher-order statistics share a correlated latent structure. Our work describes a key mechanism by which diffusion models learn distributions of increasing complexity.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
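As a hedged illustration of the controlled setting this abstract describes (not the paper's exact mixed cumulant model), the sketch below generates data in which the pair-wise strength and the higher-order (fourth-cumulant) strength of a planted direction can be tuned independently, and computes the two statistics one would track to see which order a denoiser has learned. The parameters `beta2`/`beta4` and the latent construction are assumptions.

```python
# Hedged illustration: data with a planted direction u whose pairwise strength
# (beta2) and higher-order non-Gaussian strength (beta4) are controlled
# independently, plus the two probes (variance vs. fourth cumulant along u)
# that separate pair-wise from higher-order statistics.
import numpy as np

rng = np.random.default_rng(0)
d, n = 32, 20000
u = np.zeros(d)
u[0] = 1.0  # planted direction

def sample(beta2: float, beta4: float) -> np.ndarray:
    """Isotropic Gaussian bulk plus a planted (possibly non-Gaussian) direction."""
    z = rng.standard_normal((n, d))                     # only pair-wise statistics
    g = rng.standard_normal(n)
    latent = np.sign(g) * np.abs(g) ** 1.5              # non-Gaussian scalar latent
    return z + np.outer(beta2 * rng.standard_normal(n) + beta4 * latent, u)

def probes(x: np.ndarray) -> tuple[float, float]:
    s = x @ u
    pairwise = s.var()                                  # second-order statistic
    fourth = ((s - s.mean()) ** 4).mean() - 3 * s.var() ** 2  # fourth cumulant
    return pairwise, fourth

# Pure pair-wise spike vs. pure higher-order spike along u:
for beta2, beta4 in [(1.0, 0.0), (0.0, 1.0)]:
    print((beta2, beta4), probes(sample(beta2, beta4)))
```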
Supplementary Material for: Adversarial Regression with Doubly Non-negative Weighting Matrices
A.1 Proofs of Section 3

In the following, the symbol $\langle \cdot, \cdot \rangle$ will be used to represent both the Frobenius inner product of matrices and the standard Euclidean inner product of vectors.

For the second part, let $v$ be an eigenvector of $A$ corresponding to the eigenvalue $\lambda_{\max}(A)$.

In case the maximum eigenvalue of $T$ is nonpositive, then from Lemma A.1 we see that the objective value of problem (A.2) evaluated

For any $p \times p$ real matrix $A$, its spectral radius $R(A)$ is defined as the largest absolute value of its eigenvalues. Then the matrix $I - A$ is invertible and all entries of $(I - A)^{-1}$ are nonnegative. Also, the spectral radius of $(\gamma^\star)^{-1}\, \widehat{\Omega}^{1/2} V(\beta)\, \widehat{\Omega}^{1/2}$ is smaller than $1$ by the feasibility of $\gamma^\star$ in problem (A.5c).
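The sentence "Then the matrix $I - A$ is invertible..." has lost its hypothesis in extraction. It appears to invoke the standard Neumann-series fact for entrywise-nonnegative matrices with spectral radius below one; a hedged reconstruction (the hypothesis $R(A) < 1$ is supplied here, not taken from the fragment) is:

```latex
% Standard Neumann-series fact; the hypotheses (entrywise nonnegativity
% and R(A) < 1) are assumptions supplied to complete the truncated claim.
Let $A \in \mathbb{R}^{p \times p}$ be entrywise nonnegative with spectral
radius $R(A) < 1$. Then $I - A$ is invertible,
\[
  (I - A)^{-1} = \sum_{k=0}^{\infty} A^{k},
\]
and, since each power $A^{k}$ is entrywise nonnegative, all entries of
$(I - A)^{-1}$ are nonnegative.
```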