assumption 3
Sampling from Constrained Gibbs Measures: with Applications to High-Dimensional Bayesian Inference
Wang, Ruixiao, Chen, Xiaohong, Chewi, Sinho
This paper considers a non-standard problem of generating samples from a low-temperature Gibbs distribution with \emph{constrained} support, when some of the coordinates of the mode lie on the boundary. These coordinates are referred to as the non-regular part of the model. We show that in a ``pre-asymptotic'' regime in which the limiting Laplace approximation is not yet valid, the low-temperature Gibbs distribution concentrates on a neighborhood of its mode. Within this region, the distribution is a bounded perturbation of a product measure: a strongly log-concave distribution in the regular part and a one-dimensional exponential-type distribution in each coordinate of the non-regular part. Leveraging this structure, we provide a non-asymptotic sampling guarantee by analyzing the spectral gap of Langevin dynamics. Key examples of low-temperature Gibbs distributions include Bayesian posteriors, and we demonstrate our results on three canonical examples: a high-dimensional logistic regression model, a Poisson linear model, and a Gaussian mixture model.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New Jersey > Bergen County > Hackensack (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)
Scaling Laws for Precision in High-Dimensional Linear Regression
Zhang, Dechen, Tang, Xuan, Liang, Yingyu, Zou, Difan
Low-precision training is critical for optimizing the trade-off between model quality and training costs, necessitating the joint allocation of model size, dataset size, and numerical precision. While empirical scaling laws suggest that quantization impacts effective model and data capacities or acts as an additive error, the theoretical mechanisms governing these effects remain largely unexplored. In this work, we initiate a theoretical study of scaling laws for low-precision training within a high-dimensional sketched linear regression framework. By analyzing multiplicative (signal-dependent) and additive (signal-independent) quantization, we identify a critical dichotomy in their scaling behaviors. Our analysis reveals that while both schemes introduce an additive error and degrade the effective data size, they exhibit distinct effects on effective model size: multiplicative quantization maintains the full-precision model size, whereas additive quantization reduces the effective model size. Numerical experiments validate our theoretical findings. By rigorously characterizing the complex interplay among model scale, dataset size, and quantization error, our work provides a principled theoretical basis for optimizing training protocols under practical hardware constraints.
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > Canada (0.04)
- Asia > Russia (0.28)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
- Asia > Middle East > Saudi Arabia (0.04)
- Asia > Middle East > Jordan (0.04)
Nonparametric Identification and Inference for Counterfactual Distributions with Confounding
We propose nonparametric identification and semiparametric estimation of joint potential outcome distributions in the presence of confounding. First, in settings with observed confounding, we derive tighter, covariate-informed bounds on the joint distribution by leveraging conditional copulas. To overcome the non-differentiability of the bounding min/max operators, we establish the asymptotic properties of both a direct estimator under a polynomial margin condition and a smooth approximation based on the log-sum-exp operator, facilitating valid inference for individual-level effects under the canonical rank-preserving assumption. Second, we tackle the challenge of unmeasured confounding by introducing a causal representation learning framework. By utilizing instrumental variables, we prove the nonparametric identifiability of the latent confounding subspace under injectivity and completeness conditions. We develop a ``triple machine learning'' estimator that employs a cross-fitting scheme to sequentially handle the learned representation, nuisance parameters, and target functional. We characterize the asymptotic distribution with variance inflation induced by representation learning error, and provide conditions for semiparametric efficiency. We also propose a practical VAE-based algorithm for confounding representation learning. Simulations and real-world analysis validate the effectiveness of the proposed methods. By bridging classical semiparametric theory with modern representation learning, this work provides a robust statistical foundation for distributional and counterfactual inference in complex causal systems.
- Europe > United Kingdom > England > West Sussex (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > Japan (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Software (0.67)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Research Report > Experimental Study (0.92)
- Research Report > New Finding (0.67)
- Government (1.00)
- Law (0.67)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > Canada > British Columbia (0.04)