conditional density estimation
Partition Trees: Conditional Density Estimation over General Outcome Spaces
Angelim, Felipe, Leite, Alessandro
We propose Partition Trees, a tree-based framework for conditional density estimation over general outcome spaces, supporting both continuous and categorical variables within a unified formulation. Our approach models conditional distributions as piecewise-constant densities on data-adaptive partitions and learns trees by directly minimizing conditional negative log-likelihood. This yields a scalable, nonparametric alternative to existing probabilistic trees that does not make parametric assumptions about the target distribution. We further introduce Partition Forests, an ensemble extension obtained by averaging conditional densities. Empirically, we demonstrate improved probabilistic prediction over CART-style trees and competitive or superior performance compared to state-of-the-art probabilistic tree methods and Random Forests, along with robustness to redundant features and heteroscedastic noise.
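To make the piecewise-constant formulation concrete, here is a minimal sketch, not the authors' implementation: a single split search that scores candidate thresholds on a feature by the conditional negative log-likelihood of histogram densities in the two children. The shared outcome bins, minimum leaf size, and heteroscedastic toy data are illustrative assumptions; a full tree would recurse this search, and a forest would average the resulting leaf densities.

```python
import numpy as np

def histogram_nll(y, edges):
    """Negative log-likelihood of y under a piecewise-constant
    (histogram) density on fixed outcome bins."""
    counts, _ = np.histogram(y, bins=edges)
    widths = np.diff(edges)
    dens = counts / (len(y) * widths) + 1e-12      # epsilon guards empty bins
    idx = np.clip(np.searchsorted(edges, y, side="right") - 1, 0, len(widths) - 1)
    return -np.log(dens[idx]).sum()

def best_split(x, y, edges, min_leaf=50):
    """Search one threshold on x minimizing the total conditional NLL
    of the two child histogram densities (one level of a partition tree)."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_nll, best_t = np.inf, None
    for i in range(min_leaf, len(xs) - min_leaf):
        nll = histogram_nll(ys[:i], edges) + histogram_nll(ys[i:], edges)
        if nll < best_nll:
            best_nll, best_t = nll, 0.5 * (xs[i - 1] + xs[i])
    return best_t, best_nll

# Heteroscedastic toy data: the spread of y changes at x = 0.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 1000)
y = rng.normal(0.0, np.where(x < 0, 0.2, 1.0))
t, nll = best_split(x, y, np.linspace(-4, 4, 33))
print(f"split at x = {t:.3f} (expected near 0), NLL = {nll:.1f}")
```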
- North America > United States > California (0.05)
- Europe > France > Normandy > Seine-Maritime > Rouen (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.63)
Transforming Conditional Density Estimation Into a Single Nonparametric Regression Task
Reisach, Alexander G., Collier, Olivier, Luedtke, Alex, Chambaz, Antoine
We propose a way of transforming the problem of conditional density estimation into a single nonparametric regression task via the introduction of auxiliary samples. This allows leveraging regression methods that work well in high dimensions, such as neural networks and decision trees. Our main theoretical result characterizes and establishes the convergence of our estimator to the true conditional density in the data limit. We develop condensité, a method that implements this approach. We demonstrate the benefit of the auxiliary samples on synthetic data and showcase that condensité can achieve good out-of-the-box results. We evaluate our method on a large population survey dataset and on a satellite imaging dataset. In both cases, we find that condensité matches or outperforms the state of the art and yields conditional densities in line with established findings in the literature on each dataset. Our contribution opens up new possibilities for regression-based conditional density estimation and the empirical results indicate strong promise for applied research.
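The abstract does not spell out the construction, so the sketch below shows one standard way auxiliary samples turn CDE into a single regression task, not necessarily condensité's: label real (x, y) pairs 1 and pairs whose y is redrawn from a known reference density q 0, regress the label on (x, y), and invert the density-ratio identity p(y|x) = q(y) m(x, y) / (1 - m(x, y)), which holds for balanced samples. The uniform reference, random-forest regressor, and toy data are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 4000
x = rng.uniform(-1, 1, n)
y = rng.normal(np.sin(3 * x), 0.3)                 # true p(y|x) = N(sin(3x), 0.3^2)

# Auxiliary pairs reuse the same x but redraw y from a known reference q.
lo, hi = -3.0, 3.0
q = 1.0 / (hi - lo)                                # Uniform(lo, hi) density
y_aux = rng.uniform(lo, hi, n)

# One regression task: predict whether a (x, y) pair is real (1) or auxiliary (0).
X = np.column_stack([np.concatenate([x, x]), np.concatenate([y, y_aux])])
z = np.concatenate([np.ones(n), np.zeros(n)])
m = RandomForestRegressor(n_estimators=200, min_samples_leaf=50,
                          random_state=0).fit(X, z)

def cond_density(x0, ys):
    """p(y|x0) = q(y) * m / (1 - m) for balanced real/auxiliary samples."""
    r = np.clip(m.predict(np.column_stack([np.full_like(ys, x0), ys])), 1e-3, 1 - 1e-3)
    return q * r / (1 - r)

ys = np.linspace(lo, hi, 121)
dens = cond_density(0.5, ys)
print("mode of p(y | x=0.5):", ys[dens.argmax()], "(true mean sin(1.5) ≈ 1.0)")
```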
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Oceania > Australia > Northern Territory (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)
Gaussian Process Conditional Density Estimation
Dutordoir, Vincent, Salimbeni, Hugh, Deisenroth, Marc Peter, Hensman, James
Conditional Density Estimation (CDE) models deal with estimating conditional distributions. The conditions imposed on the distribution are the inputs of the model. CDE is a challenging task as there is a fundamental trade-off between model complexity, representational capacity and overfitting. In this work, we propose to extend the model's input with latent variables and use Gaussian processes (GP) to map this augmented input onto samples from the conditional distribution. Our Bayesian approach allows for the modeling of small datasets, but we also provide the machinery for it to be applied to big data using stochastic variational inference. Our approach can be used to model densities even in sparse data regions, and allows for sharing learned structure between conditions. We illustrate the effectiveness and wide-reaching applicability of our model on a variety of real-world problems, such as spatio-temporal density estimation of taxi drop-offs, non-Gaussian noise modeling, and few-shot learning on Omniglot images.
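A generative-side sketch only; the paper's actual contribution, variational inference for fitting such a model, is omitted. Augmenting the input with a latent w ~ N(0, 1) and pushing (x, w) through one fixed GP draw already makes p(y | x) non-Gaussian; the random-Fourier-feature approximation of the GP sample below is an assumption made for cheapness.

```python
import numpy as np

rng = np.random.default_rng(0)

# One fixed function f ~ GP(0, RBF) on inputs (x, w), approximated with
# random Fourier features so the draw is a cheap deterministic function.
D, ls = 200, 0.5
Omega = rng.normal(scale=1.0 / ls, size=(2, D))   # RBF spectral frequencies
phase = rng.uniform(0.0, 2.0 * np.pi, D)
coef = rng.normal(size=D)

def f(x, w):
    feats = np.sqrt(2.0 / D) * np.cos(np.column_stack([x, w]) @ Omega + phase)
    return feats @ coef

def conditional_samples(x0, n=5000, noise=0.05):
    """Monte Carlo draws from p(y | x0) = ∫ N(y | f(x0, w), noise²) N(w | 0, 1) dw."""
    w = rng.normal(size=n)
    return f(np.full(n, x0), w) + noise * rng.normal(size=n)

ys = conditional_samples(0.3)
hist, _ = np.histogram(ys, bins=20, density=True)
print("p(y | x = 0.3) on a coarse grid:", hist.round(2))
```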
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
CINDES: Classification induced neural density estimator and simulator
Dai, Dehao, Fan, Jianqing, Gu, Yihong, Mukherjee, Debarghya
Neural network-based methods for (un)conditional density estimation have recently gained substantial attention, as various neural density estimators have outperformed classical approaches in real-data experiments. Despite these empirical successes, implementation can be challenging due to the need to ensure non-negativity and unit-mass constraints, and theoretical understanding remains limited. In particular, it is unclear whether such estimators can adaptively achieve faster convergence rates when the underlying density exhibits a low-dimensional structure. This paper addresses these gaps by proposing a structure-agnostic neural density estimator that is (i) straightforward to implement and (ii) provably adaptive, attaining faster rates when the true density admits a low-dimensional composition structure. Another key contribution of our work is to show that the proposed estimator integrates naturally into generative sampling pipelines, most notably score-based diffusion models, where it achieves provably faster convergence when the underlying density is structured. We validate its performance through extensive simulations and a real-data application.
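The classification-to-density reduction suggested by the title can be sketched generically (the estimator's actual architecture, rates, and diffusion integration are beyond an abstract-level example): train a classifier to separate data from a known reference density q; the optimal logit is log(p/q), so q(y) exp(logit) is non-negative by construction and should roughly integrate to one. PyTorch, the uniform reference, and the two-component toy mixture are assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 4000
data = torch.cat([torch.randn(n // 2) * 0.3 - 1.0,    # two-component toy mixture
                  torch.randn(n // 2) * 0.3 + 1.0])
lo, hi = -3.0, 3.0
ref = torch.rand(n) * (hi - lo) + lo                  # reference q = Uniform(lo, hi)

net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
inputs = torch.cat([data, ref]).unsqueeze(1)
labels = torch.cat([torch.ones(n), torch.zeros(n)])
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(500):                                  # classify data vs. reference
    opt.zero_grad()
    loss = loss_fn(net(inputs).squeeze(1), labels)
    loss.backward()
    opt.step()

# Optimal logit is log(p/q), so p(y) = q(y) * exp(logit(y)): non-negative by
# construction, and approximately unit mass if the classifier is well trained.
grid = torch.linspace(lo, hi, 601).unsqueeze(1)
with torch.no_grad():
    p = torch.exp(net(grid).squeeze(1)) / (hi - lo)
print("mass on [-3, 3]:", torch.trapezoid(p, grid.squeeze(1)).item())
```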
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Smoothing-Based Conformal Prediction for Balancing Efficiency and Interpretability
Zheng, Mingyi, Jiang, Hongyu, Lu, Yizhou, Teng, Jiaye
Conformal Prediction (CP) is a distribution-free framework for constructing statistically rigorous prediction sets. While popular variants such as CD-split improve CP's efficiency, they often yield prediction sets composed of multiple disconnected subintervals, which are difficult to interpret. In this paper, we propose SCD-split, which incorporates smoothing operations into the CP framework. Such smoothing operations potentially help merge the subintervals, thus leading to interpretable prediction sets. Experimental results on both synthetic and real-world datasets demonstrate that SCD-split balances the interval length and the number of disconnected subintervals. Theoretically, under specific conditions, SCD-split provably reduces the number of disconnected subintervals while maintaining comparable coverage guarantees and interval length compared with CD-split.
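Not SCD-split itself, but a toy picture of the phenomenon it targets: density-level prediction sets (as in CD-split) become unions of disconnected intervals when the conditional density is multimodal, and smoothing the density before thresholding merges components at some cost in total length. The bimodal density, smoothing bandwidth, and 90% mass level are illustrative choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

grid = np.linspace(-4, 4, 801)                        # spacing 0.01
norm = lambda m, s: np.exp(-0.5 * ((grid - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
dens = 0.5 * norm(-1.0, 0.3) + 0.5 * norm(1.0, 0.3)   # bimodal p(y | x)

def highest_density_set(p, mass=0.9):
    """Smallest level set {y : p(y) >= t} holding the target probability mass."""
    order = np.argsort(p)[::-1]
    cum = np.cumsum(p[order]) * 0.01
    keep = np.zeros_like(p, dtype=bool)
    keep[order[: np.searchsorted(cum, mass) + 1]] = True
    return keep

def summarize(keep):
    pieces = int(np.sum(np.diff(keep.astype(int)) == 1) + keep[0])
    return pieces, round(keep.sum() * 0.01, 2)

print("raw:     ", summarize(highest_density_set(dens)))
print("smoothed:", summarize(highest_density_set(gaussian_filter1d(dens, sigma=120))))
# Fewer disconnected pieces after smoothing, at the cost of a longer set.
```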
- Asia > Middle East > Jordan (0.04)
- North America > Canada > British Columbia > Vancouver (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
… about real-world experiments and deep density models, and then answer detailed comments and questions.
We thank the reviewers for their very helpful comments and suggestions. The 100% recall but very poor precision (i.e., it always predicts a shift) is expected. Marginal-KS is very bad because the attack model is very strong, i.e., it mimics the marginal distribution of … Thus, marginal KS will naturally fail, highlighting the limitation of prior work for this adversarial attack. For bootstrapping, does the model need to be fit multiple times? For Gaussian, this is fairly simple. For the detection stage, the FDR was controlled below 0.05 in all … See also Tables 6 and 7 in the appendix.
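The excerpt says the FDR was controlled below 0.05 at the detection stage but does not name the procedure; the Benjamini-Hochberg step-up rule sketched below is the standard choice and is shown purely as an assumed stand-in.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Return a boolean mask of rejected hypotheses with FDR <= alpha."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresh = alpha * np.arange(1, m + 1) / m       # step-up thresholds
    below = p[order] <= thresh
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True                       # reject the k smallest p-values
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9]
print(benjamini_hochberg(pvals))                   # rejects the two smallest
```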
Accelerated Bayesian Optimal Experimental Design via Conditional Density Estimation and Informative Data
Huang, Miao, Wang, Hongqiao, Wu, Kunyu
The Design of Experiments (DOE) is a fundamental scientific methodology that provides researchers with systematic principles and techniques to enhance the validity, reliability, and efficiency of experimental outcomes. In this study, we explore optimal experimental design within a Bayesian framework, utilizing Bayes' theorem to reformulate the utility expectation, originally expressed as a nested double integral, into an independent double integral form, significantly improving numerical efficiency. To further accelerate the computation of the proposed utility expectation, conditional density estimation is employed to approximate the ratio of two Gaussian random fields, while covariance serves as a selection criterion to identify informative data sets during model fitting and integral evaluation. In scenarios characterized by low simulation efficiency and high costs of raw data acquisition, key challenges such as surrogate modeling, failure probability estimation, and parameter inference are systematically restructured within the Bayesian experimental design framework. The effectiveness of the proposed methodology is validated through both theoretical analysis and practical applications, demonstrating its potential for enhancing experimental efficiency and decision-making under uncertainty.
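For context on the nested double integral: the baseline nested Monte Carlo estimator of expected information gain re-estimates the marginal likelihood inside the outer expectation, which is the cost the paper's reformulation and conditional-density-estimation machinery aim to avoid. The linear-Gaussian toy model below is an assumption chosen so the answer has a closed form; it is not the paper's Gaussian-random-field setting.

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA = 0.5

def log_lik(y, theta, d):
    """log N(y | d * theta, SIGMA^2) for the toy model theta ~ N(0, 1)."""
    return -0.5 * ((y - d * theta) / SIGMA) ** 2 - np.log(SIGMA * np.sqrt(2 * np.pi))

def eig_nested(d, N=500, M=500):
    """Nested MC: outer average over (theta, y), inner MC for the marginal p(y | d)."""
    theta = rng.normal(size=N)
    y = d * theta + SIGMA * rng.normal(size=N)
    inner = rng.normal(size=M)                     # fresh prior draws per design
    log_marg = np.array([np.log(np.mean(np.exp(log_lik(yi, inner, d)))) for yi in y])
    return float(np.mean(log_lik(y, theta, d) - log_marg))

for d in (0.1, 0.5, 1.0, 2.0):
    exact = 0.5 * np.log(1.0 + d**2 / SIGMA**2)    # closed-form EIG for this model
    print(f"d={d}: nested MC {eig_nested(d):.3f} vs exact {exact:.3f}")
```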
- North America > United States > Indiana > Hamilton County > Fishers (0.04)
- Asia > China > Hunan Province (0.04)
- Asia > China > Henan Province > Zhengzhou (0.04)