Stacked Density Estimation
Smyth, Padhraic; Wolpert, David
Neural Information Processing Systems
One frequently estimates density functions for which there is little prior knowledge about the shape of the density and for which one wants a flexible and robust estimator (allowing multimodality if it exists). In this context, the methods of choice tend to be finite mixture models and kernel density estimation methods. For mixture modeling, mixtures of Gaussian components are frequently assumed, and model choice reduces to the problem of choosing the number k of Gaussian components in the model (Titterington, Smith and Makov, 1986). For kernel density estimation, kernel shapes are typically chosen from a selection of simple unimodal densities such as Gaussian, triangular, or Cauchy densities, and kernel bandwidths are selected in a data-driven manner (Silverman 1986; Scott 1994).

As argued by Draper (1996), model uncertainty can contribute significantly to predictive error in estimation. While usually considered in the context of supervised learning, model uncertainty is also important in unsupervised learning applications such as density estimation. Even when the model class under consideration contains the true density, if we are only given a finite data set, then there is always a chance of selecting the wrong model. Moreover, even if the correct model is selected, there will typically be estimation error in the parameters of that model.

* Also with the Jet Propulsion Laboratory 525-3660, California Institute of Technology, Pasadena, CA 91109
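As a rough illustration of data-driven bandwidth selection and of how model choice affects predictive quality, the sketch below fits a Gaussian kernel density estimate to a synthetic bimodal sample and scores three candidate bandwidths by mean held-out log-likelihood. Everything in it is an assumption made for illustration rather than taken from the paper: the function names, the synthetic data, the train/test split, and the use of the normal reference rule h = 1.06 σ n^(-1/5), which is only one of several common data-driven bandwidth choices.

```python
import math
import random
import statistics


def gaussian_kde(train, h):
    """Return a function x -> density estimate using Gaussian kernels of bandwidth h."""
    n = len(train)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in train)

    return density


def normal_reference_bandwidth(data):
    """Normal reference rule of thumb: h = 1.06 * sigma * n^(-1/5)."""
    return 1.06 * statistics.stdev(data) * len(data) ** (-0.2)


def mean_log_likelihood(density, points):
    """Average log-density of held-out points under the fitted estimate."""
    return sum(math.log(density(x)) for x in points) / len(points)


# Synthetic bimodal sample (illustrative only): equal mixture of N(-2, 0.5^2) and N(2, 0.5^2).
rng = random.Random(0)
data = [
    rng.gauss(-2.0, 0.5) if rng.random() < 0.5 else rng.gauss(2.0, 0.5)
    for _ in range(400)
]
train, test = data[:300], data[300:]

# Candidate models: the rule-of-thumb bandwidth plus a narrower and a much wider one.
h_ref = normal_reference_bandwidth(train)
candidates = {"h_ref/4": h_ref / 4, "h_ref": h_ref, "4*h_ref": 4 * h_ref}

# Held-out log-likelihood for each candidate bandwidth.
scores = {
    name: mean_log_likelihood(gaussian_kde(train, h), test)
    for name, h in candidates.items()
}

for name, score in scores.items():
    print(f"{name:8s} h={candidates[name]:.3f} mean held-out log-lik={score:.3f}")
```

The spread of held-out scores across bandwidths gives a small-scale picture of the model uncertainty discussed above: with a finite sample, the bandwidth that scores best on one split need not be the best on another.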
Dec-31-1998