pseudocoreset
Function Space Bayesian Pseudocoreset for Bayesian Neural Networks
A Bayesian pseudocoreset is a compact synthetic dataset that summarizes the essential information of a large-scale dataset and can therefore be used as a proxy dataset for scalable Bayesian inference. Typically, a Bayesian pseudocoreset is constructed by minimizing a divergence measure between the posterior conditioned on the pseudocoreset and the posterior conditioned on the full dataset. However, evaluating the divergence can be challenging, particularly for models such as deep neural networks that have high-dimensional parameters.
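The construction described above can be illustrated on a toy problem where both posteriors are available in closed form. The following is a minimal sketch, not the method of any paper listed here: it assumes a one-dimensional conjugate Gaussian model (prior N(0, 1), unit-variance likelihood), weights each pseudo-point by N/M so the two posterior variances can match, and minimizes the reverse KL divergence by plain gradient descent. All function names, the weighting scheme, and the step size (chosen for N = 1000, M = 5) are assumptions made for illustration.

```python
import numpy as np

def gaussian_posterior(data, weight=1.0, prior_var=1.0, noise_var=1.0):
    """Posterior N(mean, var) over theta for theta ~ N(0, prior_var) and
    x_i | theta ~ N(theta, noise_var), with each likelihood term scaled by `weight`."""
    precision = 1.0 / prior_var + weight * len(data) / noise_var
    mean = weight * np.sum(data) / noise_var / precision
    return mean, 1.0 / precision

def kl_gaussians(mean_q, var_q, mean_p, var_p):
    """KL( N(mean_q, var_q) || N(mean_p, var_p) ) for univariate Gaussians."""
    return 0.5 * (np.log(var_p / var_q) + (var_q + (mean_q - mean_p) ** 2) / var_p - 1.0)

def fit_pseudocoreset(x, num_pseudo=5, steps=200, lr=0.002, seed=0):
    """Learn pseudo-points u by gradient descent on KL(pi_u || pi_x)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=num_pseudo)      # random initial pseudo-points
    weight = len(x) / num_pseudo         # assumed weighting: each pseudo-point counts N/M times
    mean_x, var_x = gaussian_posterior(x)
    for _ in range(steps):
        mean_u, var_u = gaussian_posterior(u, weight)
        # Chain rule with noise_var = 1:
        #   dKL/dmean_u = (mean_u - mean_x) / var_x,  dmean_u/du_j = weight * var_u
        grad = (mean_u - mean_x) / var_x * weight * var_u
        u = u - lr * grad                # the same scalar gradient applies to every u_j
    return u, weight

rng = np.random.default_rng(1)
x = rng.normal(loc=2.0, scale=1.0, size=1000)   # "full" dataset
u, weight = fit_pseudocoreset(x)
kl = kl_gaussians(*gaussian_posterior(u, weight), *gaussian_posterior(x))
print(f"KL(pi_u || pi_x) = {kl:.2e}")
```

In this toy setting the divergence can be driven to numerical zero because the posterior depends on the data only through its sum. For deep neural networks neither posterior has a closed form, which is the difficulty the abstract points to.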
A Proofs
Evaluation: The evaluation methods we used are summarized in Algorithm 2 and Algorithm 3. We summarize the hyperparameters used for our evaluations in Table 3. Table 3: Hyperparameters used for evaluations. B.2 Implementation details for BPC-rKL: To obtain a Bayesian pseudocoreset with reverse KL divergence by Algorithm 1 in [19], we need to Require: Differentiable augmentation function A (optional). In Figure 1b, we show the accuracy with varying variances. 's are presented as colors. Table 5 shows additional results for the CIFAR10 dataset when the pseudocoreset size is larger. Even in these cases, BPC-W and BPC-fKL effectively generate Bayesian pseudocoresets.
ByPE-VAE: Bayesian Pseudocoresets Exemplar VAE
Recent studies show that advanced priors play a major role in deep generative models. Exemplar VAE, as a variant of VAE with an exemplar-based prior, has achieved impressive results. However, due to the nature of model design, an exemplar-based model usually requires vast amounts of data to participate in training, which leads to huge computational complexity. To address this issue, we propose Bayesian Pseudocoresets Exemplar VAE (ByPE-VAE), a new variant of VAE with a prior based on a Bayesian pseudocoreset. The proposed prior is conditioned on a small-scale pseudocoreset rather than the whole dataset, which reduces the computational cost and avoids overfitting. Simultaneously, we obtain the optimal pseudocoreset via a stochastic optimization algorithm during VAE training, aiming to minimize the Kullback-Leibler divergence between the prior based on the pseudocoreset and that based on the whole dataset. Experimental results show that ByPE-VAE can achieve competitive improvements over the state-of-the-art VAEs in the tasks of density estimation, representation learning, and generative data augmentation. In particular, on a basic VAE architecture, ByPE-VAE is up to 3 times faster than Exemplar VAE while nearly matching its performance. Code is available at https://github.com/Aiqz/ByPE-VAE.
On Divergence Measures for Bayesian Pseudocoresets
A Bayesian pseudocoreset is a small synthetic dataset for which the posterior over parameters approximates that of the original dataset. While promising, the scalability of Bayesian pseudocoresets is not yet validated in large-scale problems such as image classification with deep neural networks. On the other hand, dataset distillation methods similarly construct a small dataset such that the optimization with the synthetic dataset converges to a solution similar to optimization with full data. Although dataset distillation has been empirically verified in large-scale settings, the framework is restricted to point estimates, and their adaptation to Bayesian inference has not been explored. This paper casts two representative dataset distillation algorithms as approximations to methods for constructing pseudocoresets by minimizing specific divergence measures: reverse KL divergence and Wasserstein distance. Furthermore, we provide a unifying view of such divergence measures in Bayesian pseudocoreset construction. Finally, we propose a novel Bayesian pseudocoreset algorithm based on minimizing forward KL divergence. Our empirical results demonstrate that the pseudocoresets constructed from these methods reflect the true posterior even in large-scale Bayesian inference problems.
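For orientation, the three divergence measures named in the abstract can be written side by side. This is a sketch in assumed notation, not taken from the paper: $\pi_x$ denotes the posterior given the full dataset $x$, $\pi_u$ the posterior given the pseudocoreset $u$, and the order of the Wasserstein distance is left unspecified.

```latex
\begin{align*}
\text{reverse KL:}  \quad & u^\star = \arg\min_{u}\; D_{\mathrm{KL}}\left(\pi_u \,\|\, \pi_x\right)
  = \arg\min_{u}\; \mathbb{E}_{\theta \sim \pi_u}\left[\log \frac{\pi_u(\theta)}{\pi_x(\theta)}\right], \\
\text{forward KL:}  \quad & u^\star = \arg\min_{u}\; D_{\mathrm{KL}}\left(\pi_x \,\|\, \pi_u\right)
  = \arg\min_{u}\; \mathbb{E}_{\theta \sim \pi_x}\left[\log \frac{\pi_x(\theta)}{\pi_u(\theta)}\right], \\
\text{Wasserstein:} \quad & u^\star = \arg\min_{u}\; W\left(\pi_u, \pi_x\right).
\end{align*}
```

The two KL objectives differ in which posterior the expectation is taken under: the reverse form averages over samples from the pseudocoreset posterior, the forward form over samples from the full-data posterior.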
A Derivation of Eq. 9
We already report the t-SNE visualization of ByPE-VAE and standard VAE in Figure. Figure 6: t-SNE visualization of learned latent representations, colored by labels. Second, we give more generated samples in Fig. 8, among Figure 7: Random samples drawn from ByPE-VAEs trained on different datasets. Figure 8: Samples generated by ByPE-VAE based on the same pseudodata point in each plate. In Section 5.2, we only report the KNN results of MNIST and Fashion MNIST in Figure 1.