CPU- and GPU-based Distributed Sampling in Dirichlet Process Mixtures for Large-scale Analysis
Dinari, Or; Zamir, Raz; Fisher, John W. III; Freifeld, Oren
In unsupervised learning, Bayesian Nonparametric (BNP) mixture models, exemplified by the Dirichlet-Process Mixture Model (DPMM), provide a principled approach for Bayesian modeling while adapting the model complexity to the data. This contrasts with finite mixture models, whose complexity is determined manually or via model-selection methods. To fix ideas, an important DPMM example is the Dirichlet-Process Gaussian Mixture Model (DPGMM), a Bayesian infinite-dimensional extension of the classical Gaussian Mixture Model (GMM). Despite their potential, and although researchers have used them successfully in numerous applications during the last two decades, DPMMs still do not enjoy wide popularity among practitioners, largely due to computational bottlenecks in current algorithms and/or implementations. In particular, one of the missing pieces is the availability of software tools that: 1) can efficiently handle DPMM inference in large datasets; 2) are user-friendly and can also be easily modified. We argue that for DPMMs to become a practical choice for large-scale data analysis, implementations of DPMM inference must leverage parallel- and distributed-computing resources (as an analogy, consider how advances in GPU computing and GPU software contributed to the success of deep learning). This is due not only to potential speedups but also to memory and storage considerations. This is especially true, for example, in distributed mobile robotic sensing applications, where multiple autonomous agents working together have limited computational and communication resources. As another motivating example, consider unsupervised data-analysis tasks in large and high-dimensional computer-vision datasets.
Apr-19-2022
- Country:
- Asia > Middle East
- Israel > Southern District > Beer-Sheva (0.04)
- Europe > United Kingdom
- England > Cambridgeshire > Cambridge (0.04)
- North America > United States
- Massachusetts > Middlesex County > Cambridge (0.14)
- Genre:
- Research Report (0.64)