Stochastic Ratio Matching of RBMs for Sparse High-Dimensional Inputs
–Neural Information Processing Systems
Sparse high-dimensional data vectors are common in many application domains where a very large number of rarely non-zero features can be devised. Unfortunately, this creates a computational bottleneck for unsupervised feature learning algorithms such as those based on auto-encoders and RBMs, because they involve a reconstruction step where the whole input vector is predicted from the current feature values. An algorithm was recently developed to successfully handle the case of auto-encoders, based on an importance sampling scheme stochastically selecting which input elements to actually reconstruct during training for each particular example. To generalize this idea to RBMs, we propose a stochastic ratio-matching algorithm that inherits all the computational advantages and unbiasedness of the importance sampling scheme. We show that stochastic ratio matching is a good estimator, allowing the approach to beat the state-of-the-art on two bag-of-word text classification benchmarks (20 Newsgroups and RCV1), while keeping computational cost linear in the number of non-zeros.
Neural Information Processing Systems
Mar-13-2024, 17:50:49 GMT
- Country:
- North America
- United States > New York
- New York County > New York City (0.04)
- Canada > Quebec
- Montreal (0.04)
- United States > New York
- North America
- Technology: