Non-negative matrix factorization algorithms greatly improve topic model fits
Carbonetto, Peter; Sarkar, Abhishek; Wang, Zihao; Stephens, Matthew
We report on the potential for using algorithms for non-negative matrix factorization (NMF) to improve parameter estimation in topic models. While several papers have studied connections between NMF and topic models, none have suggested leveraging these connections to develop new algorithms for fitting topic models. Importantly, NMF avoids the "sum-to-one" constraints on the topic model parameters, resulting in an optimization problem with simpler structure and more efficient computations. Building on recent advances in optimization algorithms for NMF, we show that first solving the NMF problem and then recovering the topic model fit can produce remarkably better fits, and in less time, than standard algorithms for topic models. While we focus primarily on maximum likelihood estimation, we show that this approach also has the potential to improve variational inference for topic models. Our methods are implemented in the R package fastTopics.
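The "solve NMF, then recover the topic model" step can be illustrated with a small sketch. It assumes the NMF is of the Poisson form, with count matrix X modeled as Poisson with rates L F^T, and it uses the standard rescaling that relates such a factorization to a multinomial topic model. The function name and the plain-list representation are hypothetical choices for illustration only; this is not the fastTopics API, and the paper's exact procedure may differ in detail.

```python
def poisson_nmf_to_topic_model(L, F):
    """Rescale Poisson NMF factors into topic model parameters (a sketch).

    L: n x K nested list of non-negative loadings (one row per document).
    F: m x K nested list of non-negative factors (one row per word).
    Assumes every column of F and every rescaled row of L has a positive sum.
    Returns (sizes, L_star, F_star), where each column of F_star and each
    row of L_star sums to one.
    """
    K = len(F[0])
    # Column sums of F: the scale carried by each unnormalized factor.
    u = [sum(row[k] for row in F) for k in range(K)]
    # Topics: rescale each column of F to sum to one over words.
    F_star = [[row[k] / u[k] for k in range(K)] for row in F]
    # Move the column scales into the loadings, then normalize each row.
    scaled = [[row[k] * u[k] for k in range(K)] for row in L]
    sizes = [sum(row) for row in scaled]
    L_star = [[v / s for v in row] for row, s in zip(scaled, sizes)]
    return sizes, L_star, F_star


# Hypothetical toy factors: 2 documents, 3 words, K = 2.
L = [[1.0, 2.0], [3.0, 0.5]]
F = [[0.2, 1.0], [0.3, 1.0], [0.5, 2.0]]
sizes, L_star, F_star = poisson_nmf_to_topic_model(L, F)
```

Under these assumptions, the Poisson rate for document i and word j equals sizes[i] times the topic model probability sum_k L_star[i][k] * F_star[j][k], so the sum-to-one constraints are only imposed after the unconstrained NMF problem has been solved.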
May-27-2021
- Genre:
- Research Report (1.00)