Hierarchically Supervised Latent Dirichlet Allocation
Perotte, Adler J., Wood, Frank, Elhadad, Noemie, Bartlett, Nicholas
Neural Information Processing Systems
We introduce hierarchically supervised latent Dirichlet allocation (HSLDA), a model for hierarchically and multiply labeled bag-of-word data. Examples of such data include web pages and their placement in directories, product descriptions and associated categories from product hierarchies, and free-text clinical records and their assigned diagnosis codes. Out-of-sample label prediction is the primary goal of this work, but improved lower-dimensional representations of the bag-of-word data are also of interest. We demonstrate HSLDA on large-scale data from clinical document labeling and retail product categorization tasks. We show that leveraging the structure of the hierarchical labels substantially improves out-of-sample label prediction compared to models that ignore this structure.
Dec-31-2011
- Country:
- North America > United States (0.28)
- Genre:
- Research Report > New Finding (0.46)