A Reduction for Efficient LDA Topic Reconstruction

Matteo Almanza, Flavio Chierichetti, Alessandro Panconesi, Andrea Vattani

May-26-2025, 10:49:07 GMT–Neural Information Processing Systems

We present a novel approach for LDA (Latent Dirichlet Allocation) topic reconstruction. The main technical idea is to show that the distribution over documents generated by LDA can be transformed into a distribution for a much simpler generative model, in which documents are generated from the same set of topics but each document has a single topic, chosen uniformly at random. Furthermore, this reduction is approximation-preserving, in the sense that approximate distributions (the only ones we can hope to compute in practice) are mapped into approximate distributions in the simplified world. This opens up the possibility of efficiently reconstructing LDA topics in a roundabout way: compute an approximate document distribution from the given corpus, transform it into an approximate distribution for the single-topic world, and run a reconstruction algorithm in the uniform, single-topic world, which is a much simpler task than direct LDA reconstruction. We show the viability of the approach by giving very simple algorithms for a generalization of two notable cases that have been studied in the literature, p-separability and matrix-like topics.
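To make the contrast between the two generative models concrete, here is a minimal sketch in Python of how a document might be sampled under each. It illustrates only the two models named in the abstract, not the reduction itself, which the abstract does not specify. The function names, the symmetric Dirichlet parameter `alpha`, and the representation of a topic as a list of word probabilities are all assumptions made for illustration.

```python
import random


def sample_dirichlet(alpha, k, rng):
    # Symmetric Dirichlet(alpha) over k topics, via normalized Gamma draws.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_word(topic, rng):
    # A topic is a list of word probabilities; return one word index.
    return rng.choices(range(len(topic)), weights=topic, k=1)[0]


def lda_document(topics, alpha, length, rng):
    # LDA: draw a per-document topic mixture, then a topic for every word.
    mixture = sample_dirichlet(alpha, len(topics), rng)
    doc = []
    for _ in range(length):
        t = rng.choices(range(len(topics)), weights=mixture, k=1)[0]
        doc.append(sample_word(topics[t], rng))
    return doc


def single_topic_document(topics, length, rng):
    # Simplified model: one topic chosen uniformly at random for the
    # whole document; every word is drawn from that topic.
    t = rng.randrange(len(topics))
    return [sample_word(topics[t], rng) for _ in range(length)]
```

With two topics on disjoint vocabularies, for instance, a single-topic document contains words from only one of them, whereas an LDA document may mix both. That difference in structure is what makes the single-topic world the easier setting for reconstruction.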



