How Important Is Size? An Investigation of Corpus Size and Meaning in Both Latent Semantic Analysis and Latent Dirichlet Allocation

Crossley, Scott (Georgia State University) | Dascalu, Mihai (University Politehnica of Bucharest) | McNamara, Danielle (Arizona State University)

AAAI Conferences 

This study examines how differences in corpus size influence the accuracy of Latent Semantic Analysis (LSA) spaces and Latent Dirichlet Allocation (LDA) spaces in two tasks: a word association task and a vocabulary definition test. Specific optimizations were considered in building each semantic model. Initial results indicate that larger corpora lead to greater accuracy and that LDA probabilistic models, similar to LSA vector spaces, can provide insights into cognitive processing at semantic levels.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found