Suitability of CCA for Generating Latent State/ Variables in Multi-View Textual Data

Mehndiratta, Akanksha, Asawa, Krishna

arXiv.org Artificial Intelligence 

The probabilistic interpretation of Canonical Correlation Analysis (CCA) for learning low-dimensional real vectors, called as latent variables, has been exploited immensely in various fields. This study takes a step further by demonstrating the potential of CCA in discovering a latent state that captures the contextual information within the textual data under a two-view setting. The interpretation of CCA discussed in this study utilizes the multi-view nature of textual data, i.e. the consecutive sentences in a document or turns in a dyadic conversation, and has a strong theoretical foundation. Furthermore, this study proposes a model using CCA to perform the Automatic Short Answer Grading (ASAG) task. The empirical analysis confirms that the proposed model delivers competitive results and can even beat various sophisticated supervised techniques. The model is simple, linear, and adaptable and should be used as the baseline especially when labeled training data is scarce or nonexistent.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found