Suitability of CCA for Generating Latent State/Variables in Multi-View Textual Data

Jun-18-2024–arXiv.org Artificial Intelligence

The probabilistic interpretation of Canonical Correlation Analysis (CCA) for learning low-dimensional real vectors, called latent variables, has been widely exploited across many fields. This study goes a step further by demonstrating the potential of CCA to discover a latent state that captures the contextual information in textual data under a two-view setting. The interpretation of CCA discussed in this study has a strong theoretical foundation and exploits the multi-view nature of textual data, i.e., consecutive sentences in a document or consecutive turns in a dyadic conversation. Furthermore, this study proposes a CCA-based model for the Automatic Short Answer Grading (ASAG) task. The empirical analysis confirms that the proposed model delivers competitive results and can even outperform various sophisticated supervised techniques. The model is simple, linear, and adaptable, and should be used as a baseline, especially when labeled training data is scarce or nonexistent.
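As a rough illustration of the two-view setting described above, the following is a minimal sketch of classical linear CCA in NumPy. It is not the model proposed in the paper. The function name `fit_cca`, the regularization constant `reg`, and the synthetic data are all assumptions made for this example; the two matrices stand in for feature vectors of paired text units (for instance, consecutive sentences), which in practice would come from some text representation not specified here.

```python
import numpy as np


def fit_cca(X, Y, k, reg=1e-6):
    """Classical linear CCA via whitening and SVD (illustrative sketch).

    X: (n, dx) view-1 features, Y: (n, dy) view-2 features, rows are paired.
    Returns the view means, the projection matrices, and the top-k
    canonical correlations.
    """
    n = X.shape[0]
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my

    # Regularized covariance and cross-covariance estimates.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations, in descending order.
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    A = Wx @ U[:, :k]
    B = Wy @ Vt.T[:, :k]
    return mx, my, A, B, s[:k]


# Synthetic stand-in for two views generated from a shared 2-d latent state.
# (Hypothetical data; real views would be text feature vectors.)
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
X = z @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(500, 6))
Y = z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(500, 5))

mx, my, A, B, corrs = fit_cca(X, Y, k=3)

# Low-dimensional projections of each view.
latent_x = (X - mx) @ A
latent_y = (Y - my) @ B
print(corrs)
```

In this toy setup the first two canonical correlations should be high and the third low, reflecting the two-dimensional shared latent state used to generate the data.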

cca, interpretation, latent state


Country:
- North America > United States
  - California > San Francisco County > San Francisco (0.04)
- Asia
  - Middle East > Jordan (0.05)
  - India > Uttar Pradesh (0.04)

Genre:
- Research Report (0.84)

Industry:
- Law (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Text Processing (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
