Bridging Compositional and Distributional Semantics: A Survey on Latent Semantic Geometry via AutoEncoder

Zhang, Yingji, Carvalho, Danilo S., Freitas, André

Aug-29-2025–arXiv.org Artificial Intelligence

Integrating compositional and symbolic properties into current distributional semantic spaces can enhance the interpretability, controllability, compositionality, and generalisation capabilities of Transformer-based auto-regressive language models (LMs). In this survey, we offer a novel perspective on latent space geometry through the lens of compositional semantics, a direction we refer to as semantic representation learning. This direction bridges symbolic and distributional semantics, helping to narrow the gap between them. We review and compare three mainstream autoencoder architectures, namely the Variational AutoEncoder (VAE), Vector Quantised VAE (VQVAE), and Sparse AutoEncoder (SAE), and examine the distinctive latent geometries they induce in relation to semantic structure and interpretability.

computational linguistics, machine learning, natural language

Country:
- Europe (1.00)
- Asia > Middle East (0.67)
- North America > United States
  - Minnesota (0.28)

Genre:
- Research Report (1.00)
- Overview (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)