CSE-SFP: Enabling Unsupervised Sentence Representation Learning via a Single Forward Pass
Zhang, Bowen, Song, Zixin, Li, Chunping
–arXiv.org Artificial Intelligence
As a fundamental task in Information Retrieval and Computational Linguistics, sentence representation has profound implications for a wide range of practical applications such as text clustering, content analysis, question-answering systems, and web search. Recent advances in pre-trained language models (PLMs) have driven remarkable progress in this field, particularly through unsupervised embedding derivation methods centered on discriminative PLMs like BERT. However, due to time and computational constraints, few efforts have attempted to integrate unsupervised sentence representation with generative PLMs, which typically possess much larger parameter sizes. Given that state-of-the-art models in both academia and industry are predominantly based on generative architectures, there is a pressing need for an efficient unsupervised text representation framework tailored to decoder-only PLMs. To address this concern, we propose CSE-SFP, an innovative method that exploits the structural characteristics of generative models. Compared to existing strategies, CSE-SFP requires only a single forward pass to perform effective unsupervised contrastive learning. Rigorous experimentation demonstrates that CSE-SFP not only produces higher-quality embeddings but also significantly reduces both training time and memory consumption. Furthermore, we introduce two ratio metrics that jointly assess alignment and uniformity, thereby providing a more robust means for evaluating the semantic spatial properties of encoding models.
arXiv.org Artificial Intelligence
May-2-2025
- Country:
- Asia
- China
- Beijing > Beijing (0.04)
- Henan Province > Zhengzhou (0.04)
- Hong Kong (0.04)
- Tianjin Province > Tianjin (0.04)
- Japan > Kyūshū & Okinawa
- Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
- Middle East
- Jordan (0.04)
- UAE > Abu Dhabi Emirate
- Abu Dhabi (0.04)
- Singapore (0.04)
- China
- Europe
- Croatia > Dubrovnik-Neretva County
- Dubrovnik (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Italy (0.05)
- Middle East > Malta
- Eastern Region > Northern Harbour District > St. Julian's (0.04)
- Spain > Galicia
- Madrid (0.04)
- Switzerland (0.04)
- Croatia > Dubrovnik-Neretva County
- North America
- Canada
- British Columbia > Metro Vancouver Regional District
- Vancouver (0.04)
- Ontario > Toronto (0.04)
- British Columbia > Metro Vancouver Regional District
- Dominican Republic (0.04)
- United States
- California > Los Angeles County
- Long Beach (0.04)
- Florida > Miami-Dade County
- Miami (0.04)
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New York > New York County
- New York City (0.04)
- Washington > King County
- Seattle (0.04)
- California > Los Angeles County
- Canada
- Asia
- Genre:
- Research Report > Promising Solution (0.86)
- Technology: