Ranking-Enhanced Unsupervised Sentence Representation Learning

Yeon Seonwoo, Guoyin Wang, Changmin Seo, Sajal Choudhary, Jiwei Li, Xiang Li, Puyang Xu, Sunghyun Park, Alice Oh

arXiv.org Artificial Intelligence 

Unsupervised sentence representation learning has progressed through contrastive learning and data augmentation methods such as dropout masking. Despite this progress, sentence encoders are still limited to using only an input sentence when predicting its semantic vector. In this work, we show that the semantic meaning of a sentence is also determined by nearest-neighbor sentences that are similar to the input sentence. Based on this finding, we propose a novel unsupervised sentence encoder, RankEncoder. RankEncoder predicts the semantic vector of an input sentence by leveraging its relationship with other sentences in an external corpus, as well as the input sentence itself. We evaluate RankEncoder on semantic textual similarity benchmark datasets. From the experimental results, we verify that 1) RankEncoder achieves 80.07% Spearman's correlation, a 1.1% absolute improvement compared to the previous state-of-the-art performance, and 2) RankEncoder is universally applicable to existing unsupervised sentence embedding methods.

Figure 1: Vector representations of sentences and their neighbor sentences. The neighbor sentences reveal that (a, c) share more semantic meaning than (a, b). This captures more accurate semantic similarity scores than their vectors.
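The core idea can be illustrated with a minimal sketch: represent a sentence by the ranks of its similarities to sentences in an external corpus, and compare two sentences through their rank vectors. The sketch below is an illustrative approximation under stated assumptions, not the paper's exact formulation; the function names `rank_vector` and `rank_similarity` are hypothetical, and random arrays stand in for embeddings from a base unsupervised encoder such as SimCSE.

```python
import numpy as np
from scipy.stats import rankdata

def rank_vector(query_emb, corpus_emb):
    """Represent a sentence by the ranks of its similarities to an external corpus.

    query_emb:  (d,) embedding of the input sentence from a base encoder.
    corpus_emb: (n, d) embeddings of n corpus sentences from the same encoder.
    Returns an (n,) vector of similarity ranks (higher rank = more similar).
    """
    q = query_emb / np.linalg.norm(query_emb)
    c = corpus_emb / np.linalg.norm(corpus_emb, axis=1, keepdims=True)
    sims = c @ q                # cosine similarity to every corpus sentence
    return rankdata(sims)       # ranks 1..n, ties averaged

def rank_similarity(ra, rb):
    """Score two sentences as the cosine of their centered rank vectors
    (equivalent to the Spearman correlation of the underlying similarities)."""
    ra, rb = ra - ra.mean(), rb - rb.mean()
    return float(ra @ rb / (np.linalg.norm(ra) * np.linalg.norm(rb)))

# Toy usage: random embeddings stand in for a base encoder's outputs.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 768))            # external corpus embeddings
a, b = rng.normal(size=768), rng.normal(size=768)
print(rank_similarity(rank_vector(a, corpus), rank_vector(b, corpus)))
```

In the paper's framing, the base embeddings come from a pretrained unsupervised encoder and the corpus-relative signal is combined with the input sentence's own vector, rather than replacing it, as stated in the abstract above.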
