Comparative Analysis of Document-Level Embedding Methods for Similarity Scoring on Shakespeare Sonnets and Taylor Swift Lyrics

Kramer, Klara

arXiv.org Artificial Intelligence 

Document similarity assessment plays an important role in various natural language processing (NLP) applications, such as information retrieval, plagiarism detection, recommendation systems, and question answering [11, 19]. For instance, in recommendation systems, document similarity helps personalise suggestions by finding content that closely matches user preferences. These tasks rely on accurate measurements of how similar documents are in terms of their structure, content, and meaning, and those measurements depend on how each document is represented computationally. The representation usually takes the form of a vector obtained via a document embedding method. Various methodologies can be employed to obtain document-level embeddings, and the choice of method directly impacts the accuracy and usefulness of the similarity scores calculated [14, 19].
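As a rough illustration of the idea that similarity scores are computed from vector representations of documents, the following minimal sketch scores two texts. It rests on two assumptions that are not taken from the text above: the embedding is a toy bag-of-words count vector (a stand-in for whichever embedding method is actually used), and the score is cosine similarity, a common but not the only choice. The function names `embed` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy document embedding (assumption): sparse bag-of-words counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty document has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc_a = "shall I compare thee to a summer's day"
    doc_b = "shall I compare thee to a winter's night"
    print(cosine_similarity(embed(doc_a), embed(doc_b)))
```

Swapping `embed` for a different embedding method changes the scores while leaving the comparison step unchanged, which is why the choice of embedding matters for the resulting similarity values.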
