The Shape of Word Embeddings: Recognizing Language Phylogenies through Topological Data Analysis
Draganov, Ondřej, Skiena, Steven
arXiv.org Artificial Intelligence
Word embeddings are well-established objects of interest in natural language processing, being d-dimensional vector representations that capture the semantics of each vocabulary word. The vocabulary of language L can thus be viewed as a cloud of points, whose geometric and structural properties encode considerable information about the language. In this paper, we will demonstrate that, even after being disassociated from their bindings to particular words, the "shape" of these point clouds reflects the history of the languages they represent, by using techniques from topological data analysis (TDA), a field studying spatial aspects of data.
Comparing the Shapes of Word Embeddings using Topological Data Analysis (TDA)
Through our experimental setup on language phylogeny reconstruction, we will test two related properties concerning the geometric structure of word embeddings. First, we will show that the shape of the unlabeled word embedding of a language carries information about its history and structure. Second, we will show that this information can be at least partially recovered through persistent homology, a standard tool in TDA.
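To make "persistent homology of an unlabeled point cloud" concrete, here is a minimal sketch of its simplest case, 0-dimensional persistence (connected components) under the Vietoris-Rips filtration. It is only an illustration and not the authors' pipeline: the function name `h0_death_times` and the toy `cloud` are hypothetical, and real word embeddings would be far larger and higher-dimensional. It relies on the standard fact that every component is born at scale 0 and dies at the length of the minimum-spanning-tree edge that merges it into another component.

```python
import itertools
import math


def h0_death_times(points):
    """Death times of 0-dimensional classes in the Vietoris-Rips filtration.

    Hypothetical helper for illustration. Runs Kruskal's algorithm with a
    union-find structure: each edge that merges two components kills one class.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise Euclidean distances, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components
            parent[ri] = rj
            deaths.append(length)
    return deaths


# Toy stand-in for an embedding: two tight clusters that are far apart.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0), (11.0, 0.0)]
deaths = h0_death_times(cloud)
print(deaths)
```

The list of death times depends only on pairwise distances, not on which word each point belongs to, which is the sense in which the shape is "unlabeled". One long-lived class among short-lived ones indicates well-separated clusters.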
Mar-30-2024
- Country:
- Europe
- North America > United States (0.14)
- Genre:
- Research Report > New Finding (0.93)
- Technology: