Lexinvariant Language Models
Eric Zelikman, Gregory Valiant
Token embeddings, a mapping from discrete lexical symbols to continuous vectors, are at the heart of any language model (LM). However, lexical symbol meanings can also be determined and even redefined by their structural role in a long context. In this paper, we ask: is it possible for a language model to be performant without any fixed token embeddings? Such a language model would have to rely entirely on the co-occurrence and repetition of tokens in the context rather than the a priori identity of any token. To answer this, we study lexinvariant language models that are invariant to lexical symbols and therefore do not need fixed token embeddings in practice.
Neural Information Processing Systems
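To make the notion of lexinvariance concrete, below is a minimal sketch of one plausible way to drop fixed token embeddings: resample a random Gaussian embedding table for each sequence, so a token's vector is consistent within a context but carries no a priori identity across contexts, leaving only co-occurrence and repetition as usable signal. The class name `LexinvariantEmbedding` and the initialization scale are illustrative assumptions, not necessarily the paper's exact implementation.

```python
import torch
import torch.nn as nn


class LexinvariantEmbedding(nn.Module):
    """Token embedding layer with no fixed per-token parameters.

    For each sequence, every token id is mapped to a freshly sampled
    random Gaussian vector, held consistent within that sequence but
    resampled across sequences. The downstream model can therefore only
    exploit co-occurrence and repetition of tokens in context, never a
    token's a priori identity.
    """

    def __init__(self, vocab_size: int, d_model: int):
        super().__init__()
        self.vocab_size = vocab_size
        self.d_model = d_model

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) of integer ids in [0, vocab_size).
        batch, _ = token_ids.shape
        # Fresh random embedding table per sequence: (batch, vocab, d_model).
        table = torch.randn(
            batch, self.vocab_size, self.d_model, device=token_ids.device
        ) / self.d_model ** 0.5
        # Look up each sequence's tokens in its own random table.
        idx = token_ids.unsqueeze(-1).expand(-1, -1, self.d_model)
        return torch.gather(table, 1, idx)


# Example usage (hypothetical sizes): repeated ids within a sequence share
# a vector, but the same id in another sequence gets an unrelated vector.
emb = LexinvariantEmbedding(vocab_size=32, d_model=16)
ids = torch.randint(0, 32, (2, 10))
print(emb(ids).shape)  # torch.Size([2, 10, 16])
```

Note that materializing a full per-sequence table is wasteful for large vocabularies; sampling vectors only for the token ids that actually occur in each sequence would serve the same purpose.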