Features-based embedding or Feature-grounding
arXiv.org Artificial Intelligence
Pre-trained language models such as BERT [3] have become foundational in modern natural language processing, owing to their ability to capture rich contextual representations from large-scale corpora. However, it remains unclear where and how knowledge extracted from training data is represented inside the model, and how this knowledge can be shared between structurally similar models. This work introduces a method for word-embedding initialization that encodes domain-specific knowledge into the model's internal representations, producing feature-grounded embeddings. Such embeddings provide a structured prior on the internal weight landscape during LLM training.
Jul-1-2025