Geometric Signatures of Compositionality Across a Language Model's Lifetime
Lee, Jin Hwa; Jiralerspong, Thomas; Yu, Lei; Bengio, Yoshua; Cheng, Emily
arXiv.org Artificial Intelligence
Compositionality, the notion that the meaning of an expression is constructed from the meaning of its parts and syntactic rules, permits the infinite productivity of human language. For the first time, artificial language models (LMs) are able to match human performance in a number of compositional generalization tasks. However, much remains to be understood about the representational mechanisms underlying these abilities. We take a high-level geometric approach to this problem by relating the degree of compositionality in a dataset to the intrinsic dimensionality of its representations under an LM, a measure of feature complexity. We find not only that the degree of dataset compositionality is reflected in representations' intrinsic dimensionality, but that the relationship between compositionality and geometric complexity arises due to learned linguistic features over training. Finally, our analyses reveal a striking contrast between linear and nonlinear dimensionality, showing that they respectively encode formal and semantic aspects of linguistic composition.
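To make the contrast between linear and nonlinear dimensionality concrete, here is a minimal sketch, not the authors' implementation. It assumes two common estimators that the abstract does not name: the number of principal components needed to explain 99% of the variance as the linear measure, and the TwoNN maximum-likelihood estimate as the nonlinear one. The function names, the 0.99 threshold, and the toy data are all hypothetical; in practice the rows of `X` would be LM representations of the dataset's inputs.

```python
import numpy as np


def pca_dimension(X, variance_threshold=0.99):
    """Linear dimension: number of principal components needed to reach
    the given fraction of total variance (the threshold is an assumption)."""
    Xc = X - X.mean(axis=0)
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    explained = np.cumsum(singular_values**2) / np.sum(singular_values**2)
    # First index whose cumulative variance reaches the threshold, as a count.
    return int(np.searchsorted(explained, variance_threshold) + 1)


def twonn_dimension(X):
    """Nonlinear intrinsic dimension via the TwoNN maximum-likelihood
    estimate, based on the ratio of second to first neighbor distances."""
    # Dense pairwise distances: fine for a sketch, memory-heavy for large N.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt(np.sum(diff**2, axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own neighbors
    nearest = np.partition(dist, 1, axis=1)[:, :2]
    r1, r2 = nearest[:, 0], nearest[:, 1]
    keep = r1 > 0  # drop exact duplicates, where the ratio is undefined
    mu = r2[keep] / r1[keep]
    return float(keep.sum() / np.sum(np.log(mu)))


# Toy data (hypothetical): a closed curve, i.e. a 1-D manifold, in 4-D space.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=1000)
X_demo = np.stack(
    [np.cos(theta), np.sin(theta), np.cos(2 * theta), np.sin(2 * theta)],
    axis=1,
)

print("linear (PCA) dimension:", pca_dimension(X_demo))
print("nonlinear (TwoNN) dimension:", round(twonn_dimension(X_demo), 2))
```

On this toy curve the PCA count is 4, because the variance is spread over all four coordinates, while the TwoNN estimate should come out close to 1. The two measures can therefore diverge sharply on the same set of points.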
Oct-7-2024
- Country:
  - Africa > Sudan (0.04)
  - South America > Colombia
    - Meta Department > Villavicencio (0.04)
  - North America
    - Dominican Republic (0.04)
    - United States > Washington
      - King County > Seattle (0.04)
    - Mexico > Mexico City
      - Mexico City (0.04)
    - Canada
  - Europe
    - United Kingdom > England
      - Greater London > London (0.04)
    - Spain > Catalonia
      - Barcelona Province > Barcelona (0.04)
    - Netherlands > South Holland
      - The Hague (0.04)
    - Latvia > Lubāna Municipality
      - Lubāna (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
  - Asia
- Genre:
  - Research Report
    - New Finding (1.00)
    - Experimental Study (1.00)
- Industry:
  - Education (0.93)
  - Health & Medicine > Therapeutic Area
    - Neurology (0.46)
- Technology: