Understanding the Expressive Power and Mechanisms of Transformer for Sequence Modeling
Neural Information Processing Systems
Our study reveals the roles of critical parameters in the Transformer, such as the number of layers and the number of attention heads.
Oct-9-2025, 22:05:56 GMT
- Country:
  - South America > Chile
  - Europe > United Kingdom
    - England > Cambridgeshire > Cambridge (0.04)
  - Asia
    - Middle East > Jordan (0.04)
    - China > Beijing
      - Beijing (0.04)
- Genre:
  - Research Report
    - New Finding (1.00)
    - Experimental Study (1.00)
- Technology: