Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression

Neural Information Processing Systems 

Transformers learn to approximate second-order optimization methods for in-context learning (ICL).
