The Initialization Determines Whether In-Context Learning Is Gradient Descent

Open in new window