Language Model Behavioral Phases are Consistent Across Architecture, Training Data, and Scale
