Data Quality May Be All You Need

Communications of the ACM 

History has a lesson for the development of artificial intelligence (AI): when in doubt, make it bigger. In his 2019 essay "The Bitter Lesson," computer scientist Rich Sutton argued that over its 70-year history, AI has succeeded when it has exploited available computing power. A series of papers analyzing deep learning performance over the past decade has confirmed the powerful effects of scaling up model size. This process accelerated in the wake of Google's development of the Transformer architecture, the foundation of large language models (LLMs) such as BERT. Model size, measured by the number of stored neural weights, ballooned in just five years: from BERT's 340 million parameters, today's largest implementations, known as frontier models, such as OpenAI's GPT-4, have pushed beyond a trillion.
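One widely cited analysis in this vein, OpenAI's "Scaling Laws for Neural Language Models" (Kaplan et al., 2020), found that test loss falls as a smooth power law in the number of model parameters. A sketch of that relationship, using the paper's approximate fitted constants for non-embedding parameter count N:

    L(N) ≈ (N_c / N)^(α_N),  with α_N ≈ 0.076 and N_c ≈ 8.8 × 10^13

The small exponent means each constant-factor drop in loss demands a roughly order-of-magnitude increase in parameters, which is what drove the race toward ever-larger models.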