'Small Data' Are Also Crucial for Machine Learning


When people hear "artificial intelligence," many envision "big data." There's a reason for that: Some of the most prominent AI breakthroughs in the past decade have relied on enormous data sets. Image classification made major strides in the 2010s thanks to the development of ImageNet, a data set containing millions of images hand-sorted into thousands of categories. More recently, GPT-3, a language model that uses deep learning to produce humanlike text, benefited from training on hundreds of billions of words of online text. So it is not surprising that AI is tightly linked with "big data" in the popular imagination.