[Introduction to Special Issue] Prediction and its limits


A major challenge in using data to make predictions is distinguishing meaningful signal from noise. The image represents one approach, visually conveying the complexity of the problem by highlighting some links in a network and deleting other possible links, with the highlighted links indicating the more meaningful information. Humans have tried to predict the future since ancient times, when shamans looked for patterns in smoking entrails. As this special section explores, prediction is now a developing science. Essays probe such questions as how to allocate limited resources, whether a country will descend into conflict, and who is likely to win an election or publish a high-impact paper, and consider how standards should develop in this emerging field.

On the Cover-Hart Inequality: What's a Sample of Size One Worth?

Bob predicts a future observation based on a sample of size one. Alice can draw a sample of any size before issuing her prediction. How much better can she do than Bob? Perhaps surprisingly, under a large class of loss functions, which we refer to as the Cover-Hart family, the best Alice can do is to halve Bob's risk. In this sense, half the information in an infinite sample is contained in a sample of size one. The Cover-Hart family is a convex cone that includes metrics and negative definite functions, subject to slight regularity conditions. These results may help explain the small relative differences in empirical performance measures in applied classification and forecasting problems, as well as the success of reasoning and learning by analogy in general, and nearest neighbor techniques in particular.
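The bound can be checked numerically in a toy setting. The sketch below uses 0-1 loss (the discrete metric, hence a member of the Cover-Hart family) with Bernoulli observations: Bob predicts a fresh draw using a single independent sample, while Alice, standing in for the infinite-sample forecaster, knows the distribution and predicts the mode. The model, parameter values, and function names here are illustrative assumptions, not the paper's own setup.

```python
import random

def risks(p, n=200_000, seed=0):
    """Monte Carlo estimate of Bob's and Alice's risks under 0-1 loss.

    Toy model (an assumption for illustration): Y ~ Bernoulli(p).
    Bob predicts a fresh draw Y' by reporting a single independent
    sample Y; Alice knows the distribution and reports the mode,
    which is the optimal point prediction under 0-1 loss.
    """
    rng = random.Random(seed)
    mode = 1 if p > 0.5 else 0
    bob = alice = 0
    for _ in range(n):
        y_new = rng.random() < p      # the observation to be predicted
        y_sample = rng.random() < p   # Bob's single sample
        bob += (y_sample != y_new)
        alice += (mode != y_new)
    return bob / n, alice / n

bob_risk, alice_risk = risks(p=0.3)
# Theory: Bob's risk is 2p(1-p) = 0.42 and Alice's is min(p, 1-p) = 0.30,
# so the Cover-Hart bound bob_risk <= 2 * alice_risk holds with room to spare.
```

Varying `p` shows that Alice never does better than halving Bob's risk: the ratio 2p(1-p) / min(p, 1-p) approaches 2 only as p tends to 0 or 1, matching the statement that half the information in an infinite sample is contained in a sample of size one.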