
The assumption that data are independently and identically distributed (IID) is staple in statistical machine learning. It suggests that a hypothesis selected by an algorithm, after observing several training samples, should perform effectively on test samples from the same unknown distribution.