L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling