The Case for Scalable, Data-Driven Theory: A Paradigm for Scientific Progress in NLP