Data Distributional Properties Drive Emergent In-Context Learning in Transformers