Using Transfer Learning for NLP with Small Data