When I first discovered and started using Apache Spark, a majority of the use cases I used it for involved unstructured text.