Spark Summit West 2016 was held on June 7th-8th in San Francisco. While much of it focused on what is coming in Apache Spark 2.0, there was still plenty of useful information about the current Spark 1.6 release and about ML/DL. My team at Salesforce also presented "A graph-based method for cross-entity threat detection" at the summit; check out the slides and video to learn more. Most of the sessions I attended were in the developer track, and here are some notes from those.
Today is Day 2 of the three-day Spark Summit event in San Francisco. As I reported yesterday, MapR and Microsoft have already made Spark distribution-related announcements timed for the event. Today it's IBM's turn: the company has announced a new Spark development environment. And, going back to yesterday, there were Spark connector announcements from Couchbase and Snowflake Computing that I wasn't able to cover.
IBM, Spark and R
IBM, which, you may recall, made a splashy announcement of a $300M investment in Spark support at last year's Spark Summit, today announced a major software deliverable from that initiative.
It shouldn't be surprising given the media spotlight on artificial intelligence, but AI will be all over the keynote and session schedule for this year's Spark Summit. The irony, of course, is that while Spark has become known as a workhorse for data engineering workloads, its original claim to fame was that it put machine learning on the same engine as SQL, streaming, and graph. But Spark has also had its share of impedance mismatch issues, such as the difficulty of making R and Python programs first-class citizens, or of adapting to the more compute-intensive processing that AI models demand. Of course, that hasn't stopped adventurous souls from breaking new ground. Hold those thoughts for a moment.