Spark analytics applications boosted by built-in libraries
At last year's Spark Summit conference, Patrick Wendell, a software engineer at Databricks Inc. and a contributor to the Apache Spark open source project, said the technology's data processing capabilities are impressive but its real power lies in the Spark library components that sit on top of the core engine. "The future of Spark is the libraries," he said. "That's what the community has invested in and where the innovation is coming from."

Sure enough, this month's Spark Summit 2015 event prominently featured case studies in which users explained how they're putting the libraries to work in Spark analytics applications.

The Spark platform comes with four distinct libraries -- Spark SQL, Spark Streaming, a graph processing library called GraphX and a machine learning one known as MLlib -- that include pre-built algorithms and programming capabilities designed to streamline data preparation, exploration and analysis tasks. The libraries enable users to automate certain tasks and eliminate some of the coding that typically would be required.
Oct-9-2016, 16:46:04 GMT