The power of machine learning in Spark

One of the major differentiators between Apache Spark and the prior generation of Apache Hadoop–based and MapReduce-based technologies is Spark's built-in machine learning library, MLlib. The motivation behind including these capabilities is to make practical machine learning scalable and understandable for data engineers and data scientists. MLlib also leverages Spark's distributed, in-memory execution model to yield significant performance benefits over preceding technologies such as R and Apache Mahout. While the capabilities in MLlib are powerful in the abstract, one still needs to identify a practical application, implement a technical solution, and productionize the analysis for its downstream consumers. As I discussed in the post "Spark: The operating system for big data analytics," Spark makes implementing and productionizing advanced data analysis significantly less challenging than it is with those earlier technologies.
