Machine Learning using Spark and R - Dataconomy
R is ubiquitous in the machine learning community. Its ecosystem of more than 8,000 packages makes it the Swiss Army knife of modeling applications. Similarly, Apache Spark has rapidly become the big data platform of choice for data scientists. Its ability to perform calculations relatively quickly (due to features like in-memory caching) makes it ideal for interactive tasks--such as exploratory data analysis. R (SparkR) is the latest addition and support for it certainly lags the other three languages. In Spark 1.x there was no support for accessing the Spark ML (machine learning) libraries from R. The performance of R code on Spark was also considerably worse than could be achieved using, say, Scala.
Aug-28-2017, 01:05:26 GMT
- Technology: