SparkR (R on Spark) - Spark 1.6.0 Documentation
SparkR is an R package that provides a lightweight frontend for using Apache Spark from R. In Spark 1.6.0, SparkR provides a distributed data frame implementation that supports operations such as selection, filtering, and aggregation (similar to R data frames and dplyr), but on large datasets. SparkR also supports distributed machine learning using MLlib. A DataFrame is a distributed collection of data organized into named columns. It is conceptually equivalent to a table in a relational database or a data frame in R, but with richer optimizations under the hood.
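As a minimal sketch of the selection, filtering, and aggregation operations mentioned above, the following uses the Spark 1.6-era SparkR API on R's built-in `faithful` data set. It assumes a working local Spark installation with the SparkR package on the library path; the `local[*]` master, the application name, and the variable names are illustrative choices, not part of the original page.

```r
library(SparkR)

# Start a local Spark context and the SQLContext that the 1.6 API expects.
# (Assumption: Spark is installed locally; "local[*]" uses all local cores.)
sc <- sparkR.init(master = "local[*]", appName = "sparkr-sketch")
sqlContext <- sparkRSQL.init(sc)

# Turn R's built-in `faithful` data set into a distributed DataFrame.
df <- createDataFrame(sqlContext, faithful)

# Selection: keep only the `eruptions` column.
eruptions <- select(df, df$eruptions)

# Filtering: keep rows with a waiting time under 50 minutes.
short_waits <- filter(df, df$waiting < 50)

# Aggregation: count how many rows share each waiting time.
waiting_counts <- summarize(groupBy(df, df$waiting), count = n(df$waiting))

# collect() brings the (small) result back as an ordinary R data frame.
local_counts <- collect(waiting_counts)
head(local_counts)

sparkR.stop()
```

The transformations (`select`, `filter`, `summarize`) are evaluated lazily on the cluster; only `collect()` and `head()` pull data back into the local R session, so they should be reserved for results small enough to fit in memory.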
Mar-23-2016, 23:20:35 GMT