Predicting Breast Cancer Using Apache Spark Machine Learning Logistic Regression
Then we use another map transformation, which will apply the ParseObs function to transform each Array of Double in the RDD into an Array of Cancer Observation objects. The toDF() method transforms the RDD of Array[[Cancer Observation]] into a Dataframe with the Cancer Observation class schema. Below the data is split into a training data set and a test data set, 70% of the data is used to train the model, and 30% will be used for testing. In this blog post, we showed you how to get started using Apache Spark's machine learning Logistic Regression for classification.
Oct-19-2016, 21:21:22 GMT
- Industry:
- Health & Medicine > Therapeutic Area
- Oncology (0.42)
- Obstetrics/Gynecology (0.42)
- Genetic Disease (0.42)
- Health & Medicine > Therapeutic Area
- Technology: