Predicting Breast Cancer Using Apache Spark Machine Learning Logistic Regression

In this blog post, I'll help you get started using Apache Spark's spark.ml logistic regression to predict whether a breast cancer tissue observation is malignant.

Classification is a family of supervised machine learning algorithms that identify which category an item belongs to (for example, whether a cancer tissue observation is malignant or not), based on labeled examples of known items (for example, observations known to be malignant or not). Classification takes a set of data with known labels and pre-determined features and learns how to label new records based on that information. Features are the "if questions" that you ask. The label is the answer to those questions.
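To make the features-and-labels idea concrete, here is a minimal sketch in plain Python. It is not the spark.ml API and not the optimizer Spark uses; it is a small hand-rolled logistic regression trained with batch gradient descent. The two numeric features, their values, and the helper names (`train`, `predict_proba`, `classify`) are all hypothetical and invented for illustration. The point is only to show labeled examples going in and a label for a new record coming out.

```python
import math

# Hypothetical toy observations: (features, label).
# The feature values are invented for illustration only.
# Label 1.0 = malignant, 0.0 = not malignant.
labeled = [
    ([1.0, 1.0], 0.0),
    ([2.0, 1.0], 0.0),
    ([1.0, 2.0], 0.0),
    ([3.0, 2.0], 0.0),
    ([8.0, 7.0], 1.0),
    ([9.0, 8.0], 1.0),
    ([7.0, 9.0], 1.0),
    ([10.0, 10.0], 1.0),
]


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Estimated probability that the observation has label 1.0."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(data, learning_rate=0.05, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in data:
            # Difference between the predicted probability and the true label
            error = predict_proba(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        # Step against the average gradient
        weights = [w - learning_rate * g / len(data)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias


def classify(weights, bias, features, threshold=0.5):
    """Turn the probability into a label using a cutoff."""
    return 1.0 if predict_proba(weights, bias, features) >= threshold else 0.0


# Learn from the labeled examples, then label new, unseen records.
weights, bias = train(labeled)
print(classify(weights, bias, [2.0, 2.0]))
print(classify(weights, bias, [9.0, 9.0]))
```

The labeled list plays the role of the known observations, each feature vector holds the answers to the "if questions", and `classify` produces the label for a record the model has not seen before.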