Sentiment Analysis with Scikit-learn and GCP


For this project, I wanted to design a model that does a simple classification of whether a phrase is positive or negative. Since I'm only looking for a binary result, I chose scikit-learn's logistic regression module. If you were trying to predict more than two labels, you would need a multiclass setup (which scikit-learn's LogisticRegression also supports) or a different ML model. The data used is a corpus of 5,000 movie reviews: 2,500 positive and 2,500 negative. The model has an accuracy of 90% and probably performs better on text that resembles a review, because that text is more like the training data.
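
To make the idea concrete, here is a minimal sketch of a binary sentiment classifier built on scikit-learn's logistic regression. It is not the exact code behind the model described above: the six-sentence dataset is a hypothetical stand-in for the 5,000-review corpus, the TF-IDF features are an assumption (the feature extraction step isn't specified here), and `predict_sentiment` is just an illustrative helper name.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the real movie-review corpus.
train_texts = [
    "a great movie with wonderful acting",
    "I loved this film, it was fantastic",
    "brilliant story and great performances",
    "a terrible movie with awful acting",
    "I hated this film, it was boring",
    "dull story and terrible performances",
]
train_labels = [1, 1, 1, 0, 0, 0]  # 1 = positive, 0 = negative

# Assumption: TF-IDF word features feeding a logistic regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)


def predict_sentiment(phrase):
    """Return 'positive' or 'negative' for a single phrase (illustrative helper)."""
    label = model.predict([phrase])[0]
    return "positive" if label == 1 else "negative"


print(predict_sentiment("great and wonderful"))
print(predict_sentiment("terrible and awful"))
```

A dataset this small only shows the mechanics. To measure accuracy in a meaningful way, you would train on the full corpus and score the model on a held-out test split.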