Email Insights from Data Science -- Part 2
A detailed method for extracting sentiment and alignment information from corporate email content. Part 3 -- Shows a method of unsupervised-to-supervised feature extraction.

In Part 1 of this series I demonstrated a method for extracting email contents from a proprietary repository in preparation for analysis and further data exploration. In this part I will focus on analyzing and rating the extracted information to determine whether it is usable for building a supervised modeling dataset. The data retrieved from the Enron repository is still in raw form, although it is mostly clean and filtered. This means the dataset is unstructured and not yet focused on the tasks we are trying to solve. Since the goals are to classify email contents by overall company sentiment (negative/positive) and by alignment with company objectives, I'll need to transform the unstructured text into a supervised dataset that can be used to train a recurrent network.
Oct-21-2021, 19:55:07 GMT
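To make the target of that transformation concrete, here is a minimal sketch of what a supervised record could look like once an email has been rated. The `LabeledEmail` class, the `to_supervised` helper, the 0/1 label encoding, and the sample texts are all assumptions made for illustration; they are not taken from the series' actual code or from the Enron data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LabeledEmail:
    """One email body plus the two target labels (hypothetical schema)."""
    text: str
    sentiment: Optional[int] = None  # assumed encoding: 1 = positive, 0 = negative, None = unrated
    alignment: Optional[int] = None  # assumed encoding: 1 = aligned, 0 = not aligned, None = unrated


def to_supervised(emails: List[LabeledEmail]) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Keep only fully rated emails and split them into texts and label pairs."""
    rated = [e for e in emails if e.sentiment is not None and e.alignment is not None]
    texts = [e.text for e in rated]
    labels = [(e.sentiment, e.alignment) for e in rated]
    return texts, labels


# Placeholder texts, not real email content.
emails = [
    LabeledEmail("placeholder email A", sentiment=1, alignment=1),
    LabeledEmail("placeholder email B"),  # not yet rated, so it is dropped
    LabeledEmail("placeholder email C", sentiment=0, alignment=0),
]

texts, labels = to_supervised(emails)
```

Keeping unrated emails in the same structure with `None` labels is one possible design choice: it lets the rating step fill labels in gradually, while only fully rated records are passed on to training.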