Text mining is an exciting topic. There is tremendous potential to gain insights from textual analysis. See, for example, Gentzkow, Kelly and Taddy's Text as Data. While text mining may be quite advanced in other fields, in finance and economics the application of these techniques is still in its infancy. To take advantage of text as data, economists and financial analysts need tools to help them.
I have been meaning to get into quantitative text analysis for a while. I initially planned this post to feature a different package that I wanted to showcase; however, I ran into some problems with their .json. The great thing about doing data science with R is that there are multiple avenues leading you to the same destination, so let's take advantage of that.
I saw Simon Jackson's recent blog post on ordering categories within facets. He proposed a way to order categories that are shared across facets independently within each facet. The problem shows up in text analysis, where the same words appear in several facets but rank differently by frequency or magnitude in each one. Simon has provided a working approach, but it feels awkward: you convert the factors to numbers for plotting, adjust the spacing, and then put the labels back on. I believe that there is a fairly straightforward tidy approach to deal with this problem.
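To make the within-facet ordering problem concrete, here is a minimal sketch of one possible workaround. It is not necessarily the approach either post settles on. The data frame `word_counts`, its values, the `___` separator, and the helper `strip_suffix` are all made up for illustration. The idea is to build a key that is unique per word and facet, order that key by count, and strip the facet suffix from the axis labels. It assumes ggplot2 3.3.0 or later, which accepts a discrete variable on the y axis of `geom_col()`.

```r
library(ggplot2)

# Hypothetical toy data: counts of three words in two made-up documents.
word_counts <- data.frame(
  document = rep(c("doc_a", "doc_b"), each = 3),
  word     = rep(c("rate", "growth", "risk"), times = 2),
  n        = c(10, 4, 7, 2, 9, 5),
  stringsAsFactors = FALSE
)

# Build a key that is unique per (word, document) pair, then order its
# factor levels by count. Because no key is shared between facets,
# each facet ends up sorted on its own.
word_counts$key <- paste(word_counts$word, word_counts$document, sep = "___")
word_counts$key <- reorder(word_counts$key, word_counts$n)

# Drop the "___document" suffix so the axis shows only the word.
strip_suffix <- function(x) sub("___.*$", "", x)

p <- ggplot(word_counts, aes(x = n, y = key)) +
  geom_col() +
  facet_wrap(~ document, scales = "free_y") +
  scale_y_discrete(labels = strip_suffix) +
  labs(x = "count", y = NULL)

print(p)
```

The `scales = "free_y"` argument matters here: without it every facet would show all six keys, including the ones that belong to the other document. The tidytext package offers `reorder_within()` and `scale_y_reordered()`, which wrap up the same idea.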
The link to the session that Meetup will send you is a dummy link. For security reasons, I will send the real link to registered attendees at the last minute. Please do not share that link with others. Abstract: Visual representations of data inform how machine learning practitioners think, understand, and decide. Before charts are ever used for outward communication about an ML system, they are used by the system designers and operators themselves as a tool to make better modeling choices.