Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning, by Benjamin Bengfort, Rebecca Bilbro, and Tony Ojeda (ISBN 9781491963043)
In this book, we focus on applied machine learning for text analysis using the Python libraries just described. The applied nature of the book means that we focus not on the academic side of linguistics or statistical models, but on how to effectively deploy models trained on text inside a software application. The model for text analysis we propose is directly related to the machine learning workflow: a search process to find a model, composed of features, an algorithm, and hyperparameters, that operates best on training data to produce estimates on unknown data. This workflow starts with the construction and management of a training dataset, called a corpus in text analysis. We will then explore feature extraction and preprocessing methodologies to represent text as numeric data that machine learning algorithms can work with. With some basic features in hand, we explore techniques for classification and clustering on text, concluding the first few chapters of the book.
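To make the feature extraction step concrete, here is a minimal sketch of one common way to turn text into numeric data: a bag-of-words count vector. It is not taken from the book. The toy corpus and the helper names `tokenize`, `build_vocabulary`, and `vectorize` are hypothetical, and the whitespace tokenizer is deliberately naive.

```python
from collections import Counter

# Toy corpus: hypothetical documents used only for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def vectorize(text, vocabulary):
    """Return token counts aligned with the vocabulary's column order."""
    counts = Counter(tokenize(text))
    return [counts.get(tok, 0) for tok in vocabulary]


vocabulary = build_vocabulary(corpus)
vectors = [vectorize(doc, vocabulary) for doc in corpus]

print(list(vocabulary))  # ['cat', 'chased', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors[0])        # [1, 0, 0, 1, 1, 1, 2]
print(vectors[1])        # [1, 1, 1, 0, 0, 0, 2]
```

Each document becomes a fixed-length list of counts, which is the kind of numeric input a classification or clustering algorithm could consume. A real application would likely rely on a library vectorizer and a proper tokenizer rather than this hand-rolled version.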
Dec-28-2018, 08:05:05 GMT