Advancing Text Mining with R and quanteda
The data that we usually use for text analysis is available in text formats (e.g., .txt). After reading in the data, we need to generate a corpus. A corpus is a type of dataset used in text analysis. It contains "a collection of text or speech material that has been brought together according to a certain set of predetermined criteria" (Shmelova et al. 2019, p. 33). These criteria are usually set by the researchers and follow from the guiding research question.
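The two steps described above, reading in text files and generating a corpus, can be sketched as follows. This is a minimal illustration rather than the article's own code: the folder name, file names, example sentences, and the `source` document variable are all hypothetical, and base R's `readLines()` is used for reading so that the sketch depends only on quanteda.

```r
library(quanteda)

# Hypothetical example texts, written to a temporary folder so the sketch is self-contained
tmp_dir <- file.path(tempdir(), "example_texts")
dir.create(tmp_dir, showWarnings = FALSE)
writeLines("Text mining turns documents into data.", file.path(tmp_dir, "doc1.txt"))
writeLines("A corpus collects texts chosen by set criteria.", file.path(tmp_dir, "doc2.txt"))

# Read every .txt file into a named character vector (one element per file)
paths <- list.files(tmp_dir, pattern = "\\.txt$", full.names = TRUE)
texts <- vapply(
  paths,
  function(p) paste(readLines(p, warn = FALSE), collapse = "\n"),
  character(1)
)
names(texts) <- basename(paths)

# Generate the corpus; the vector names become the document names
corp <- corpus(texts)

# Attach a document-level variable (assumed here purely for illustration)
docvars(corp, "source") <- "example"

ndoc(corp)      # number of documents in the corpus
docnames(corp)  # document names taken from the file names
```

In practice, the selection criteria mentioned above determine which files end up in the folder, and document variables such as `source` are one way to record them alongside the texts.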
Nov-12-2019, 16:07:47 GMT