One good thing about being stuck at home during the pandemic is that a person can finally get into the habit of listening to "A Way with Words," a radio show that airs on Friday afternoons on New York's WNYE (91.5 FM; check local listings). The hosts, Martha Barnette and Grant Barrett, are the Click and Clack of word talk. Barnette is a writer who has studied Latin and Greek (her books include "A Garden of Words"), and Barrett is a linguist and lexicographer with an ear for contemporary slang. They make a perfect duo. The show is modelled after "Car Talk," though it is broadcast from San Diego, not Cambridge: the hosts laugh a lot, and when people call in they answer by saying, "You have a way with words," which is always nice to hear.
A variety of tools can help researchers analyze large volumes of written material. In this post, I'll examine two of them: part-of-speech tagging and tone analysis. I'll also show how to use these methods to find patterns in a large set of Facebook posts created by members of Congress. Part-of-speech (POS) tagging is a process that labels each word in a sentence with an algorithm's best guess for the word's part of speech (for example, noun, adjective or verb). That guess is based on both the definition of each word and the context in which it appears.
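To make the "definition plus context" idea concrete, here is a minimal toy sketch in Python. It is not the tagger used for the analysis in this post: the lexicon, the tag names, and the two context rules are all assumptions invented for illustration, and real taggers learn such preferences statistically from annotated text.

```python
# Toy part-of-speech tagger: a lexicon lookup plus two context rules.
# The lexicon, tag names, and rules are illustrative assumptions, not a real tagger.

# Hypothetical lexicon: each word maps to the tags it can take, most likely first.
LEXICON = {
    "the": ["DET"],
    "a": ["DET"],
    "senator": ["NOUN"],
    "will": ["AUX", "NOUN"],
    "post": ["VERB", "NOUN"],
    "update": ["NOUN", "VERB"],
    "new": ["ADJ"],
}


def tag(sentence):
    """Return (word, tag) pairs for a whitespace-separated sentence."""
    tagged = []
    for word in sentence.lower().split():
        # "Definition": look the word up; unknown words default to NOUN.
        candidates = LEXICON.get(word, ["NOUN"])
        choice = candidates[0]
        previous = tagged[-1][1] if tagged else None
        # "Context": after a determiner or adjective, prefer a noun reading.
        if previous in ("DET", "ADJ") and "NOUN" in candidates:
            choice = "NOUN"
        # "Context": after an auxiliary, prefer a verb reading.
        elif previous == "AUX" and "VERB" in candidates:
            choice = "VERB"
        tagged.append((word, choice))
    return tagged


if __name__ == "__main__":
    print(tag("The senator will post a new update"))
    print(tag("The post"))
```

In this sketch the word "post" is tagged as a verb in the first sentence (it follows the auxiliary "will") and as a noun in the second (it follows the determiner "the"), which is the kind of context sensitivity the paragraph above describes.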
One of the main concerns with AI technologies today is the fear that they will propagate the various biases we already have in society. A recent Stanford study turned things around, however, and highlighted how AI can also turn the mirror onto society and shed light on the biases that exist within it. The study used word embeddings to map relationships and associations between words and, through that, to measure the changes in gender and ethnic stereotypes over the last century in the United States. The algorithms were fed text from a large corpus of books, newspapers, and other sources, and the results were compared with official census demographic data and with societal changes such as the women's movement. The researchers used the embeddings to single out specific occupations and adjectives that tended to be biased toward women or ethnic groups in each decade from 1900 to the present day.
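As a rough illustration of how an association between words can be read off an embedding, the Python sketch below compares how close an occupation word sits to "she" versus "he" using cosine similarity. Everything in it is an assumption made for illustration: the three-dimensional vectors are invented, and the difference-of-cosines score is a simple stand-in, since the study's own bias measure is not described here and may differ.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# Real embeddings are learned from text and have hundreds of dimensions.
VECTORS = {
    "he": [1.0, 0.0, 0.2],
    "she": [0.0, 1.0, 0.2],
    "engineer": [0.8, 0.3, 0.5],
    "nurse": [0.2, 0.9, 0.4],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gender_association(word):
    """Positive if the word is closer to 'she', negative if closer to 'he'."""
    vec = VECTORS[word]
    return cosine(vec, VECTORS["she"]) - cosine(vec, VECTORS["he"])


if __name__ == "__main__":
    for occupation in ("engineer", "nurse"):
        print(occupation, round(gender_association(occupation), 3))
```

Computing a score like this separately on embeddings trained on text from each decade would give one way to track how an association shifts over time, which is the general shape of the analysis described above.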
Fast mapping is a phenomenon by which children learn the meanings of novel adjectives after a very small number of exposures when the new word is contrasted with a known word. The present study was a preliminary test of whether machine learners could use such contrasts in unconstrained speech to learn adjective meanings and categories. As a step toward an adjective fast-mapping system for machine learners, six decision tree-based learning methods that use contrasting examples were evaluated. Subjects tended to compare objects using adjectives of the same category, implying that such contrasts may be a useful source of data about adjective meaning, though none of the learning algorithms showed a strong advantage over the others.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the creators of WordNet and do not necessarily reflect the views of any funding agency or Princeton University. When writing a paper or producing a software application, tool, or interface based on WordNet, it is necessary to properly cite the source. Citation figures are critical to WordNet funding. WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept.
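The grouping of words into synsets can be pictured with a small data-structure sketch in Python. This is not the WordNet database format or its API: the entries, glosses, field names, and helper functions below are hypothetical and exist only to show how one word can belong to several synsets, each standing for a distinct concept.

```python
from collections import defaultdict

# Hypothetical mini-database; entries and glosses are illustrative,
# not copied from WordNet.
SYNSETS = [
    {"pos": "noun", "words": {"car", "auto", "automobile"},
     "gloss": "a motor vehicle with four wheels"},
    {"pos": "noun", "words": {"car", "railcar"},
     "gloss": "a wheeled vehicle adapted to rails"},
    {"pos": "adjective", "words": {"large", "big"},
     "gloss": "above average in size"},
]


def build_index(synsets):
    """Map each word to the positions of the synsets that contain it."""
    index = defaultdict(list)
    for position, synset in enumerate(synsets):
        for word in synset["words"]:
            index[word].append(position)
    return index


INDEX = build_index(SYNSETS)


def synonyms(word):
    """Return every word that shares at least one synset with `word`."""
    found = set()
    for position in INDEX.get(word, []):
        found |= SYNSETS[position]["words"]
    found.discard(word)
    return found


if __name__ == "__main__":
    print(len(INDEX["car"]))        # number of concepts "car" can express
    print(sorted(synonyms("car")))
```

Indexing by word rather than by synset keeps lookups cheap, at the cost of storing each synset position once per member word.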