Discovering Basic Emotion Sets via Semantic Clustering on a Twitter Corpus
arXiv.org Artificial Intelligence
A plethora of words are used to describe the spectrum of human emotions, but how many emotions are there really, and how do they interact? Over the past few decades, several theories of emotion have been proposed, each built around the existence of a set of 'basic emotions', and each supported by a wide variety of research, including studies in facial expression, ethology, neurology and physiology. Here we present research based on the theory that people transmit their understanding of emotions through the language they use around emotion keywords. Using a labelled corpus of over 21,000 tweets, six of the basic emotion sets proposed in the existing literature were analysed using Latent Semantic Clustering (LSC), evaluating the distinctiveness of the semantic meaning attached to each emotional label. We hypothesise that the more distinct the language used to express a certain emotion, the more distinct the perception (including proprioception) of that emotion, and thus the more 'basic' the emotion is. This allows us to select the dimensions that best represent the entire spectrum of emotion. We find that Ekman's set, arguably the most frequently used for classifying emotions, is in fact the most semantically distinct overall. Next, taking all analysed (that is, previously proposed) emotion terms into account, we determine the optimal semantically irreducible basic emotion set using an iterative LSC algorithm. Our newly derived set (Accepting, Ashamed, Contempt, Interested, Joyful, Pleased, Sleepy, Stressed) yields a 6.1% increase in distinctiveness over Ekman's set (Angry, Disgusted, Joyful, Sad, Scared). We also demonstrate how LSC data can help visualise emotions. We introduce the concept of an Emotion Profile and briefly analyse compound emotions both visually and mathematically.
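The idea of scoring an emotion set by how distinct its associated language is can be illustrated with a minimal sketch. This is not the paper's LSC method, whose details the abstract does not give: here plain bag-of-words centroids and cosine similarity stand in for it, and the toy tweets, the function names and the scoring formula (one minus the mean pairwise similarity) are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus of (emotion label, tweet text); not data from the paper.
TOY_TWEETS = [
    ("joyful", "what a great sunny day with friends"),
    ("joyful", "great news today smiling all day"),
    ("sad", "crying alone tonight missing you"),
    ("sad", "missing home and crying again"),
    ("angry", "stuck in traffic again so furious"),
    ("angry", "furious about the rude driver in traffic"),
]


def label_centroids(tweets):
    """Sum word counts per label, giving one bag-of-words vector per emotion."""
    centroids = {}
    for label, text in tweets:
        centroids.setdefault(label, Counter()).update(text.lower().split())
    return centroids


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def distinctiveness(labels, centroids):
    """Assumed score: 1 minus the mean pairwise cosine similarity of the labels."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 0.0
    mean_sim = sum(cosine(centroids[a], centroids[b]) for a, b in pairs) / len(pairs)
    return 1.0 - mean_sim


centroids = label_centroids(TOY_TWEETS)
score = distinctiveness(["joyful", "sad", "angry"], centroids)
print(f"distinctiveness: {score:.3f}")
```

Under these assumptions, a candidate emotion set scores higher when its labels share less vocabulary. Comparing the scores of two candidate sets mirrors, in a very reduced form, the comparison the abstract reports between Ekman's set and the newly derived one.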
Dec-28-2012
- Country:
  - Asia (0.92)
  - Europe (0.92)
  - North America > United States
    - California > San Francisco County > San Francisco (0.14)
- Industry:
  - Health & Medicine > Therapeutic Area
    - Neurology (1.00)
    - Psychiatry/Psychology > Mental Health (0.46)
  - Information Technology > Services (0.92)
- Technology:
  - Information Technology
    - Artificial Intelligence
      - Cognitive Science > Emotion (1.00)
      - Machine Learning > Statistical Learning (1.00)
      - Natural Language
        - Discourse & Dialogue (0.92)
        - Information Extraction (1.00)
        - Text Processing (1.00)
      - Representation & Reasoning (1.00)
    - Communications > Social Media (1.00)
    - Data Science > Data Mining (1.00)