Language-Independent Sentiment Labelling with Distant Supervision: A Case Study for English, Sepedi and Setswana
Mabokela, Koena Ronny, Schlippe, Tim, Raborife, Mpho, Celik, Turgay
Sentiment analysis automatically analyses opinions and emotions on various topics in areas such as AI for Social Good, AI in Education, and marketing. While most sentiment analysis systems are developed for English, many African languages are classified as low-resource languages due to the lack of digital language resources such as text labelled with corresponding sentiment classes. One reason is that manually labelling text data is time-consuming and expensive. Consequently, automatic and rapid processes are needed to reduce the manual effort and make the labelling process as efficient as possible. In this paper, we present and analyse an automatic, language-independent sentiment labelling method that leverages information from sentiment-bearing emojis and words. Our experiments are conducted with English, Sepedi and Setswana tweets from SAfriSenti, a multilingual sentiment corpus for South African languages. We show that our sentiment labelling approach labels the English tweets with an accuracy of 66%, the Sepedi tweets with 69%, and the Setswana tweets with 63%, so that on average only 34% of the automatically generated labels remain to be corrected.
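The core idea of labelling from sentiment-bearing emojis and words can be sketched as follows. This is a minimal, hypothetical illustration: the tiny lexicons and the simple vote-counting rule are assumptions for clarity, not the lexicons or exact procedure used for SAfriSenti.

```python
# Hedged sketch of distant-supervision sentiment labelling from emoji and
# word cues. The lexicons below are toy examples; in practice, per-language
# word lists and a full emoji sentiment lexicon would be used.
POSITIVE_EMOJIS = {"\U0001F600", "\U0001F60D", "\U0001F44D"}   # 😀 😍 👍
NEGATIVE_EMOJIS = {"\U0001F622", "\U0001F621", "\U0001F494"}   # 😢 😡 💔
POSITIVE_WORDS = {"love", "great"}
NEGATIVE_WORDS = {"hate", "terrible"}

def weak_label(tweet: str) -> str:
    """Assign a noisy ('weak') sentiment label by counting cues."""
    score = 0
    for ch in tweet:                       # emoji cues (single-codepoint)
        score += (ch in POSITIVE_EMOJIS) - (ch in NEGATIVE_EMOJIS)
    for tok in tweet.lower().split():      # word cues
        score += (tok in POSITIVE_WORDS) - (tok in NEGATIVE_WORDS)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(weak_label("I love this \U0001F600"))   # positive
```

Labels produced this way are noisy, which is why the abstract frames them as a starting point that annotators then correct rather than as gold labels.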
A Computer Science Professor Invented the Emoticon After a Joke Went Wrong
In 1982, Carnegie Mellon University professor Scott Fahlman suggested using :-) for humorous comments after his colleagues took a joke about mercury seriously. On September 19, 1982, Carnegie Mellon University computer science research assistant professor Scott Fahlman posted a message to the university's bulletin board software that would later come to shape how people communicate online. His proposal: use :-) and :-( as markers to distinguish jokes from serious comments. While Fahlman describes himself as "the inventor or at least one of the inventors" of what would later be called the smiley face emoticon, the full story reveals something more interesting than a lone genius moment. The whole episode started three days earlier when computer scientist Neil Swartz posed a physics problem to colleagues on Carnegie Mellon's "bboard," which was an early online message board.
EMODIS: A Benchmark for Context-Dependent Emoji Disambiguation in Large Language Models
Huang, Jiacheng, Yu, Ning, Yi, Xiaoyin
Large language models (LLMs) are increasingly deployed in real-world communication settings, yet their ability to resolve context-dependent ambiguity remains underexplored. In this work, we present EMODIS, a new benchmark for evaluating LLMs' capacity to interpret ambiguous emoji expressions under minimal but contrastive textual contexts. Each instance in EMODIS comprises an ambiguous sentence containing an emoji, two distinct disambiguating contexts that lead to divergent interpretations, and a specific question that requires contextual reasoning. We evaluate both open-source and API-based LLMs, and find that even the strongest models frequently fail to distinguish meanings when only subtle contextual cues are present. Further analysis reveals systematic biases toward dominant interpretations and limited sensitivity to pragmatic contrast. EMODIS provides a rigorous testbed for assessing contextual disambiguation, and highlights the gap in semantic reasoning between humans and LLMs.
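The instance structure the abstract describes (ambiguous sentence with an emoji, two contrastive contexts, and a question) can be illustrated as below. The field names and the example content are assumptions for illustration, not the released EMODIS schema.

```python
# Illustrative shape of one EMODIS-style instance; all names and content
# here are hypothetical, sketched from the abstract's description.
instance = {
    "ambiguous_sentence": "That presentation was something else \U0001F525",
    "context_a": "The slides crashed twice and the demo failed.",   # sarcastic reading
    "context_b": "The audience gave a standing ovation.",           # enthusiastic reading
    "question": "Does the fire emoji express praise or criticism here?",
}

def build_prompt(inst: dict, context_key: str) -> str:
    """Pair the ambiguous sentence with one of the two contrastive contexts."""
    return f"{inst[context_key]} {inst['ambiguous_sentence']}\n{inst['question']}"

print(build_prompt(instance, "context_a"))
print(build_prompt(instance, "context_b"))
```

Evaluating a model on both prompts and checking whether its two answers diverge as the contexts do is one natural way to score contextual disambiguation.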
How Do VLAs Effectively Inherit from VLMs?
Zhang, Chuheng, Yang, Rushuai, Chen, Xiaoyu, Wang, Kaixin, Zhao, Li, Chen, Yi, Bian, Jiang
Vision-language-action (VLA) models hold the promise to attain generalizable embodied control. To achieve this, a pervasive paradigm is to leverage the rich vision-semantic priors of large vision-language models (VLMs). However, the fundamental question persists: How do VLAs effectively inherit the prior knowledge from VLMs? To address this critical question, we introduce a diagnostic benchmark, GrinningFace, an emoji tabletop manipulation task where the robot arm is asked to place objects onto printed emojis corresponding to language instructions. This task design is particularly revealing -- knowledge associated with emojis is ubiquitous in Internet-scale datasets used for VLM pre-training, yet emojis themselves are largely absent from standard robotics datasets. Consequently, they provide a clean proxy: successful task completion indicates effective transfer of VLM priors to embodied control. We implement this diagnostic task in both simulated environment and a real robot, and compare various promising techniques for knowledge transfer. Specifically, we investigate the effects of parameter-efficient fine-tuning, VLM freezing, co-training, predicting discretized actions, and predicting latent actions. Through systematic evaluation, our work not only demonstrates the critical importance of preserving VLM priors for the generalization of VLA but also establishes guidelines for future research in developing truly generalizable embodied AI systems.
Semantic Journeys: Quantifying Change in Emoji Meaning from 2012-2018
Robertson, Alexander, Liza, Farhana Ferdousi, Nguyen, Dong, McGillivray, Barbara, Hale, Scott A.
The semantics of emoji has, to date, been considered from a static perspective. We offer the first longitudinal study of how emoji semantics changes over time, applying techniques from computational linguistics to six years of Twitter data. We identify five patterns in emoji semantic development and find evidence that the less abstract an emoji is, the more likely it is to undergo semantic change. In addition, we analyse select emoji in more detail, examining the effect of seasonality and world events on emoji semantics. To aid future work on emoji and semantics, we make our data publicly available along with a web-based interface that anyone can use to explore semantic change in emoji.
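The abstract does not specify its exact method, but a common way to quantify semantic change is to compare an emoji's embedding vectors built from different time slices; a larger cosine distance between, say, the 2012 and 2018 vectors suggests more semantic drift. A minimal sketch with toy vectors:

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity; 0 means identical direction, larger means more drift."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

# Toy embeddings of the same emoji trained on tweets from two years
# (illustrative numbers, not real data).
emb_2012 = [0.9, 0.1, 0.0]
emb_2018 = [0.2, 0.8, 0.1]

print(round(cosine_distance(emb_2012, emb_2018), 3))
```

Tracking this distance year over year, rather than only between endpoints, is what makes the study longitudinal.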
When Smiley Turns Hostile: Interpreting How Emojis Trigger LLMs' Toxicity
Cui, Shiyao, Feng, Xijia, Wang, Yingkang, Yang, Junxiao, Zhang, Zhexin, Sikdar, Biplab, Wang, Hongning, Qiu, Han, Huang, Minlie
Emojis are globally used non-verbal cues in digital communication, and extensive research has examined how large language models (LLMs) understand and utilize emojis across contexts. While usually associated with friendliness or playfulness, it is observed that emojis may trigger toxic content generation in LLMs. Motivated by such an observation, we aim to investigate: (1) whether emojis can clearly enhance toxicity generation in LLMs and (2) how to interpret this phenomenon. We begin with a comprehensive exploration of emoji-triggered LLM toxicity generation by automating the construction of prompts with emojis to subtly express toxic intent. Experiments across 5 mainstream languages on 7 widely used LLMs along with jailbreak tasks demonstrate that prompts with emojis could easily induce toxicity generation. To understand this phenomenon, we conduct model-level interpretations spanning semantic cognition, sequence generation and tokenization, suggesting that emojis can act as a heterogeneous semantic channel to bypass the safety mechanisms. To pursue deeper insights, we further probe the pre-training corpus and uncover a potential correlation between emoji-related data pollution and the toxicity generation behaviors. Supplementary materials provide our implementation code and data.
7 brand-new emojis are coming to your keyboard soon
These new symbols are expected to start appearing in emoji keyboards towards the end of 2025. Yesterday, in an announcement post, the Unicode Consortium unveiled the latest version of its standard, with 7 brand-new emojis along with 156 new variants for existing emojis. The new batch of emojis recommended for inclusion in standard emoji keyboards includes: a "distorted face," a "fight cloud" as seen in comics and cartoons, an "orca" or "killer whale," a "hairy creature" reminiscent of Bigfoot, a "trombone," a "landslide," and a "treasure chest." On top of these brand-new emojis, the remaining emoji changes include gender-neutral ballet dancers in different skin tones, as well as more skin tone variants for existing codes like the "people with bunny ears" and "people wrestling" emojis.
Understanding Textual Emotion Through Emoji Prediction
Gordon, Ethan, Kuppa, Nishank, Tummala, Rigved, Anasuri, Sriram
This project explores emoji prediction from short text sequences using four deep learning architectures: a feed-forward network, CNN, transformer, and BERT. Using the TweetEval dataset, we address class imbalance through focal loss and regularization techniques. Results show BERT achieves the highest overall performance due to its pre-training advantage, while CNN demonstrates superior efficacy on rare emoji classes. This research shows the importance of architecture selection and hyperparameter tuning for sentiment-aware emoji prediction, contributing to improved human-computer interaction.
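The class-imbalance remedy mentioned above, focal loss, down-weights the loss on well-classified examples so training focuses on hard (often rare) classes. A minimal single-example sketch; the gamma value is illustrative, and this is the standard formulation rather than the project's exact implementation:

```python
import math

def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal loss for one example: -(1 - p)^gamma * log(p),
    where p is the predicted probability of the true class.
    With gamma = 0 this reduces to ordinary cross-entropy."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

# A confidently correct prediction contributes far less loss than a poor one,
# which shifts gradient signal toward rare emoji classes.
easy = focal_loss(0.9)   # model already right: heavily down-weighted
hard = focal_loss(0.1)   # model badly wrong: nearly full weight
print(easy < hard)       # True
```

In a real training loop this would be applied per example over the softmax output, often with an additional per-class weighting term alpha.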
Reasoning Beyond Labels: Measuring LLM Sentiment in Low-Resource, Culturally Nuanced Contexts
Ochieng, Millicent, Thieme, Anja, Ezeani, Ignatius, Ueno, Risa, Maina, Samuel, Ronen, Keshet, Gonzalez, Javier, O'Neill, Jacki
Sentiment analysis in low-resource, culturally nuanced contexts challenges conventional NLP approaches that assume fixed labels and universal affective expressions. We present a diagnostic framework that treats sentiment as a context-dependent, culturally embedded construct, and evaluate how large language models (LLMs) reason about sentiment in informal, code-mixed WhatsApp messages from Nairobi youth health groups. Using a combination of human-annotated data, sentiment-flipped counterfactuals, and rubric-based explanation evaluation, we probe LLM interpretability, robustness, and alignment with human reasoning. Framing our evaluation through a social-science measurement lens, we operationalize and interrogate LLMs' outputs as an instrument for measuring the abstract concept of sentiment. Our findings reveal significant variation in model reasoning quality, with top-tier LLMs demonstrating interpretive stability, while open models often falter under ambiguity or sentiment shifts. This work highlights the need for culturally sensitive, reasoning-aware AI evaluation in complex, real-world communication.
The Prosody of Emojis
Zhou, Giulio, Lam, Tsz Kin, Birch, Alexandra, Haddow, Barry
Prosodic features such as pitch, timing, and intonation are central to spoken communication, conveying emotion, intent, and discourse structure. In text-based settings, where these cues are absent, emojis act as visual surrogates that add affective and pragmatic nuance. This study examines how emojis influence prosodic realisation in speech and how listeners interpret prosodic cues to recover emoji meanings. Unlike previous work, we directly link prosody and emoji by analysing actual human speech data, collected through structured but open-ended production and perception tasks. This provides empirical evidence of how emoji semantics shape spoken delivery and perception. Results show that speakers adapt their prosody based on emoji cues, listeners can often identify the intended emoji from prosodic variation alone, and greater semantic differences between emojis correspond to increased prosodic divergence. These findings suggest that emojis can act as meaningful carriers of prosodic intent, offering insight into their communicative role in digitally mediated contexts.