wordplay
Welcome to the Slopverse
Bill Lowery, a sales executive, is confused when a workmate asks where he should take a date out for dinosaur. "You're planning to take this girl out for dinosaur?" "That's right," the colleague responds, totally nonchalant. Lowery presses him, agitated: "Wait a minute. What is this, some sort of new-wave expression or something--saying dinosaur instead of dinner?" "He's so pale and awfully congested--and he didn't touch his dinosaur when I took it in to him."
Pun Unintended: LLMs and the Illusion of Humor Understanding
Zangari, Alessandro, Marcuzzo, Matteo, Albarelli, Andrea, Pilehvar, Mohammad Taher, Camacho-Collados, Jose
Puns are a form of humorous wordplay that exploits polysemy and phonetic similarity. While LLMs have shown promise in detecting puns, we show in this paper that their understanding often remains shallow, lacking the nuanced grasp typical of human interpretation. By systematically analyzing and reformulating existing pun benchmarks, we demonstrate how subtle changes in puns are sufficient to mislead LLMs. Our contributions include comprehensive and nuanced pun detection benchmarks, human evaluation across recent LLMs, and an analysis of the robustness challenges these models face in processing puns.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- North America > United States > Florida > Miami-Dade County > Miami (0.14)
- Europe > Switzerland (0.04)
- (9 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
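The reformulation idea this abstract describes--minimally editing a pun so the wordplay disappears--can be sketched as a simple perturbation step. This is an illustrative sketch, not the authors' benchmark code; the function name and example pair are ours:

```python
def depun(sentence: str, pun_word: str, replacement: str) -> str:
    """Replace the pun-carrying word with a single-sense paraphrase,
    producing a minimally edited non-pun control sentence."""
    return sentence.replace(pun_word, replacement, 1)

# The humor hinges on "interest" (curiosity vs. bank interest);
# swapping it out removes the pun while barely changing the surface form.
pun = "I used to be a banker, but I lost interest."
control = depun(pun, "interest", "my job")
```

A detector with genuine understanding should label `pun` as a pun and `control` as not one; the paper's finding is that such minimally edited pairs are often enough to mislead LLMs.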
Pun Intended: Multi-Agent Translation of Wordplay with Contrastive Learning and Phonetic-Semantic Embeddings
Taylor, Russell, Herbert, Benjamin, Sana, Michael
Translating wordplay across languages presents unique challenges that have long confounded both professional human translators and machine translation systems. This research proposes a novel approach for translating puns from English to French by combining state-of-the-art large language models with specialized techniques for wordplay generation. Our methodology employs a three-stage approach. First, we establish a baseline using multiple frontier large language models with feedback based on a new contrastive learning dataset. Second, we implement a guided chain-of-thought pipeline with combined phonetic-semantic embeddings. Third, we implement a multi-agent generator-discriminator framework for evaluating and regenerating puns with feedback. Moving beyond the limitations of literal translation, our methodology's primary objective is to capture the linguistic creativity and humor of the source text wordplay, rather than simply duplicating its vocabulary. Our best runs earned first and second place in the CLEF JOKER 2025 Task 2 competition where they were evaluated manually by expert native French speakers. This research addresses a gap between translation studies and computational linguistics by implementing linguistically-informed techniques for wordplay translation, advancing our understanding of how language models can be leveraged to handle the complex interplay between semantic ambiguity, phonetic similarity, and the implicit cultural and linguistic awareness needed for successful humor.
- North America > United States > Georgia > Fulton County > Atlanta (0.14)
- Europe > Spain > Community of Madrid > Madrid (0.04)
- Research Report > Promising Solution (0.48)
- Overview > Innovation (0.48)
- Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
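The generator-discriminator loop with phonetic-semantic scoring can be sketched in miniature. The paper uses learned phonetic-semantic embeddings and LLM agents; here a crude surface-similarity ratio stands in for the phonetic embedding, and a keyword check stands in for the semantic side (all names and the example are ours):

```python
from difflib import SequenceMatcher

def phonetic_sim(a: str, b: str) -> float:
    # Crude stand-in for a phonetic embedding: surface similarity
    # of the two lowercased strings.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def discriminator(candidate: str, target_sound: str, anchor_word: str) -> float:
    # Score a candidate pun: does some word sound like the target,
    # and does the semantic anchor of the joke survive?
    best = max(phonetic_sim(w, target_sound) for w in candidate.split())
    return best + (0.5 if anchor_word in candidate.lower() else 0.0)

def select_best(candidates, target_sound, anchor_word):
    # One pass of the generator-discriminator loop: keep the top-scoring
    # candidate. The real system regenerates with feedback instead.
    return max(candidates, key=lambda c: discriminator(c, target_sound, anchor_word))
```

For example, scoring two candidate renderings of a sound-alike pun on "thyme" prefers the one that actually carries the homophone while keeping the anchor word "flies".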
A Reasoning-Based Approach to Cryptic Crossword Clue Solving
Andrews, Martin, Witteveen, Sam
Cryptic crossword clues are challenging language tasks for which new test sets are released daily by major newspapers on a global basis. Each cryptic clue contains both the definition of the answer to be placed in the crossword grid (in common with regular crosswords), and 'wordplay' that proves that the answer is correct (i.e. a human solver can be confident that an answer is correct without needing crossing words as confirmation). This work describes an LLM-based reasoning system built from open-licensed components that solves cryptic clues by (i) hypothesising answers; (ii) proposing wordplay explanations; and (iii) using a verifier system that operates on codified reasoning steps. Overall, this system establishes a new state-of-the-art performance on the challenging Cryptonite dataset of clues from The Times and The Telegraph newspapers in the UK. Because each proved solution is expressed in Python, interpretable wordplay reasoning for proven answers is available for inspection.
- Europe > United Kingdom (0.34)
- North America > United States (0.28)
- North America > Dominican Republic (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
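The codified Python proofs the abstract mentions can be illustrated with a toy verifier for an anagram clue. The clue and the checker below are our own illustration, not output from the system:

```python
def is_anagram(fodder: str, answer: str) -> bool:
    """A codified wordplay step: the clue's fodder letters,
    rearranged, must spell the answer exactly."""
    return sorted(fodder.replace(" ", "").lower()) == sorted(answer.lower())

# Clue: "Chaperone shredded corset (6)".
# Definition: "Chaperone"; anagram indicator: "shredded";
# fodder: "corset"; hypothesised answer: ESCORT.
assert is_anagram("corset", "ESCORT")
```

Because the proof is executable, a solver can be confident in ESCORT without crossing letters: the wordplay either verifies or it does not.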
KoWit-24: A Richly Annotated Dataset of Wordplay in News Headlines
Baranov, Alexander, Palatkina, Anna, Makovka, Yulia, Braslavski, Pavel
We present KoWit-24, a dataset with fine-grained annotation of wordplay in 2,700 Russian news headlines. KoWit-24 annotations include the presence of wordplay, its type, wordplay anchors, and words/phrases the wordplay refers to. Unlike the majority of existing humor collections of canned jokes, KoWit-24 provides wordplay contexts -- each headline is accompanied by the news lead and summary. The most common type of wordplay in the dataset is the transformation of collocations, idioms, and named entities -- the mechanism that has been underrepresented in previous humor datasets. Our experiments with five LLMs show that there is ample room for improvement in wordplay detection and interpretation tasks. The dataset and evaluation scripts are available at https://github.com/Humor-Research/KoWit-24
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (11 more...)
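The annotation fields the abstract lists--wordplay presence, type, anchors, referenced phrases, plus the lead and summary context--suggest a record shape like the following. The field names here are our guesses for illustration; the actual schema is in the linked repository:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class KoWitRecord:
    # Illustrative schema only; consult the KoWit-24 repository for the real one.
    headline: str
    lead: str                                    # news lead giving context
    has_wordplay: bool
    wordplay_type: Optional[str] = None          # e.g. "collocation transformation"
    anchors: List[str] = field(default_factory=list)      # words carrying the play
    references: List[str] = field(default_factory=list)   # phrase the play alludes to
```

Carrying the lead alongside the headline is what distinguishes this from canned-joke collections: interpreting the wordplay often requires the news context.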
Proving that Cryptic Crossword Clue Answers are Correct
Andrews, Martin, Witteveen, Sam
Cryptic crossword clues are challenging cognitive tasks, for which new test sets are released on a daily basis by multiple international newspapers. Each cryptic clue contains both the definition of the answer to be placed in the crossword grid (in common with regular crosswords), and `wordplay' that proves that the answer is correct (i.e. a human solver can be confident that an answer is correct without needing crossing words to confirm it). Using an existing cryptic wordplay proving framework (operating on Python proofs created by an LLM), we show that it is possible to distinguish between correct answers and almost-correct ones based upon whether the wordplay `works'.
- Europe > Austria > Vienna (0.14)
- Oceania > Australia (0.04)
- North America > United States > New York (0.04)
- (4 more...)
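The correct-versus-almost-correct distinction can be seen with the same style of executable proof: a near-miss candidate matches the length and roughly the definition, but its wordplay fails to verify. This toy clue and filter are ours, not an example from the paper:

```python
def proof_holds(fodder: str, answer: str) -> bool:
    # The wordplay "works" only if the fodder is an exact anagram of the answer.
    return sorted(fodder.replace(" ", "").lower()) == sorted(answer.lower())

# Invented clue: "Wild anger on the stove (5)" -- "Wild" signals an anagram
# of "anger"; the definition is "stove". RANGE is proved; RANGY is a
# five-letter near-miss whose proof fails.
candidates = ["RANGE", "RANGY"]
proved = [c for c in candidates if proof_holds("anger", c)]
```

Filtering hypothesised answers by whether their proof executes is exactly what lets the system separate correct answers from plausible-looking wrong ones.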
Are LLMs Good Cryptic Crossword Solvers?
Sadallah, Abdelrahman "Boda", Kotova, Daria, Kochmar, Ekaterina
Cryptic crosswords are puzzles that rely not only on general knowledge but also on the solver's ability to manipulate language on different levels and deal with various types of wordplay. Previous research suggests that solving such puzzles is a challenge even for modern NLP models. However, the abilities of large language models (LLMs) have not yet been tested on this task. In this paper, we establish the benchmark results for three popular LLMs -- LLaMA2, Mistral, and ChatGPT -- showing that their performance on this task is still far from that of humans.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models
Jentzsch, Sophie, Kersting, Kristian
Humor is a central aspect of human communication that has not been solved for artificial agents so far. Large language models (LLMs) are increasingly able to capture implicit and contextual information. In particular, OpenAI's ChatGPT recently gained immense public attention. The GPT3-based model almost seems to communicate on a human level and can even tell jokes. But is ChatGPT really funny? We put ChatGPT's sense of humor to the test. In a series of exploratory experiments around jokes, i.e., generation, explanation, and detection, we seek to understand ChatGPT's capability to grasp and reproduce human humor. Since the model itself is not accessible, we applied prompt-based experiments. Our empirical evidence indicates that jokes are not hard-coded but mostly also not newly generated by the model. Over 90% of 1,008 generated jokes were the same 25 jokes. The system accurately explains valid jokes but also comes up with fictional explanations for invalid jokes. Joke-typical characteristics can mislead ChatGPT in the classification of jokes. ChatGPT has not solved computational humor yet, but it could be a big leap toward "funny" machines.
- Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Cologne (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)
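The headline finding--over 90% of 1,008 generated jokes were the same 25--reduces to measuring how much of a sample the most frequent outputs cover. A minimal sketch on toy data (not the paper's corpus):

```python
from collections import Counter

def repetition_rate(jokes, top_k):
    """Fraction of samples covered by the top_k most frequent jokes."""
    counts = Counter(j.strip().lower() for j in jokes)
    covered = sum(n for _, n in counts.most_common(top_k))
    return covered / len(jokes)

# Toy sample: 4 of 5 outputs come from just two distinct jokes.
sample = ["Why did the chicken cross the road?"] * 3 + \
         ["Why don't scientists trust atoms?"] + \
         ["What do you call a fake noodle? An impasta."]
rate = repetition_rate(sample, top_k=2)  # 0.8
```

Applied to the paper's numbers, `repetition_rate(generated, top_k=25)` exceeding 0.9 is what supports the claim that the jokes are neither hard-coded nor freshly generated, but a small memorized repertoire.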
GPT-4 is surprisingly good at explaining jokes
Explaining a joke, as E.B. White once wrote, is like dissecting a frog: "the thing dies in the process and the innards are discouraging to any but the purely scientific mind." In fact, GPT-4 -- the large language model released on March 14 by OpenAI -- is surprisingly good at generating detailed explanations of why a joke is funny. And like its predecessor, ChatGPT, the AI can also generate jokes, though its go-to one-liners are simple and seem to have been scraped from the internet's corniest, punniest corners (Why don't scientists trust atoms? Because they make up everything!). GPT-4 seems better at explaining humor than its predecessor.
Witscript 2: A System for Generating Improvised Jokes Without Wordplay
A previous paper presented Witscript, a system for generating conversational jokes that rely on wordplay. This paper extends that work by presenting Witscript 2, which uses a large language model to generate conversational jokes that rely on common sense instead of wordplay. Like Witscript, Witscript 2 is based on joke-writing algorithms created by an expert comedy writer. Human evaluators judged Witscript 2's responses to input sentences to be jokes 46% of the time, compared to 70% of the time for human-written responses. This is evidence that Witscript 2 represents another step toward giving a chatbot a humanlike sense of humor.
- Europe > Switzerland (0.06)
- North America > United States > New York > Westchester County > Rye (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > West Midlands > Birmingham (0.04)