laughter
From Punchlines to Predictions: A Metric to Assess LLM Performance in Identifying Humor in Stand-Up Comedy
Romanowski, Adrianna, Valois, Pedro H. V., Fukui, Kazuhiro
Comedy serves as a profound reflection of the times we live in and is a staple element of human interactions. In light of the widespread adoption of Large Language Models (LLMs), the intersection of humor and AI has become no laughing matter. Advancements in the naturalness of human-computer interaction correlate with improvements in AI systems' ability to understand humor. In this study, we assess the ability of models to accurately identify humorous quotes from a stand-up comedy transcript. Stand-up comedy's unique comedic narratives make it an ideal dataset for improving the overall naturalness of comedic understanding. We propose a novel humor detection metric designed to evaluate LLMs across various prompts on their capability to extract humorous punchlines. The metric has a modular structure that offers three different scoring methods -- fuzzy string matching, sentence embedding, and subspace similarity -- to provide an overarching assessment of a model's performance. The models' results are compared against those of human evaluators on the same task. Our metric reveals that regardless of prompt engineering, leading models -- ChatGPT, Claude, and DeepSeek -- achieve scores of at most 51% in humor detection. Notably, this performance surpasses that of humans, who achieve a score of 41%. The analysis of human evaluators and LLMs reveals variability in agreement, highlighting the subjectivity inherent in humor and the complexities involved in extracting humorous quotes from live performance transcripts. Code available at https://github.com/swaggirl9000/humor.
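The abstract does not spell out the metric's internals, but its three scoring modes map onto standard tools. Below is a minimal sketch, not the authors' implementation (that lives in the linked repository): fuzzy matching via Python's difflib, cosine similarity over sentence embeddings, and subspace similarity via principal angles between the two embedding spans. The sentence-transformers backbone and all function names are assumptions.

```python
# Minimal sketch of a modular punchline-scoring metric -- an illustration,
# not the authors' code. Requires: pip install numpy sentence-transformers
from difflib import SequenceMatcher

import numpy as np
from sentence_transformers import SentenceTransformer

_model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding backbone

def fuzzy_score(predicted: list[str], reference: list[str]) -> float:
    # Best fuzzy string match per reference punchline, averaged over references.
    best = [max(SequenceMatcher(None, p, r).ratio() for p in predicted)
            for r in reference]
    return float(np.mean(best))

def embedding_score(predicted: list[str], reference: list[str]) -> float:
    # Best cosine similarity per reference punchline, averaged over references.
    P = _model.encode(predicted, normalize_embeddings=True)
    R = _model.encode(reference, normalize_embeddings=True)
    return float((R @ P.T).max(axis=1).mean())

def subspace_score(predicted: list[str], reference: list[str]) -> float:
    # Compare the subspaces spanned by the two embedding sets: the singular
    # values of Ur^T Up are the cosines of their principal angles.
    Up, _ = np.linalg.qr(_model.encode(predicted).T)  # (dim, n_pred)
    Ur, _ = np.linalg.qr(_model.encode(reference).T)  # (dim, n_ref)
    return float(np.mean(np.linalg.svd(Ur.T @ Up, compute_uv=False)))
```

A composite score could average the three, which is one way to read the metric's "modular structure": each scorer is swappable without changing the evaluation loop.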
Why Are Kids So Funny?
My daughter, Alice, is almost two, and quite funny. Although she can say short sentences--"I need cake!"--her humor isn't particularly verbal. Instead, she giggles while stumbling around in grownup shoes, or blows bubbles in her water when she should be drinking it. She likes to put on a hat, pull it down over her eyes, and then blunder around, arms outstretched, like a mummy. She's also discovered the humor of exaggeration: recently, when her brother resisted getting out of his pajamas in the morning, she sidled up, grabbed his shirt, hauled on it with both hands, and laughed while yelling, "Ooooouuuut!"
Non-Verbal Vocalisations and their Challenges: Emotion, Privacy, Sparseness, and Real Life
Batliner, Anton, Amiriparian, Shahin, Schuller, Björn W.
Non-Verbal Vocalisations (NVVs) are short 'non-word' utterances without proper linguistic (semantic) meaning but conveying connotations -- be it emotions/affects or other paralinguistic information. We start this contribution with a historic sketch: how they were addressed in psychology and linguistics in the last two centuries, how they were neglected later on, and how they came to the fore with the advent of emotion research. We then give an overview of types of NVVs (formal aspects) and functions of NVVs, exemplified with the typical NVV 'ah'. Interesting as they are, NVVs come, however, with a number of challenges that should be accounted for: privacy and general ethical considerations prevent them from being recorded in real-life (private) scenarios to a sufficient extent. Isolated, prompted (acted) exemplars do not necessarily model NVVs in context; yet, this is the preferred strategy so far when modelling NVVs, especially in AI. To overcome these problems, we argue in favour of corpus-based approaches. This guarantees more realistic modelling; however, we are still faced with privacy and sparse-data problems.
Millions of GeAR-s: Extending GraphRAG to Millions of Documents
Shen, Zhili, Diao, Chenxin, Merita, Pascual, Vougiouklis, Pavlos, Pan, Jeff Z.
Recent studies have explored graph-based approaches to retrieval-augmented generation, leveraging structured or semi-structured information -- such as entities and their relations extracted from documents -- to enhance retrieval. However, these methods are typically designed to address specific tasks, such as multi-hop question answering and query-focused summarisation, and therefore there is limited evidence of their general applicability across broader datasets. In this paper, we aim to adapt a state-of-the-art graph-based RAG solution, GeAR, and explore its performance and limitations on the SIGIR 2025 LiveRAG Challenge.
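GeAR's architecture is not detailed in this abstract; the sketch below only illustrates the general graph-based RAG pattern it builds on -- triples extracted from documents form a shared graph, and queries are expanded along multi-hop neighbours before documents are ranked. Everything here (function names, the networkx representation, the scoring rule) is an assumption for illustration.

```python
# Generic sketch of graph-augmented retrieval, not GeAR itself.
# Requires: pip install networkx
import networkx as nx

def build_graph(doc_triples: dict[str, list[tuple[str, str, str]]]) -> nx.Graph:
    """doc_triples maps a document id to its (head, relation, tail) triples."""
    g = nx.Graph()
    for doc_id, triples in doc_triples.items():
        for head, rel, tail in triples:
            g.add_edge(head, tail, relation=rel)
            for entity in (head, tail):
                # Remember which documents mention each entity.
                g.nodes[entity].setdefault("docs", set()).add(doc_id)
    return g

def retrieve(g: nx.Graph, query_entities: list[str], hops: int = 2) -> list[str]:
    # Expand the query entities along multi-hop neighbours, then rank documents
    # by how many expanded entities they mention.
    frontier = {e for e in query_entities if e in g}
    for _ in range(hops):
        frontier |= {n for e in frontier for n in g.neighbors(e)}
    counts: dict[str, int] = {}
    for entity in frontier:
        for doc_id in g.nodes[entity].get("docs", set()):
            counts[doc_id] = counts.get(doc_id, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)
```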
NonverbalTTS: A Public English Corpus of Text-Aligned Nonverbal Vocalizations with Emotion Annotations for Text-to-Speech
Borisov, Maksim, Spirin, Egor, Diatlova, Daria
Current expressive speech synthesis models are constrained by the limited availability of open-source datasets containing diverse nonverbal vocalizations (NVs). In this work, we introduce NonverbalTTS (NVTTS), a 17-hour open-access dataset annotated with 10 types of NVs (e.g., laughter, coughs) and 8 emotional categories. The dataset is derived from popular sources, VoxCeleb and Expresso, using automated detection followed by human validation. We propose a comprehensive pipeline that integrates automatic speech recognition (ASR), NV tagging, emotion classification, and a fusion algorithm to merge transcriptions from multiple annotators. Fine-tuning open-source text-to-speech (TTS) models on the NVTTS dataset achieves parity with closed-source systems such as CosyVoice2, as measured by both human evaluation and automatic metrics, including speaker similarity and NV fidelity. By releasing NVTTS and its accompanying annotation guidelines, we address a key bottleneck in expressive TTS research. The dataset is available at https://huggingface.co/datasets/deepvk/NonverbalTTS.
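As a concrete illustration of one pipeline stage, the sketch below merges per-annotator tags by strict majority vote. The paper's actual fusion algorithm operates on full transcriptions and may be more elaborate; the tag names and the "uncertain" fallback are invented here.

```python
# Toy fusion step: majority vote over per-annotator NV tags, per utterance.
from collections import Counter

def fuse_annotations(per_annotator_tags: list[list[str]]) -> list[str]:
    """per_annotator_tags[i][j] = tag that annotator i gave to utterance j."""
    fused = []
    for utterance_tags in zip(*per_annotator_tags):
        tag, count = Counter(utterance_tags).most_common(1)[0]
        # Keep a tag only when a strict majority of annotators agree.
        fused.append(tag if count > len(utterance_tags) / 2 else "uncertain")
    return fused

# Three annotators labelling four utterances:
print(fuse_annotations([
    ["laughter", "none", "cough", "laughter"],
    ["laughter", "none", "none",  "laughter"],
    ["none",     "none", "cough", "laughter"],
]))  # -> ['laughter', 'none', 'cough', 'laughter']
```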
StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy Videos
Barriere, Valentin, Gomez, Nahuel, Hemamou, Leo, Callejas, Sofia, Ravenet, Brian
Aiming to improve current computational models of humor detection, we propose a new multimodal dataset of stand-up comedies in seven languages: English, French, Spanish, Italian, Portuguese, Hungarian, and Czech. Our dataset of more than 330 hours is, at the time of writing, the biggest and most diverse available for this type of task. The whole dataset is automatically annotated with audience laughter, and the subpart reserved for model validation is manually annotated. Contrary to contemporary approaches, we do not frame the task of humor detection as binary sequence classification, but as word-level sequence labeling, in order to take into account the full context of the sequence and to capture the continuous joke-tagging mechanism typically occurring in natural conversations. Along with unimodal baseline results, we propose a method to enhance automatic laughter detection based on Automatic Speech Recognition errors. Our code and data are available online: https://tinyurl.com/EMNLPHumourStandUpPublic
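To make the word-level framing concrete: instead of asking "is this clip funny?", each word receives a label, here derived from whether audience laughter starts shortly after it. The field names and the one-second window below are assumptions for illustration, not the dataset's actual schema.

```python
# Toy word-level humor labelling from laughter timestamps.

def label_words(words: list[dict], laughter_spans: list[tuple[float, float]],
                window: float = 1.0) -> list[tuple[str, int]]:
    """words: [{'text': str, 'end': float}, ...] with word end-times in seconds.
    A word is tagged 1 when laughter starts within `window` seconds of it."""
    labels = []
    for w in words:
        is_joke = any(w["end"] <= start <= w["end"] + window
                      for start, _ in laughter_spans)
        labels.append((w["text"], int(is_joke)))
    return labels

# Laughter erupts at t=5.2s, right after the second word:
words = [{"text": "setup", "end": 2.0}, {"text": "punchline", "end": 5.0}]
print(label_words(words, [(5.2, 7.0)]))  # -> [('setup', 0), ('punchline', 1)]
```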
Why Do We Laugh? Annotation and Taxonomy Generation for Laughable Contexts in Spontaneous Text Conversation
Inoue, Koji, Elmers, Mikey, Lala, Divesh, Kawahara, Tatsuya
Laughter serves as a multifaceted communicative signal in human interaction, yet its identification within dialogue presents a significant challenge for conversational AI systems. This study addresses this challenge by annotating laughable contexts in Japanese spontaneous text conversation data and developing a taxonomy to classify the underlying reasons for such contexts. Initially, multiple annotators manually labeled laughable contexts using a binary decision (laughable or non-laughable). Subsequently, an LLM was used to generate explanations for the binary annotations of laughable contexts, which were then categorized into a taxonomy comprising ten categories, including "Empathy and Affinity" and "Humor and Surprise," highlighting the diverse range of laughter-inducing scenarios. The study also evaluated GPT-4's performance in recognizing the majority labels of laughable contexts, achieving an F1 score of 43.14%. These findings contribute to the advancement of conversational AI by establishing a foundation for more nuanced recognition and generation of laughter, ultimately fostering more natural and engaging human-AI interactions.
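The reported F1 score amounts to comparing a model's binary "laughable" predictions against the annotators' majority labels. A minimal sketch of that scoring step, with invented labels (the paper's exact evaluation protocol may differ):

```python
# Score binary "laughable" predictions against majority labels with F1.
# Requires: pip install scikit-learn
from sklearn.metrics import f1_score

majority_labels   = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = laughable context
model_predictions = [1, 0, 0, 1, 1, 0, 0, 0]  # e.g., parsed from LLM output

print(f"F1 = {f1_score(majority_labels, model_predictions):.4f}")
```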
Reverse Prompt Engineering
This paper explores a new black-box, zero-shot language model inversion problem and proposes an innovative framework for prompt reconstruction using only text outputs from a language model. Leveraging a large language model alongside an optimization algorithm, the proposed method effectively recovers prompts with minimal resources. Experimental results on several datasets derived from public sources indicate that the proposed approach achieves high-quality prompt recovery and generates prompts more similar to the originals than current state-of-the-art methods. Additionally, the use-case study demonstrates the method's strong potential for generating high-quality text data.
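The abstract does not name the optimization algorithm, so the sketch below only shows the general shape such a black-box inversion loop could take: propose candidate prompts with a helper LLM, query the target model, and keep the candidate whose output best matches the observed text. `llm` and `propose` are hypothetical stand-ins, and the Jaccard similarity is a deliberately crude proxy.

```python
# Hypothetical black-box prompt-recovery loop (illustrative only).

def similarity(a: str, b: str) -> float:
    # Crude token-overlap (Jaccard) proxy for output similarity.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1)

def recover_prompt(observed_output: str, llm, propose, rounds: int = 5) -> str:
    """llm(prompt) -> target-model text; propose(output, seed) -> candidates."""
    best, best_score = "", -1.0
    for _ in range(rounds):
        # Seed the helper LLM with the current best guess to refine candidates.
        for candidate in propose(observed_output, seed=best):
            score = similarity(llm(candidate), observed_output)
            if score > best_score:
                best, best_score = candidate, score
    return best
```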
Analysis and Detection of Differences in Spoken User Behaviors between Autonomous and Wizard-of-Oz Systems
Elmers, Mikey, Inoue, Koji, Lala, Divesh, Ochi, Keiko, Kawahara, Tatsuya
This study examined users' behavioral differences in a large corpus of Japanese human-robot interactions, comparing interactions between a tele-operated robot and an autonomous dialogue system. We analyzed user spoken behaviors in both attentive listening and job interview dialogue scenarios. Results revealed significant differences in metrics such as speech length, speaking rate, fillers, backchannels, disfluencies, and laughter between operator-controlled and autonomous conditions. Furthermore, we developed predictive models to distinguish between operator and autonomous system conditions. Our models demonstrated higher accuracy and precision compared to the baseline model, with several models also achieving a higher F1 score than the baseline.
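As an illustration of the predictive-model step, the toy classifier below separates the two conditions from per-dialogue counts of the listed behaviours. The feature values are invented, and the paper's actual models and features may differ.

```python
# Toy operator-vs-autonomous classifier over behavioural features.
# Requires: pip install numpy scikit-learn
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: speech length (s), speaking rate (syll/s), fillers, backchannels,
# disfluencies, laughter count -- one row per dialogue (values invented).
X = np.array([
    [310.0, 4.1, 12, 30, 5, 3],  # tele-operated (Wizard-of-Oz)
    [295.0, 4.3, 10, 28, 4, 4],  # tele-operated
    [180.0, 3.2, 20, 12, 9, 1],  # autonomous system
    [175.0, 3.0, 22, 10, 8, 0],  # autonomous system
])
y = np.array([1, 1, 0, 0])  # 1 = operator-controlled, 0 = autonomous

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict([[300.0, 4.0, 11, 29, 5, 2]]))  # -> [1], likely operator
```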
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech
Wu, Haibin, Wang, Xiaofei, Eskimez, Sefik Emre, Thakker, Manthan, Tompkins, Daniel, Tsai, Chung-Hsien, Li, Canrun, Xiao, Zhen, Zhao, Sheng, Li, Jinyu, Kanda, Naoyuki
People change their tones of voice, often accompanied by nonverbal vocalizations (NVs) such as laughter and cries, to convey rich emotions. However, most text-to-speech (TTS) systems lack the capability to generate speech with rich emotions, including NVs. This paper introduces EmoCtrl-TTS, an emotion-controllable zero-shot TTS that can generate highly emotional speech with NVs for any speaker. EmoCtrl-TTS leverages arousal and valence values, as well as laughter embeddings, to condition the flow-matching-based zero-shot TTS. To achieve high-quality emotional speech generation, EmoCtrl-TTS is trained using more than 27,000 hours of expressive data curated based on pseudo-labeling. Comprehensive evaluations demonstrate that EmoCtrl-TTS excels in mimicking the emotions of audio prompts in speech-to-speech translation scenarios. We also show that EmoCtrl-TTS can capture emotion changes, express strong emotions, and generate various NVs in zero-shot TTS. See https://aka.ms/emoctrl-tts for demo samples.
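One plausible reading of the conditioning setup, sketched below: per-frame arousal/valence values and laughter embeddings are stacked into a time-varying sequence that conditions the flow-matching decoder. Dimensions and layout here are assumptions, not EmoCtrl-TTS's actual interface.

```python
# Illustrative time-varying conditioning tensor (not EmoCtrl-TTS's real API).
import numpy as np

def build_conditioning(arousal: np.ndarray, valence: np.ndarray,
                       laughter_emb: np.ndarray) -> np.ndarray:
    """arousal/valence: (T,) per-frame values; laughter_emb: (T, D) vectors."""
    return np.concatenate([arousal[:, None], valence[:, None], laughter_emb],
                          axis=1)  # -> (T, D + 2)

T, D = 200, 32
cond = build_conditioning(np.random.rand(T), np.random.rand(T),
                          np.random.rand(T, D))
print(cond.shape)  # (200, 34)
```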