AITopics | memorability

Collaborating Authors

memorability

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Predicting Visual

Neural Information Processing SystemsFeb-10-2026, 22:47:55 GMT

artificial intelligence, machine learning, oliva, (15 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
North America > United States > New York (0.05)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.69)

Add feedback

From Pixels to Feelings: Aligning MLLMs with Human Cognitive Perception of Images

Chen, Yiming, Han, Junlin, Bai, Tianyi, Tong, Shengbang, Kokkinos, Filippos, Torr, Philip

arXiv.org Artificial IntelligenceDec-1-2025

While Multimodal Large Language Models (MLLMs) are adept at answering what is in an image-identifying objects and describing scenes-they often lack the ability to understand how an image feels to a human observer. This gap is most evident when considering subjective cognitive properties, such as what makes an image memorable, funny, aesthetically pleasing, or emotionally evocative. To systematically address this challenge, we introduce CogIP-Bench, a comprehensive benchmark for evaluating MLLMs on such image cognitive properties. Our evaluation reveals a significant gap: current models are poorly aligned with human perception of these nuanced properties. We then demonstrate that a post-training phase can effectively bridge this gap, significantly enhancing the model's alignment with human judgments. Furthermore, we show that this learned cognitive alignment is not merely predictive but also transferable to downstream creative tasks. By integrating our cognitively-aligned MLLM into an image generation pipeline, we can guide the synthesis process to produce images that better embody desired traits, such as being more memorable or visually appealing. Our work provides a benchmark to measure this human-like perception, a post-training pipeline to enhance it, and a demonstration that this alignment unlocks more human-centric AI.

dimension, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2511.22805

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > New York (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unsupervised Memorability Modeling from Tip-of-the-Tongue Retrieval Queries

Bhattacharyya, Sree, Singla, Yaman Kumar, Yarram, Sudhir, Singh, Somesh Kumar, S, Harini I, Wang, James Z.

arXiv.org Artificial IntelligenceNov-27-2025

Visual content memorability has intrigued the scientific community for decades, with applications ranging widely, from understanding nuanced aspects of human memory to enhancing content design. A significant challenge in progressing the field lies in the expensive process of collecting memorability annotations from humans. This limits the diversity and scalability of datasets for modeling visual content memorability. Most existing datasets are limited to collecting aggregate memorability scores for visual content, not capturing the nuanced memorability signals present in natural, open-ended recall descriptions. In this work, we introduce the first large-scale unsupervised dataset designed explicitly for modeling visual memorability signals, containing over 82,000 videos, accompanied by descriptive recall data. We leverage tip-of-the-tongue (ToT) retrieval queries from online platforms such as Reddit. We demonstrate that our unsupervised dataset provides rich signals for two memorability-related tasks: recall generation and ToT retrieval. Large vision-language models fine-tuned on our dataset outperform state-of-the-art models such as GPT-4o in generating open-ended memorability descriptions for visual content. We also employ a contrastive training strategy to create the first model capable of performing multimodal ToT retrieval. Our dataset and models present a novel direction, facilitating progress in visual content memorability research.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2511.20854

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > Pennsylvania (0.04)
Asia > India (0.04)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.67)

Industry:

Media (1.00)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

LLM-based Fusion of Multi-modal Features for Commercial Memorability Prediction

Pramov, Aleksandar

arXiv.org Artificial IntelligenceOct-28-2025

This paper addresses the prediction of commercial (brand) memorability as part of "Subtask 2: Commercial/Ad Memorability" within the "Memorability: Predicting movie and commercial memorability" task at the MediaEval 2025 workshop competition. We propose a multimodal fusion system with a Gemma-3 LLM backbone that integrates pre-computed visual (ViT) and textual (E5) features by multi-modal projections. The model is adapted using Low-Rank Adaptation (LoRA). A heavily-tuned ensemble of gradient boosted trees serves as a baseline. A key contribution is the use of LLM-generated rationale prompts, grounded in expert-derived aspects of memorability, to guide the fusion model. The results demonstrate that the LLM-based system exhibits greater robustness and generalization performance on the final test set, compared to the baseline. The paper's codebase can be found at https://github.com/dsgt-arc/mediaeval-2025-memorability

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2510.22829

Country:

Europe > Slovenia > Central Slovenia > Municipality of Dobrepolje > Videm (0.05)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Interpretable Mnemonic Generation for Kanji Learning via Expectation-Maximization

Lee, Jaewook, Scarlatos, Alexander, Lan, Andrew

arXiv.org Artificial IntelligenceSep-1-2025

Learning Japanese vocabulary is a challenge for learners from Roman alphabet backgrounds due to script differences. Japanese combines syllabaries like hiragana with kanji, which are logographic characters of Chinese origin. Kanji are also complicated due to their complexity and volume. Keyword mnemonics are a common strategy to aid memorization, often using the compositional structure of kanji to form vivid associations. Despite recent efforts to use large language models (LLMs) to assist learners, existing methods for LLM-based keyword mnemonic generation function as a black box, offering limited interpretability. We propose a generative framework that explicitly models the mnemonic construction process as driven by a set of common rules, and learn them using a novel Expectation-Maximization-type algorithm. Trained on learner-authored mnemonics from an online platform, our method learns latent structures and compositional rules, enabling interpretable and systematic mnemonics generation. Experiments show that our method performs well in the cold-start setting for new learners while providing insight into the mechanisms behind effective mnemonic creation.

large language model, learner, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.05137

Country:

North America > United States > Florida > Miami-Dade County > Miami (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Predicting Event Memorability from Contextual Visual Semantics

Neural Information Processing SystemsAug-17-2025, 03:27:42 GMT

Episodic event memory is a key component of human cognition.

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

MindMem: Multimodal for Predicting Advertisement Memorability Using LLMs and Deep Learning

Asgarian, Sepehr, Jetha, Qayam, Jeon, Jouhyun

arXiv.org Artificial IntelligenceFeb-25-2025

In the competitive landscape of advertising, success hinges on effectively navigating and leveraging complex interactions among consumers, advertisers, and advertisement platforms. These multifaceted interactions compel advertisers to optimize strategies for modeling consumer behavior, enhancing brand recall, and tailoring advertisement content. To address these challenges, we present MindMem, a multimodal predictive model for advertisement memorability. By integrating textual, visual, and auditory data, MindMem achieves state-of-the-art performance, with a Spearman's correlation coefficient of 0.631 on the LAMBDA and 0.731 on the Memento10K dataset, consistently surpassing existing methods. Furthermore, our analysis identified key factors influencing advertisement memorability, such as video pacing, scene complexity, and emotional resonance. Expanding on this, we introduced MindMem-ReAd (MindMem-Driven Re-generated Advertisement), which employs Large Language Model-based simulations to optimize advertisement content and placement, resulting in up to a 74.12% improvement in advertisement memorability. Our results highlight the transformative potential of Artificial Intelligence in advertising, offering advertisers a robust tool to drive engagement, enhance competitiveness, and maximize impact in a rapidly evolving market.

advertisement, memorability, memorability score, (14 more...)

arXiv.org Artificial Intelligence

2502.18371

Genre: Research Report > New Finding (1.00)

Industry:

Marketing (1.00)
Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The exception of humour: Iconicity, Phonemic Surprisal, Memory Recall, and Emotional Associations

Kilpatrick, Alexander, Flaksman, Maria

arXiv.org Artificial IntelligenceFeb-2-2025

This meta-study explores the relationships between humor, phonemic bigram surprisal, emotional valence, and memory recall. Prior research indicates that words with higher phonemic surprisal are more readily remembered, suggesting that unpredictable phoneme sequences promote long-term memory recall. Emotional valence is another well-documented factor influencing memory, with negative experiences and stimuli typically being remembered more easily than positive ones. Building on existing findings, this study highlights that words with negative associations often exhibit greater surprisal and are easier to recall. Humor, however, presents an exception: while associated with positive emotions, humorous words also display heightened surprisal and enhanced memorability.

artificial intelligence, machine learning, surprisal, (15 more...)

arXiv.org Artificial Intelligence

2502.01682

Country:

North America > Canada (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > Japan (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability

Tseng, Li-Yang, Lin, Tzu-Ling, Shuai, Hong-Han, Huang, Jen-Wei, Chang, Wen-Whei

arXiv.org Artificial IntelligenceMay-21-2024

Nowadays, humans are constantly exposed to music, whether through voluntary streaming services or incidental encounters during commercial breaks. Despite the abundance of music, certain pieces remain more memorable and often gain greater popularity. Inspired by this phenomenon, we focus on measuring and predicting music memorability. To achieve this, we collect a new music piece dataset with reliable memorability labels using a novel interactive experimental procedure. We then train baselines to predict and analyze music memorability, leveraging both interpretable features and audio mel-spectrograms as inputs. To the best of our knowledge, we are the first to explore music memorability using data-driven deep learning-based methods. Through a series of experiments and ablation studies, we demonstrate that while there is room for improvement, predicting music memorability with limited data is possible. Certain intrinsic elements, such as higher valence, arousal, and faster tempo, contribute to memorable music. As prediction techniques continue to evolve, real-life applications like music recommendation systems and music style transfer will undoubtedly benefit from this new area of research.

memorability, music, music memorability, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.5281/zenodo.10265251

2405.12847

Country:

North America > United States > New York (0.04)
Asia > Taiwan (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Asia > China (0.04)

Genre: Research Report (0.50)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)

Add feedback

LLaVA Finds Free Lunch: Teaching Human Behavior Improves Content Understanding Abilities Of LLMs

Singh, Somesh, S, Harini I, Singla, Yaman K, Baths, Veeky, Shah, Rajiv Ratn, Chen, Changyou, Krishnamurthy, Balaji

arXiv.org Artificial IntelligenceMay-16-2024

Communication is defined as "Who says what to whom with what effect." A message from a communicator generates downstream receiver effects, also known as behavior. Receiver behavior, being a downstream effect of the message, carries rich signals about it. Even after carrying signals about the message, the behavior data is often ignored while training large language models. We show that training LLMs on receiver behavior can actually help improve their content-understanding abilities. Specifically, we show that training LLMs to predict the receiver behavior of likes and comments improves the LLM's performance on a wide variety of downstream content understanding tasks. We show this performance increase over 40 video and image understanding tasks over 23 benchmark datasets across both 0-shot and fine-tuning settings, outperforming many supervised baselines. Moreover, since receiver behavior, such as likes and comments, is collected by default on the internet and does not need any human annotations to be useful, the performance improvement we get after training on this data is essentially free-lunch. We release the receiver behavior cleaned comments and likes of 750k images and videos collected from multiple platforms along with our instruction-tuning data.

behavior-llava, dataset, video, (14 more...)

arXiv.org Artificial Intelligence

2405.00942

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
South America > Argentina (0.05)
Europe > France (0.04)
(5 more...)

Genre: Research Report (0.40)

Industry:

Media (1.00)
Government (0.93)
Automobiles & Trucks (0.68)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback