excitement
EmoGist: Efficient In-Context Learning for Visual Emotion Understanding
In this paper, we introduce EmoGist, a training-free, in-context learning method for performing visual emotion classification with LVLMs. The key intuition of our approach is that context-dependent definitions of emotion labels could allow more accurate predictions of emotions, as the ways in which emotions manifest within images are highly context dependent and nuanced. EmoGist pre-generates multiple descriptions of each emotion label by analyzing the clusters of example images belonging to that label. At test time, we retrieve one version of the description based on the cosine similarity of the test image to the cluster centroids, and feed it together with the test image to a fast LVLM for classification. Through our experiments, we show that EmoGist yields improvements of up to 12 points in micro F1 on the multi-label Memotion dataset, and up to 8 points in macro F1 on the multi-class FI dataset.
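The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy three-dimensional embeddings, the example descriptions, and the prompt wording are all assumptions made for illustration, and a real system would take its embeddings from an image encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve_descriptions(test_embedding, label_clusters):
    """For each label, pick the description whose cluster centroid is
    closest (by cosine similarity) to the test image embedding.

    label_clusters maps a label to a list of (centroid, description) pairs.
    """
    chosen = {}
    for label, clusters in label_clusters.items():
        _, description = max(
            clusters,
            key=lambda pair: cosine_similarity(test_embedding, pair[0]),
        )
        chosen[label] = description
    return chosen


# Hypothetical pre-generated descriptions, one per cluster of example images.
label_clusters = {
    "excitement": [
        ([1.0, 0.0, 0.0], "crowds cheering at a live event"),
        ([0.0, 1.0, 0.0], "a child opening a gift"),
    ],
    "sadness": [
        ([0.0, 0.0, 1.0], "a person sitting alone in the rain"),
    ],
}

test_embedding = [0.9, 0.1, 0.0]  # assumed output of an image encoder
descriptions = retrieve_descriptions(test_embedding, label_clusters)

# The selected descriptions would accompany the test image in the LVLM prompt.
prompt = "\n".join(f"{label}: {text}" for label, text in descriptions.items())
```

Because the descriptions are generated ahead of time, the only test-time cost beyond the classification call in this sketch is one similarity comparison per cluster.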
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- North America > Canada > British Columbia > Vancouver (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Vision (0.95)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.49)
'Minecraft' movie mayhem raises alarms for America's youth, 'bad for society': expert
"A Minecraft Movie," the big-screen adaptation of the popular video game "Minecraft," has been packing theaters with rowdy kids and teens since its release this month, spurring a social media phenomenon and sparking concern for America's youth. Videos on social media show young theatergoers' huge reactions to one key scene, where one of the film's stars, Jack Black, yells out the phrase "Chicken Jockey!" as a small, Frankenstein-looking creature lands on top of a chicken in a boxing ring to face off with co-star Jason Momoa. The scene has prompted excited fans to scream, shout, throw popcorn around, jump up out of their seats, and in one instance in Provo, Utah, toss a live chicken in the air during a screening, according to the Salt Lake Tribune. Springs Cinema & Taphouse in Sandy Springs, Georgia, told FOX 5 Atlanta that its staff has had to clean up popcorn, ICEEs, ketchup and shattered glass. "The movie-going experience has changed a lot since I was younger," Josh Gunderson, director of marketing and events at Oviedo Mall in Florida, told FOX Business.
- North America > United States > Utah > Utah County > Provo (0.25)
- North America > United States > Georgia > Fulton County > Sandy Springs (0.25)
- Media > Film (1.00)
- Leisure & Entertainment > Games > Computer Games (1.00)
SCI-IDEA: Context-Aware Scientific Ideation Using Token and Sentence Embeddings
Keya, Farhana, Rabby, Gollam, Mitra, Prasenjit, Vahdati, Sahar, Auer, Sören, Jaradeh, Yaser
Every scientific discovery starts with an idea inspired by prior work, interdisciplinary concepts, and emerging challenges. Recent advancements in large language models (LLMs) trained on scientific corpora have driven interest in AI-supported idea generation. However, generating context-aware, high-quality, and innovative ideas remains challenging. We introduce SCI-IDEA, a framework that uses LLM prompting strategies and Aha Moment detection for iterative idea refinement. SCI-IDEA extracts essential facets from research publications, assessing generated ideas on novelty, excitement, feasibility, and effectiveness. Comprehensive experiments validate SCI-IDEA's effectiveness, achieving average scores of 6.84, 6.86, 6.89, and 6.84 (on a 1-10 scale) across novelty, excitement, feasibility, and effectiveness, respectively. Evaluations employed GPT-4o, GPT-4.5, DeepSeek-32B (each under 2-shot prompting), and DeepSeek-70B (3-shot prompting), with token-level embeddings used for Aha Moment detection. Similarly, it achieves scores of 6.87, 6.86, 6.83, and 6.87 using GPT-4o under 5-shot prompting, GPT-4.5 under 3-shot prompting, DeepSeek-32B under zero-shot chain-of-thought prompting, and DeepSeek-70B under 5-shot prompting with sentence-level embeddings. We also address ethical considerations such as intellectual credit, potential misuse, and balancing human creativity with AI-driven ideation. Our results highlight SCI-IDEA's potential to facilitate the structured and flexible exploration of context-aware scientific ideas, supporting innovation while maintaining ethical standards.
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Health Care Technology > Telehealth (1.00)
- Energy (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (0.92)
Black Myth: Wukong – the summer's most exciting, and most controversial, video game
When Chinese developer Game Science revealed its debut console game Black Myth: Wukong last year, it immediately caused a stir. Inspired by the great 16th-century Chinese novel, Journey to the West, the action-packed footage featured the titular mythological monkey Sun Wukong battling Buddhist-folklore demons and sword-wielding anthropomorphic foxes in lusciously rendered forests. Smartphone games are inordinately popular in China, but console game developers are still few and far between, and the excitement for Wukong in Game Science's homeland reached fever pitch. Within 24 hours, the trailer racked up 2m views on YouTube and more than 10m on Chinese video sharing site Bilibili, much to its creators' shock and delight. One excited fan even broke into the developer's office, desperate for more info on the game.
- Information Technology > Communications (0.77)
- Information Technology > Artificial Intelligence > Games (0.41)
EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition
Doulfoukar, Youssef, Mertens, Laurent, Vennekens, Joost
Convolutional Neural Networks are particularly suited for image analysis tasks, such as Image Classification, Object Recognition or Image Segmentation. Like all Artificial Neural Networks, however, they are "black box" models, and suffer from poor explainability. This work is concerned with the specific downstream task of Emotion Recognition from images, and proposes a framework that combines CAM-based techniques with Object Detection on a corpus level to better understand on which image cues a particular model, in our case EmoNet, relies to assign a specific emotion to an image. We demonstrate that the model mostly focuses on human characteristics, but also explore the pronounced effect of specific image modifications.
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.06)
- North America > United States > New York > New York County > New York City (0.04)
Large Vision-Language Models as Emotion Recognizers in Context Awareness
Lei, Yuxuan, Yang, Dingkang, Chen, Zhaoyu, Chen, Jiawei, Zhai, Peng, Zhang, Lihua
Context-aware emotion recognition (CAER) is a complex and significant task that requires perceiving emotions from various contextual cues. Previous approaches primarily focus on designing sophisticated architectures to extract emotional cues from images. However, their knowledge is confined to specific training datasets and may reflect the subjective emotional biases of the annotators. Furthermore, acquiring large amounts of labeled data is often challenging in real-world applications. In this paper, we systematically explore the potential of leveraging Large Vision-Language Models (LVLMs) to empower the CAER task from three paradigms: 1) We fine-tune LVLMs on two CAER datasets, which is the most common way to transfer large models to downstream tasks. 2) We design zero-shot and few-shot patterns to evaluate the performance of LVLMs in scenarios where data is limited or even completely unseen. In this case, a training-free framework is proposed to fully exploit the In-Context Learning (ICL) capabilities of LVLMs. Specifically, we develop an image similarity-based ranking algorithm to retrieve examples; subsequently, the instructions, retrieved examples, and the test example are combined and fed to LVLMs to obtain the corresponding sentiment judgment. 3) To leverage the rich knowledge base of LVLMs, we incorporate Chain-of-Thought (CoT) into our framework to enhance the model's reasoning ability and provide interpretable results. Extensive experiments and analyses demonstrate that LVLMs achieve competitive performance in the CAER task across different paradigms. Notably, the superior performance in few-shot settings indicates the feasibility of LVLMs for accomplishing specific tasks without extensive training.
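The few-shot paradigm above (rank candidate examples by image similarity, then combine the instructions, retrieved examples, and test example) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the similarity measure or the prompt layout, so cosine similarity over toy two-dimensional embeddings, the `rank_examples` and `build_prompt` helpers, the file names, and the labels are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_examples(query_embedding, pool, k=2):
    """Return the k labeled examples most similar to the query image."""
    ranked = sorted(
        pool,
        key=lambda ex: cosine(query_embedding, ex["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(instruction, examples, test_name):
    """Combine the instruction, retrieved examples, and the test example."""
    lines = [instruction]
    for ex in examples:
        lines.append(f"Image: {ex['name']} -> Emotion: {ex['label']}")
    lines.append(f"Image: {test_name} -> Emotion:")
    return "\n".join(lines)


# Hypothetical labeled pool; embeddings would come from an image encoder.
pool = [
    {"name": "a.jpg", "embedding": [1.0, 0.0], "label": "happy"},
    {"name": "b.jpg", "embedding": [0.0, 1.0], "label": "sad"},
    {"name": "c.jpg", "embedding": [0.8, 0.2], "label": "excited"},
]

query = [0.9, 0.1]
shots = rank_examples(query, pool, k=2)
prompt = build_prompt(
    "Identify the emotion of the person in the image.", shots, "test.jpg"
)
```

In a real pipeline the images themselves, not just their file names, would be interleaved into the multimodal prompt.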
- Health & Medicine (0.67)
- Leisure & Entertainment > Sports (0.46)
Generative AI Doesn't Make Hardware Less Hard
After years of development, startup Humane launched a $700 wearable in early April that leans heavily on artificial intelligence. The original pitch for the Ai Pin was that you no longer need to juggle different apps; its operating system can "search for the right AI at the right moment," allowing it to play music, translate languages, and even tell you how much protein is in a palmful of almonds. And because it doesn't have a traditional display, the Ai Pin was supposed to be a tiny tincture for the disease of screentime; smartphones were on their way out. The pin has been panned. WIRED's Julian Chokkattu scored the Ai Pin a 4 out of 10. Popular YouTuber Marques Brownlee complimented the device's hardware design but still called it "The Worst Product I've Ever Reviewed … For Now." The company has since massaged the message that it's meant to replace your phone.
Why has Nvidia driven stock markets to record highs?
Investor excitement over artificial intelligence reached a new peak this week when better-than-expected results from chipmaker Nvidia drove stock markets on three continents to record highs. The rally began on Thursday and continued into Friday, as Nvidia overtook Google's parent group, Alphabet, to become the third most valuable company in the US. Its market capitalisation hit $2tn (£1.58tn), surpassed only by Microsoft and Apple. The artificial intelligence (AI) boom has raised many questions, not least over safety and the impact on jobs, but there are also concerns that it might be driving unsustainable market exuberance. Here we look at the latest share price rise and whether it can be maintained.
- North America > United States > New York > New York County > New York City (0.06)
- Europe > Ukraine (0.05)
- Europe > Middle East (0.05)
OpenAI's Sora Is a Total Mystery
Yesterday afternoon, OpenAI teased Sora, a video-generation model that promises to convert written text prompts into highly realistic videos. Footage released by the company depicts such examples as "a Shiba Inu dog wearing a beret and black turtleneck" and "in an ornate, historical hall, a massive tidal wave peaks and begins to crash." The excitement from the press has been reminiscent of the buzz surrounding the image creator DALL-E or ChatGPT in 2022: Sora is described as "eye-popping," "world-changing," and "breathtaking, yet terrifying." The imagery is genuinely impressive. At a glance, one example of an animated "fluffy monster" looks better than Shrek; an "extreme close up" of a woman's eye, complete with a reflection of the scene in front of her, is startlingly lifelike.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)
Cybercrime, AI supremacy and the metaverse: the tech stories that will dominate 2024
Partway through 2023, I caught up with a respected, high-ranking tech writer at another publication. We gossiped and nattered, and, a bit exasperated, empathised with each other: we were run ragged. The last two years have raised the stakes for what tech journalists do from serving a small niche community to covering stories that have an impact on the wider world. It's also down to the characters involved and what's at stake. Tech journalists have lived on fast-forward since Elon Musk first lodged his bid to take over Twitter – now X – in April 2022.
- Media > News (0.71)
- Information Technology > Security & Privacy (0.66)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.51)