charlie chaplin
V-HUB: A Visual-Centric Humor Understanding Benchmark for Video LLMs
Shi, Zhengpeng, Li, Hengli, Zhao, Yanpeng, Zhou, Jianqun, Wang, Yuxuan, Cui, Qinrong, Bi, Wei, Zhu, Songchun, Zhao, Bo, Zheng, Zilong
AI models capable of comprehending humor hold real-world promise -- for example, enhancing engagement in human-machine interactions. To gauge and diagnose the capacity of multimodal large language models (MLLMs) for humor understanding, we introduce v-HUB, a novel visual-centric video humor understanding benchmark. v-HUB comprises a curated collection of minimally verbal short videos, sourced from classic silent films and online resources, and reflecting real-world scenarios where humor can be appreciated purely through visual cues. Each video clip is paired with rich annotations, including captions, descriptions, and explanations, supporting evaluation tasks like caption matching and humor explanation. To broaden its applicability, we further construct an open-ended video QA task, making it readily integrable into existing video understanding benchmarks. We evaluate a diverse set of MLLMs, from specialized Video-LLMs to versatile OmniLLMs that can process audio, covering both open-source and proprietary domains. The experimental results expose the difficulties MLLMs face in comprehending humor from visual cues alone. For example, all models exhibit a marked performance drop on caption matching when moving from text-based to video-based evaluation (without audio). Our findings also demonstrate that incorporating audio helps with video humor understanding, highlighting the informativeness of sound and the promise of integrating richer modalities for complex video understanding tasks.
Knowledge Crosswords: Geometric Reasoning over Structured Knowledge with Large Language Models
Ding, Wenxuan, Feng, Shangbin, Liu, Yuhan, Tan, Zhaoxuan, Balachandran, Vidhisha, He, Tianxing, Tsvetkov, Yulia
Large language models (LLMs) are widely adopted in knowledge-intensive tasks and have achieved impressive performance thanks to their knowledge abilities. While LLMs have demonstrated outstanding performance on atomic or linear (multi-hop) QA tasks, whether they can reason in knowledge-rich scenarios with interweaving constraints remains an underexplored problem. In this work, we propose geometric reasoning over structured knowledge, where pieces of knowledge are connected in a graph structure and models need to fill in the missing information of this graph. Such geometric knowledge reasoning would require the ability to handle structured knowledge, reason with uncertainty, verify facts, and backtrack when an error occurs. Further analysis reveals that LLMs' ability of geometric reasoning over structured knowledge is still far from robust or perfect, susceptible to confounders such as the order of options, certain structural patterns, assumption of existence of correct answer, and more.
Large language models (LLMs) have demonstrated an impressive ability on knowledge-intensive tasks such as open-domain QA (Petroni et al., 2019), misinformation detection (Karimi & Tang, 2019), and fact-checking (Gao et al., 2023). To assess the knowledge abilities of LLMs, existing tasks and datasets mostly focus on atomic (e.g., open-domain QA) (Rajpurkar et al., 2016; Das et al., 2022) or linear (e.g., multi-hop QA) (Press et al., 2022) settings, probing LLMs' responses to simple or multiple concatenated facts where each reasoning step has a unique definite answer. However, knowledge is not always arranged in a simple linear manner: it often involves more complex structural information, forming an interweaving network that connects various entities and relations through multiple chains as illustrated in Figure 1.
Each reasoning step of atomic or linear QAs leads to a unique and definite (intermediate) answer, while multiple candidates exist before all constraints are jointly considered in geometric QA. Consequently, an underexplored yet crucial question arises: Can LLMs extend beyond linear compositionality and aggregate information from multiple chains along with various knowledge constraints? Specifically, when certain pieces of knowledge are missing, can LLMs successfully fill in the blanks based on existing constraints represented by other available information in the network? In this work, we evaluate how well models can aggregate information from the given constraints across a graph representing pieces of knowledge and figure out the blanks in this graph.
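To make the task description above concrete, here is a minimal sketch of filling blanks in a small knowledge graph by backtracking over candidates. It is not the benchmark's method or data: the function name `solve`, the `?film`/`?actor` blank notation, and every entity and relation in the toy facts are hypothetical placeholders invented for this illustration. It only shows why a single constraint can leave several candidates open until all constraints are considered jointly.

```python
def solve(blanks, candidates, constraints, facts):
    """Assign one candidate to each blank so that every constraint triple is a known fact.

    Returns a dict mapping blank -> entity, or None if no assignment satisfies all constraints.
    """
    assignment = {}

    def consistent():
        # Check every constraint whose blanks have all been filled in so far.
        for head, relation, tail in constraints:
            h = assignment.get(head, head)
            t = assignment.get(tail, tail)
            if h in blanks or t in blanks:
                continue  # still mentions an unassigned blank; cannot be checked yet
            if (h, relation, t) not in facts:
                return False
        return True

    def backtrack(i):
        if i == len(blanks):
            return dict(assignment)
        blank = blanks[i]
        for option in candidates[blank]:
            assignment[blank] = option
            if consistent():
                result = backtrack(i + 1)
                if result is not None:
                    return result
            del assignment[blank]  # undo the choice and try the next candidate
        return None

    return backtrack(0)


# Hypothetical toy knowledge graph (placeholder names, not real-world facts).
facts = {
    ("film_a", "directed_by", "person_p"),
    ("film_a", "stars", "person_q"),
    ("film_b", "directed_by", "person_p"),
    ("film_b", "stars", "person_r"),
    ("person_q", "born_in", "city_x"),
    ("person_r", "born_in", "city_y"),
}

# Interweaving constraints: both films satisfy the first constraint on its own,
# so the search must backtrack once the other two constraints are applied.
constraints = [
    ("?film", "directed_by", "person_p"),
    ("?film", "stars", "?actor"),
    ("?actor", "born_in", "city_y"),
]
blanks = ["?film", "?actor"]
candidates = {
    "?film": ["film_a", "film_b"],
    "?actor": ["person_q", "person_r"],
}

solution = solve(blanks, candidates, constraints, facts)
print(solution)
```

In this toy graph the search first tries `film_a`, finds that neither actor candidate fits the remaining constraints, and backtracks to `film_b`. If no combination of candidates satisfies all the constraints, `solve` returns `None`, which corresponds to the case where none of the offered options is correct.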
Black And White Movies Coloured By Artificial Intelligence
Colourisation, the process of adding colour to black-and-white or monochrome images and videos, has been widely practised for a few decades now. Traditional colourisation techniques require a lot of human effort and are costly. With the advent of technologies such as artificial intelligence, however, these two major issues are slowly disappearing. Researchers have also been using deepfake techniques to swap the faces of celebrities and other well-known people around the globe. Let's take a look at a few movies that have been coloured using artificial intelligence.
Be vigilant
Sophia, the world's most advanced humanoid released to date, was granted honorary citizenship a few months ago by Saudi Arabia. The move flooded the net with awe and dismay, and was probably the first step towards recognising that artificial intelligence is in the room and not at the doorstep. The UN joined in, with UNDP recognising Sophia as the world's first UN Innovation Champion. While these moves were music to many ears, artificial intelligence is dividing opinion among the best brains in science and technology. A quote widely circulated on social media, about Einstein's premonition of a world with a generation of idiots, may have had its fair share of laughs. Einstein did indeed write a letter to his friend, the psychiatrist Otto Juliusburger, in 1948, in which he expressed his belief that the abominable deterioration of ethical standards stemmed primarily from the mechanisation and depersonalisation of our lives, a disastrous byproduct of science and technology.
Project Murphy Microsoft Bot Framework AI
Project Murphy is Microsoft's AI-based service built on the Microsoft Bot Framework (http://www.projectmurphy.net/). With the Bot Framework you can add the bot on Skype, Messenger, Telegram, etc. and ask it all of life's most important questions, such as "What if Charlie Chaplin was a baby?" or "What if Beethoven was a rockstar!" The results are always fun. Project Murphy uses artificial intelligence to answer these questions by combining the subject's face with the object of interest, e.g. a baby's face smartly added on top of Charlie Chaplin's face.