Demystifying ChatGPT: How It Masters Genre Recognition

Raj, Subham, Saha, Sriparna, Singh, Brijraj, Pedanekar, Niranjan

arXiv.org Artificial Intelligence

The introduction of ChatGPT has garnered significant attention within the NLP community and beyond. Previous studies have demonstrated ChatGPT's substantial advancements across various downstream NLP tasks, highlighting its adaptability and potential to revolutionize language-related applications. However, its capabilities and limitations in genre prediction remain unclear. This work analyzes three Large Language Models (LLMs) using the MovieLens-100K dataset to assess their genre prediction capabilities. Our findings show that ChatGPT, without fine-tuning, outperformed the other LLMs, and that fine-tuned ChatGPT performed best overall. We set up zero-shot and few-shot prompts using audio transcripts/subtitles from movie trailers in the MovieLens-100K dataset, covering 1,682 movies spanning 18 genres, where each movie can have multiple genres. Additionally, we extended our study by extracting IMDb movie posters and querying a Vision Language Model (VLM) with prompts for poster information. This fine-grained information was then used to enhance the existing LLM prompts. In conclusion, our study reveals ChatGPT's remarkable genre prediction capabilities, surpassing those of other language models. The integration of the VLM further strengthens our findings, showcasing ChatGPT's potential for content-related applications by incorporating visual information from movie posters.
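The zero-shot setup described in the abstract can be sketched as a prompt-construction and answer-parsing step. This is an illustrative assumption, not the authors' exact prompt: the wording, the `build_zero_shot_prompt`/`parse_genres` helpers, and the parsing convention are hypothetical, while the 18-genre label set follows MovieLens-100K.

```python
# Hypothetical sketch of a zero-shot genre-prediction prompt for an LLM,
# in the spirit of the study above. Prompt wording and parsing are
# illustrative assumptions; only the genre list comes from MovieLens-100K.

GENRES = [
    "Action", "Adventure", "Animation", "Children's", "Comedy", "Crime",
    "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror", "Musical",
    "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western",
]

def build_zero_shot_prompt(transcript: str) -> str:
    """Build a zero-shot prompt asking an LLM to label a trailer transcript."""
    return (
        "You are a film expert. Given the trailer transcript below, list every "
        f"applicable genre from this set: {', '.join(GENRES)}.\n"
        "Answer with a comma-separated list only.\n\n"
        f"Transcript: {transcript}"
    )

def parse_genres(llm_reply: str) -> list:
    """Keep only valid genre names from the model's comma-separated reply."""
    candidates = [g.strip() for g in llm_reply.split(",")]
    return [g for g in candidates if g in GENRES]

# Example: a (mock) model reply is filtered down to the valid label set,
# which naturally supports multiple genres per movie.
prompt = build_zero_shot_prompt("In a world on the brink of war...")
labels = parse_genres("Action, War, Drama, SomethingElse")
```

A few-shot variant would simply prepend a handful of (transcript, genres) pairs to the same prompt before the query transcript.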


Generate, Not Recommend: Personalized Multimodal Content Generation

Liu, Jiongnan, Dou, Zhicheng, Hu, Ning, Xiong, Chenyan

arXiv.org Artificial Intelligence

To address the challenge of information overload from massive web contents, recommender systems are widely applied to retrieve and present personalized results for users. However, recommendation tasks are inherently constrained to filtering existing items and lack the ability to generate novel concepts, limiting their capacity to fully satisfy user demands and preferences. In this paper, we propose a new paradigm that goes beyond content filtering and selecting: directly generating personalized items in a multimodal form, such as images, tailored to individual users. To accomplish this, we leverage any-to-any Large Multimodal Models (LMMs) and train them with both supervised fine-tuning and an online reinforcement learning strategy to equip them with the ability to yield tailored next items for users. Experiments on two benchmark datasets and a user study confirm the efficacy of the proposed method. Notably, the generated images not only align well with users' historical preferences but also exhibit relevance to their potential future interests.


Unraveling Movie Genres through Cross-Attention Fusion of Bi-Modal Synergy of Poster

Nareti, Utsav Kumar, Adak, Chandranath, Chattopadhyay, Soumi, Wang, Pichao

arXiv.org Artificial Intelligence

Movie posters are not just decorative; they are meticulously designed to capture the essence of a movie, such as its genre, storyline, and tone/vibe. For decades, movie posters have graced cinema walls, billboards, and now our digital screens in the form of digital posters. Movie genre classification plays a pivotal role in film marketing, audience engagement, and recommendation systems. Previous explorations into movie genre classification have mostly examined plot summaries, subtitles, trailers, and movie scenes. Movie posters provide a tantalizing pre-release glimpse into a film's key aspects, which can ignite public interest. In this paper, we present a framework that exploits movie posters from both visual and textual perspectives to address the multi-label movie genre classification problem. First, we extract text from movie posters using OCR and retrieve the relevant embeddings. Next, we introduce a cross-attention-based fusion module to allocate attention weights to the visual and textual embeddings. To validate our framework, we utilized 13,882 posters sourced from the Internet Movie Database (IMDb). The experimental outcomes indicate that our model exhibited promising performance and even outperformed some prominent contemporary architectures.
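The cross-attention fusion step this abstract describes can be sketched in a minimal, single-head form: poster-image embeddings act as queries that attend over OCR-text embeddings. This is a simplified NumPy illustration of the general mechanism, not the paper's actual module; the dimensions and the absence of learned projections are assumptions.

```python
# Minimal single-head cross-attention sketch (not the paper's exact module):
# visual tokens query textual tokens, producing a text-informed fusion.
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(visual, textual):
    """visual: (Nv, d) queries; textual: (Nt, d) keys/values.

    Returns (Nv, d): each visual token re-expressed as an attention-weighted
    mixture of the textual embeddings.
    """
    d = visual.shape[-1]
    scores = visual @ textual.T / np.sqrt(d)  # (Nv, Nt) scaled similarities
    weights = softmax(scores, axis=-1)        # attention weights over text tokens
    return weights @ textual                  # fused representation

# Example: 49 visual patch embeddings attend over 12 OCR word embeddings.
rng = np.random.default_rng(0)
fused = cross_attention(rng.normal(size=(49, 64)), rng.normal(size=(12, 64)))
```

A full implementation would add learned query/key/value projections, multiple heads, and a symmetric text-to-image direction before feeding the fused features to the genre classifier.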


Demystifying Visual Features of Movie Posters for Multi-Label Genre Identification

Nareti, Utsav Kumar, Adak, Chandranath, Chattopadhyay, Soumi

arXiv.org Artificial Intelligence

In the film industry, movie posters have been an essential part of advertising and marketing for many decades, and they continue to play a vital role today as digital posters on online, social media, and OTT platforms. A movie poster can effectively promote and communicate the essence of a film, such as its genre, visual style/tone, vibe, and storyline cues/themes, which are essential to attract potential viewers. Identifying a movie's genres has significant practical applications in recommending the film to target audiences. Previous studies on movie genre identification are limited to subtitles, plot synopses, and movie scenes that are mostly accessible only after the movie's release. Posters, by contrast, usually contain pre-release implicit information designed to generate mass interest. In this paper, we work on automated multi-label genre identification from movie poster images alone, without the aid of any additional textual/meta-data information about the movies, which is one of the earliest attempts of its kind. We present a deep transformer network with a probabilistic module to identify movie genres exclusively from the poster. For experimental analysis, we procured 13,882 posters covering 13 genres from the Internet Movie Database (IMDb), on which our model's performance was encouraging and even outperformed some major contemporary architectures.
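What distinguishes the multi-label setting above from ordinary classification is that each poster can receive several genres at once, so the decision head typically scores every genre independently rather than picking a single softmax winner. The sketch below illustrates that general pattern with per-genre sigmoids and a fixed threshold; the threshold value, genre count, and logits are assumptions for illustration, not details from the paper.

```python
# Illustrative multi-label decision head: one independent sigmoid per genre,
# thresholded, so a poster can get several labels (or none). The threshold
# of 0.5 and the example logits are assumptions, not the paper's values.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_genres(logits, threshold=0.5):
    """logits: (num_genres,) raw scores -> sorted indices of predicted genres."""
    probs = sigmoid(np.asarray(logits, dtype=float))
    return [i for i, p in enumerate(probs) if p >= threshold]

# Example: strong positive logits for genres 0 and 4, negatives elsewhere,
# yielding two simultaneous labels for one poster.
picked = predict_genres([2.0, -1.0, -3.0, -0.2, 1.5])
```

Training such a head uses a per-genre binary cross-entropy loss, which is what lets the 13 genre scores vary independently.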


Movie Box Office Prediction With Self-Supervised and Visually Grounded Pretraining

Chao, Qin, Kim, Eunsoo, Li, Boyang

arXiv.org Artificial Intelligence

Investments in movie production carry a high level of risk, as movie revenues have long-tailed and bimodal distributions. Accurate prediction of box-office revenue may mitigate this uncertainty and encourage investment. However, learning effective representations for actors, directors, and user-generated content-related keywords remains a challenging open problem. In this work, we investigate the effects of self-supervised pretraining and propose visual grounding of content keywords in objects from movie posters as a pretraining objective. Experiments on a large dataset of 35,794 movies demonstrate significant benefits of self-supervised training and visual grounding. In particular, visual grounding pretraining substantially improves learning on movies with content keywords and achieves 14.5% relative performance gains compared to a finetuned BERT model with an identical architecture.


Barbie Selfie Generator uses AI to transform YOUR photos into movie posters - here's how to try it

Daily Mail - Science & tech

Barbie fans around the world are counting down to the release of the Barbie movie, which is set to land in theatres on July 21, 2023. Posters for the film dropped yesterday, showing stars including Margot Robbie, 32, and Ryan Gosling, 42 in their iconic roles. But have you ever wondered what your character might look like in Barbieland? Well now you can find out, thanks to a new tool dubbed the Barbie Selfie Generator. The tool uses artificial intelligence (AI) to transform your photos into Barbie movie posters - here's how to try it yourself.


What's your favorite scary movie? AI reimagines classic horror film posters

Daily Mail - Science & tech

Artificial intelligence has reimagined movie posters of popular horror films just in time for Halloween - and the results are teeming with blood, gore and terror. A graphic design team inputted key words like mask, black cloak and blood to inspire the AI-powered app Wonder, which brought the nightmares to life. The popular 1996 slasher film Scream features a woman with blue eyes covering her mouth on its movie poster, but the AI created a hooded figure with a mask dripping in blood that is 'arguably even more terrifying than the original.' The visuals were created using an app that asks users to describe what they want to see in the digital artwork, a medium that has recently taken off.


The DALL-E AI Program Draws Anything You Ask It to

#artificialintelligence

DALL-E is an open-source artificial intelligence that will draw nine images based on what it learned on the internet. Enter in any prompt and the AI spits out the graphics. It continues to learn so the more we all use it, the better it will be. The free version is now called Craiyon and also comes as an app for Android devices, making it even easier to create weird art based on a Mad Lib-like string of random ideas. Who doesn't want to see a giant squid assembling IKEA furniture?


Starring John Cho as Captain America

Slate

You don't need to wait until the summer premiere of Crazy Rich Asians to see Fresh Off the Boat's Constance Wu headline a big-budget movie. Here she is starring in the live-action Ghost in the Shell, and here she is in Luc Besson's Lucy, and here she is as Black Widow in Avengers: Age of Ultron. Yes, those are all films that starred Scarlett Johansson, but not in the corner of the web containing the new social media campaign #SeeAsAmStar, where Wu enjoys a retconned blockbuster career. Deepfakes are mostly associated with scarily realistic pornography featuring the faces of famous actresses superimposed onto the bodies of adult performers. The campaign, which appeared last week, employs the deepfake toolkit for a nobler purpose: the fight for better representation of minorities in pop culture.


Adobe says it wants AI to amplify human creativity and intelligence

#artificialintelligence

About a year ago, Adobe announced its Sensei AI platform. Unlike other companies, Adobe says that it has no interest in building a general artificial intelligence platform -- instead, it wants to build a platform squarely focused on helping its customers be more creative. This week, at its Max conference, Adobe provided more insight into what this means and showed off a number of prototypes for how it plans to integrate Sensei into its flagship tools. "We are not building a general purpose AI platform like some others in the industry are -- and it's great that they are building it," Adobe CTO Abhay Parasnis noted in a press conference after the keynote. "We have a very deep understanding of how creative professionals work in imaging, in photography, in video, in design and illustration. So we have taken decades worth of learning of those very specific domains -- and that's where a large part of this comes in. When one of the very best artists in Photoshop spends hours in creation, what are the other things they do and maybe more importantly, what are the things they don't do? We are trying to harness that and marry that with the latest advances in deep learning so that the algorithms can actually become partners for that creative professional."