LLaVAC: Fine-tuning LLaVA as a Multimodal Sentiment Classifier
Chay-intr, T., Chen, Y., Viriyayudhakorn, K., Theeramunkong, T.
–arXiv.org Artificial Intelligence
We present LLaVAC, a method for constructing a classifier for multimodal sentiment analysis. This method leverages fine-tuning of the Large Language and Vision Assistant (LLaVA) to predict sentiment labels across both image and text modalities. Our approach involves designing a structured prompt that incorporates both unimodal and multimodal labels to fine-tune LLaVA, enabling it to perform sentiment classification effectively. Experiments on the MVSA-Single dataset demonstrate that LLaVAC outperforms existing methods in multimodal sentiment analysis across three data processing procedures. The implementation of LLaVAC is publicly available at https://github.com/tchayintr/llavac.
arXiv.org Artificial Intelligence
Feb-5-2025
- Country:
- Genre:
- Research Report > New Finding (0.94)
- Industry:
- Information Technology (0.34)
- Technology: