Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning

Tan, Jing Jie; Mokraoui, Anissa; Kwan, Ban-Hoe; Ng, Danny Wee-Kiat; Hum, Yan-Chai

Dec-10-2025–arXiv.org Artificial Intelligence

Abstract: Image captioning is essential in many fields, including assisting visually impaired individuals, improving content management systems, and enhancing human-computer interaction. However, a recent challenge in this domain is dealing with low-resolution images (LRI). While performance can be improved by using larger models such as transformers for encoding, these models are typically heavyweight and demand significant computational resources and memory, which makes retraining difficult. To address this, the proposed SOLI (Siamese-Driven Optimization for Low-Resolution Image Latent Embedding in Image Captioning) approach presents a lightweight solution designed specifically for captioning low-resolution images. It employs a Siamese network architecture to optimize latent embeddings, enhancing the efficiency and accuracy of the image-to-text translation process.
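The abstract does not say how the two Siamese branches are built or which loss is used, so the following is only a generic illustration of the idea of aligning latent embeddings, not the SOLI method itself. It assumes one branch sees the original image and the other a degraded low-resolution copy, that both branches share a single linear encoder, and that the objective is a cosine distance; every name here (`downsample`, `upsample`, `encode`, `siamese_loss`) is hypothetical.

```python
import numpy as np

def downsample(image, factor):
    """Average-pool a 2-D grayscale image by an integer factor (assumed degradation)."""
    h, w = image.shape
    return image.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(image, factor):
    """Nearest-neighbour upsampling so both branches receive the same input size."""
    return np.repeat(np.repeat(image, factor, axis=0), factor, axis=1)

def encode(image, weights):
    """Shared encoder (assumption): one linear projection, then L2 normalisation."""
    z = weights @ image.ravel()
    return z / (np.linalg.norm(z) + 1e-12)

def siamese_loss(z_a, z_b):
    """Cosine distance between two unit-length embeddings; near 0 when aligned."""
    return 1.0 - float(z_a @ z_b)

rng = np.random.default_rng(0)
image = rng.random((32, 32))              # stand-in for a higher-resolution input
weights = rng.normal(size=(16, 32 * 32))  # the same weights serve both branches

low_res = upsample(downsample(image, 4), 4)  # simulated low-resolution view
z_high = encode(image, weights)
z_low = encode(low_res, weights)
loss = siamese_loss(z_high, z_low)
```

In a training loop, a loss of this kind would be minimised with respect to the encoder weights so that low-resolution inputs map close to the embeddings of their sharper counterparts. Whether SOLI uses this pairing or this loss is not stated in the abstract.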

artificial intelligence, deep learning, machine learning

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
