Q-VLM: Post-training Quantization for Large Vision-Language Models

Neural Information Processing Systems 

In this paper, we propose a post-training quantization framework for large vision-language models (LVLMs) to enable efficient multi-modal inference. Conventional quantization methods sequentially search layer-wise rounding functions by minimizing activation discretization errors, which fails to acquire an optimal quantization strategy because it ignores cross-layer dependency.
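The layer-wise baseline the abstract critiques can be sketched as follows: each layer independently searches a quantization scale that minimizes its own activation discretization (MSE) error, with no awareness of other layers. This is a minimal NumPy illustration under assumed choices (4-bit symmetric uniform quantization, a grid search over clipping ratios), not the paper's proposed method.

```python
import numpy as np

def quantize(x, scale, bits=4):
    # Uniform symmetric quantization: scale, round to the grid, clip, rescale.
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

def search_layer_scale(activations, bits=4, n_grid=100):
    # Greedy per-layer search: pick the scale minimizing this layer's
    # activation MSE alone, ignoring any cross-layer dependency.
    amax = np.abs(activations).max()
    qmax = 2 ** (bits - 1) - 1
    best_scale, best_err = None, np.inf
    for ratio in np.linspace(0.1, 1.0, n_grid):
        scale = ratio * amax / qmax
        err = np.mean((quantize(activations, scale, bits) - activations) ** 2)
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale, best_err

rng = np.random.default_rng(0)
acts = rng.normal(size=1000)
scale, err = search_layer_scale(acts)
```

Because each layer's scale is chosen in isolation, errors introduced early can compound through later layers, which is the cross-layer dependency the proposed framework accounts for.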
