Goto

Collaborating Authors

 Large Language Model


Visual Perception by Large Language Model's Weights

Neural Information Processing Systems

In this way, the input of LLM does not require visual tokens, which reduces the length of the input sequence and greatly improves efficiency. Following this paradigm, we propose VLoRA with the perceptual weights generator.