Visual Perception by Large Language Model's Weights
Neural Information Processing Systems
In this way, the LLM's input requires no visual tokens, which shortens the input sequence and greatly improves efficiency. Following this paradigm, we propose VLoRA, which uses a perceptual weights generator.
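The idea of carrying visual information in the weights rather than in the input can be illustrated with a small sketch. It assumes, as the name VLoRA suggests, that the generated perceptual weights take a LoRA-style low-rank form that is added to an existing weight matrix. The function `toy_generator`, the shapes, and all numeric values are hypothetical placeholders, not the paper's architecture.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def toy_generator(features, d, rank):
    """Hypothetical stand-in for a perceptual weights generator.

    Maps image features to two low-rank factors A (d x rank) and B (rank x d).
    A real generator would be a learned network; this is a fixed formula.
    """
    s = sum(features) / len(features)
    a = [[s * (i + 1) * 0.01 for _ in range(rank)] for i in range(d)]
    b = [[(j + 1) * 0.01 for j in range(d)] for _ in range(rank)]
    return a, b


def merge_perceptual_weights(w, a, b):
    """Return w + a @ b, i.e. the base weights plus a low-rank update."""
    delta = matmul(a, b)
    return [[wi + di for wi, di in zip(w_row, d_row)] for w_row, d_row in zip(w, delta)]


d, rank = 4, 2
# Base weight matrix of one linear layer (identity, for readability).
w = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
# Three text-token embeddings; no visual tokens are appended to the sequence.
text_tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

a, b = toy_generator([0.5, 1.5], d, rank)   # assumed image features
w_merged = merge_perceptual_weights(w, a, b)
out = matmul(text_tokens, w_merged)         # sequence length is unchanged
```

In this sketch the image changes only `w_merged`; the sequence passed through the layer still contains just the three text tokens, which is the source of the efficiency gain the abstract describes.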
Feb-10-2026, 20:47:27 GMT
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Education > Curriculum > Subject-Specific Education (0.46)
- Technology: