LaViDa: ALarge Diffusion Language Model for Multimodal Understanding
–Neural Information Processing Systems
Modern Vision-Language Models (VLMs) can solve a wide range of tasks requiring visual reasoning. In real-world scenarios, desirable properties for VLMs include fast inference and controllable generation (e.g., constraining outputs to adhere to a desired format).
Neural Information Processing Systems
Jun-20-2026, 04:11:40 GMT
- Country:
- Europe (0.67)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (0.92)
- Research Report
- Technology:
- Information Technology > Artificial Intelligence
- Vision (1.00)
- Representation & Reasoning (1.00)
- Natural Language
- Large Language Model (1.00)
- Chatbot (0.93)
- Machine Learning > Neural Networks
- Deep Learning (0.93)
- Information Technology > Artificial Intelligence