LaViDa: A Large Diffusion Language Model for Multimodal Understanding

Jun-13-2026, 09:51:44 GMT–Neural Information Processing Systems

Modern Vision-Language Models (VLMs) can solve a wide range of tasks requiring visual reasoning. In real-world scenarios, desirable properties for VLMs include fast inference and controllable generation (e.g., constraining outputs to adhere to a desired format).

artificial intelligence, name change, proceedings, (7 more...)

Neural Information Processing Systems

Jun-13-2026, 09:51:44 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence (0.99)