HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding
