LocCa: Visual Pretraining with Location-aware Captioners Bo Wan 1,3 Michael Tschannen 1 Yongqin Xian
Neural Information Processing Systems
Specifically, LocCa employs two tasks, bounding box prediction and location-dependent captioning, conditioned on the image pixel input.
Oct-10-2025, 17:36:55 GMT
- Country:
  - Europe
    - Belgium > Flanders
      - Flemish Brabant > Leuven (0.04)
    - Switzerland > Zürich
      - Zürich (0.04)
- Genre:
  - Research Report
    - Experimental Study (0.93)
    - New Finding (0.67)
- Technology: