End-to-end Multi-modal Video Temporal Grounding
Yi-Wen Chen
Neural Information Processing Systems
To integrate the three modalities more effectively and enable inter-modal learning, we design a dynamic fusion scheme with transformers to model the interactions between modalities.
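As a rough illustration of the idea of letting modalities attend to each other, the sketch below applies plain scaled dot-product attention so that each modality's tokens are updated with context from the other two. It is not the paper's architecture: the names `attend`, `fuse`, and `modality_a`/`modality_b`/`modality_c` are hypothetical, the feature values are made up, and the learned projections, multiple heads, and feed-forward layers of a real transformer are omitted.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    # Scaled dot-product attention: each query becomes a weighted mix of values.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

def fuse(modalities):
    # Assumption: one fusion step where every modality attends over the tokens
    # of all *other* modalities, followed by a residual addition.
    fused = {}
    for name, tokens in modalities.items():
        context = [t for other, toks in modalities.items() if other != name for t in toks]
        attended = attend(tokens, context, context)
        fused[name] = [[a + b for a, b in zip(t, c)] for t, c in zip(tokens, attended)]
    return fused

# Toy features: three modalities, two tokens each, feature dimension 2 (made-up values).
features = {
    "modality_a": [[1.0, 0.0], [0.0, 1.0]],
    "modality_b": [[0.5, 0.5], [1.0, 1.0]],
    "modality_c": [[0.0, 2.0], [2.0, 0.0]],
}
fused = fuse(features)
```

Each fused token keeps its own features through the residual term and adds a convex combination of the other modalities' tokens, which is the simplest form of the inter-modal interaction described above.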
Aug-18-2025, 17:11:57 GMT
- Country:
  - North America > United States > California > Merced County > Merced (0.04)
- Genre:
  - Research Report (0.46)
- Technology: