V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation

Jun-21-2026, 23:07:48 GMT–Neural Information Processing Systems

Event-based cameras offer unique advantages such as high temporal resolution, high dynamic range, and low power consumption. However, the massive storage requirements and I/O burdens of existing synthetic data generation pipelines and the scarcity of real data prevent event-based training datasets from scaling up, limiting the development and generalization capabilities of event vision models. To address this challenge, we introduce Video-to-Voxel (V2V), an approach that directly converts conventional video frames into event-based voxel grid representations, bypassing the storage-intensive event stream generation entirely. V2V enables a 150 reduction in storage requirements while supporting on-the-fly parameter randomization for enhanced model robustness. Leveraging this efficiency, we train several video reconstruction and optical flow estimation model architectures on 10,000 diverse videos totaling 52 hours--an order of magnitude larger than existing event datasets, yielding substantial improvements.

artificial intelligence, dataset, machine learning, (16 more...)

Neural Information Processing Systems

Jun-21-2026, 23:07:48 GMT

Conferences PDF

Add feedback

Country:
- Asia > China (0.46)

Genre:
- Research Report
  - Experimental Study (1.00)
  - New Finding (0.93)

Industry:
- Media (0.46)
- Information Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Vision (1.00)
  - Representation & Reasoning (0.88)
  - Machine Learning > Neural Networks (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found