Spatio-Temporal Interactive Learning for Efficient Image Reconstruction of Spiking Cameras
Neural Information Processing Systems
The spiking camera is an emerging neuromorphic vision sensor that records high-speed motion scenes by asynchronously firing continuous binary spike streams. Prevailing image reconstruction methods, which generate intermediate frames from these spike streams, often rely on complex step-by-step network architectures that overlook the intrinsic collaboration of complementary spatio-temporal information. In this paper, we propose an efficient spatio-temporal interactive reconstruction network that jointly performs inter-frame feature alignment and intra-frame feature filtering in a coarse-to-fine manner. Specifically, it starts by extracting hierarchical features from a concise hybrid spike representation, then refines the motion fields and target frames scale by scale, ultimately obtaining a full-resolution output. In addition, we introduce a symmetric interactive attention block and a multi-motion field estimation block to further enhance the interaction capability of the overall network. Experiments on synthetic and real-captured data show that our approach exhibits excellent performance while maintaining low model complexity. The code is available at https://github.com/GitCVfb/STIR.
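To make the reconstruction problem concrete, the following is a minimal sketch of a simplified spiking-pixel model and a naive firing-rate baseline. It is not the network proposed in the paper. The function names `simulate_spikes` and `reconstruct_by_firing_rate`, the unit threshold, and the reset-by-subtraction rule are all illustrative assumptions; the authors' actual implementation is in the repository linked above.

```python
import numpy as np


def simulate_spikes(intensity, num_steps, threshold=1.0):
    """Simplified integrate-and-fire model (an assumption, not the real sensor).

    Each pixel accumulates its intensity at every step and emits a binary
    spike when the accumulator reaches the threshold, which is then subtracted.
    """
    intensity = np.asarray(intensity, dtype=np.float64)
    accumulator = np.zeros_like(intensity)
    spikes = np.zeros((num_steps,) + intensity.shape, dtype=np.uint8)
    for t in range(num_steps):
        accumulator += intensity
        fired = accumulator >= threshold
        spikes[t] = fired
        accumulator[fired] -= threshold
    return spikes


def reconstruct_by_firing_rate(spikes, center, half_window):
    """Naive baseline: estimate a frame as the mean firing rate in a window."""
    start = max(center - half_window, 0)
    stop = min(center + half_window + 1, spikes.shape[0])
    return spikes[start:stop].mean(axis=0)


# Static toy scene with intensities in [0, 1]; brighter pixels fire more often.
scene = np.array([[0.25, 0.5], [1.0, 0.0]])
spikes = simulate_spikes(scene, num_steps=64)
frame = reconstruct_by_firing_rate(spikes, center=32, half_window=16)
```

For a static scene the windowed firing rate approximates the intensity well. For fast motion, a long window blurs the frame and a short one is noisy, which is the trade-off that learned methods with inter-frame alignment aim to address.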
May-28-2025, 19:51:51 GMT
- Genre:
- Research Report > Experimental Study (0.93)
- Industry:
- Education > Educational Setting
- Online (0.50)
- Information Technology (0.46)
- Technology: