
Supplementary Material for Enhancing Motion Deblurring in High-Speed Scenes with Spike Streams Shiyan Chen

Neural Information Processing Systems

All RSTB blocks consist of 6 STB blocks. Each sequence contains 33 frames. Blurry images with different motion magnitudes are generated by averaging the surrounding 33 or 65 images. In Tab. S1, we observe that the introduction of CAMMA also improves deblurring performance across all settings. We have added comparisons regarding computational complexity and inference time in Tab.
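The blur-synthesis procedure described above (averaging a window of 33 or 65 surrounding frames) can be sketched as follows; the frame count, resolution, and function name are illustrative, not taken from the paper's code:

```python
import numpy as np

def synthesize_blur(frames: np.ndarray, center: int, window: int) -> np.ndarray:
    """Average `window` consecutive frames centered on `center` to mimic motion blur.

    `frames` has shape (T, H, W); `window` is assumed odd (e.g. 33 or 65).
    """
    half = window // 2
    clip = frames[center - half : center + half + 1]
    return clip.mean(axis=0)

# 100 random 32x32 grayscale frames stand in for a high-speed sequence.
frames = np.random.rand(100, 32, 32).astype(np.float32)
blurry = synthesize_blur(frames, center=50, window=33)
```

A larger `window` averages over more motion and therefore produces a blurrier image, which is how the different motion magnitudes are controlled.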



Blinking Beyond EAR: A Stable Eyelid Angle Metric for Driver Drowsiness Detection and Data Augmentation

Wolter, Mathis, Perez, Julie Stephany Berrio, Shan, Mao

arXiv.org Artificial Intelligence

Abstract-- Detecting driver drowsiness reliably is crucial for enhancing road safety and supporting advanced driver assistance systems (ADAS). We introduce the Eyelid Angle (ELA), a novel, reproducible metric of eye openness derived from 3D facial landmarks. Unlike conventional binary eye state estimators or 2D measures, such as the Eye Aspect Ratio (EAR), the ELA provides a stable geometric description of eyelid motion that is robust to variations in camera angle. Using the ELA, we design a blink detection framework that extracts temporal characteristics, including the closing, closed, and reopening durations, which are shown to correlate with drowsiness levels. To address the scarcity and risk of collecting natural drowsiness data, we further leverage ELA signals to animate rigged avatars in Blender 3D, enabling the creation of realistic synthetic datasets with controllable noise, camera viewpoints, and blink dynamics. Experimental results on public driver monitoring datasets demonstrate that the ELA offers lower variance under viewpoint changes than EAR and achieves accurate blink detection, while synthetic augmentation expands the diversity of training data for drowsiness recognition. Our findings highlight the ELA as both a reliable biometric measure and a powerful tool for generating scalable datasets in driver state monitoring. URL: The code will be made publicly available upon acceptance.
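A minimal sketch of an eyelid-angle-style measure from 3D landmarks, assuming three hypothetical points (a point on the upper lid, one on the lower lid, and the eye corner as vertex); the paper's actual landmark selection and normalization may differ:

```python
import numpy as np

def eyelid_angle(upper_lid, lower_lid, eye_corner) -> float:
    """Angle (radians) at the eye corner between the directions toward
    the upper and lower eyelid landmarks, computed from 3D positions."""
    u = np.asarray(upper_lid, dtype=float) - np.asarray(eye_corner, dtype=float)
    v = np.asarray(lower_lid, dtype=float) - np.asarray(eye_corner, dtype=float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Open eye: lids well separated vertically relative to the corner.
open_ela = eyelid_angle([1.0, 0.5, 0.0], [1.0, -0.5, 0.0], [0.0, 0.0, 0.0])
# Nearly closed eye: lid landmarks almost collinear with the corner.
closed_ela = eyelid_angle([1.0, 0.05, 0.0], [1.0, -0.05, 0.0], [0.0, 0.0, 0.0])
```

Because the angle is computed from 3D geometry rather than 2D pixel distances, it is less sensitive to camera viewpoint than ratio-based measures like EAR, which is the property the abstract emphasizes.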


a9ad5f2808f68eea468621a04c49efe1-AuthorFeedback.pdf

Neural Information Processing Systems

This is used to estimate the model capabilities over the range of wind speeds. In response to the reviewer's comments, we have significantly expanded our training and



Apple MacBook Pro (M5, 14-Inch) Review: More of the Same

WIRED

It's not as exciting as it once was, but the M5 MacBook Pro offers yet another breakthrough in performance. All products featured on WIRED are independently selected by our editors. However, when you buy something through our retail links, we may earn an affiliate commission. M5 is a beast, especially in graphics and AI. New GPU is surprisingly good at gaming. Battery life, speakers, build quality, keyboard, and trackpad are all still world-class.


U-Codec: Ultra Low Frame-rate Neural Speech Codec for Fast High-fidelity Speech Generation

Yang, Xusheng, Zhou, Long, Wang, Wenfu, Hu, Kai, Feng, Shulin, Li, Chenxing, Yu, Meng, Yu, Dong, Zou, Yuexian

arXiv.org Artificial Intelligence

We propose \textbf{U-Codec}, an \textbf{U}ltra low frame-rate neural speech \textbf{Codec} that achieves high-fidelity reconstruction and fast speech generation at an extremely low frame rate of 5Hz (5 frames per second). Since extreme compression at 5Hz typically leads to severe loss of intelligibility and spectral detail, we introduce a Transformer-based inter-frame long-term dependency module and systematically explore residual vector quantization (RVQ) depth and codebook size to identify optimal configurations. Moreover, we integrate U-Codec into a large language model (LLM)-based auto-regressive TTS model, which leverages a global and local hierarchical architecture to effectively capture dependencies across multi-layer tokens. We extend LLM-based TTS from 3-layer RVQ at 50Hz to 32-layer RVQ at 5Hz. Experimental results demonstrate that U-Codec improves LLM-based TTS inference speed by around 3$\times$ over high-frame-rate codecs while maintaining similarity and naturalness. These results validate the feasibility of using highly compressed 5Hz discrete tokens for fast and high-fidelity speech synthesis.
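Residual vector quantization, the mechanism whose depth and codebook size the abstract says are swept, can be sketched as follows; the random codebooks and dimensions here are purely illustrative, not the paper's trained ones:

```python
import numpy as np

def rvq_encode(x: np.ndarray, codebooks: list) -> tuple:
    """Residual vector quantization: each stage quantizes the residual
    left by the previous stages, so depth trades bitrate for fidelity."""
    codes = []
    recon = np.zeros_like(x)
    residual = x.copy()
    for cb in codebooks:
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        codes.append(idx)
        recon += cb[idx]          # accumulate the chosen codeword
        residual = x - recon      # next stage sees what is still unexplained
    return codes, recon

rng = np.random.default_rng(0)
x = rng.normal(size=8)                                   # one latent frame
codebooks = [rng.normal(size=(256, 8)) for _ in range(4)]  # 4 stages, 256 entries each
codes, recon = rvq_encode(x, codebooks)
```

At a 5Hz frame rate with 32 RVQ layers, each second of speech is represented by only 5 frames of 32 discrete indices each, which is what makes the auto-regressive LLM decoding so much faster than with 50Hz codecs.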
