
Supplementary Material for Enhancing Motion Deblurring in High-Speed Scenes with Spike Streams Shiyan Chen

Neural Information Processing Systems

All RSTB blocks consist of 6 STB blocks. Each sequence contains 33 frames. Blurry images with different motion magnitudes are generated by averaging the surrounding 33 or 65 images. In Tab. S1, we observe that the introduction of CAMMA also improves deblurring performance across all settings. We have added comparisons regarding computational complexity and inference time in Tab.
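The blur-synthesis procedure described above (averaging a window of 33 or 65 surrounding frames) can be sketched as follows; the frame count, resolution, and function name are illustrative, not taken from the paper's code:

```python
import numpy as np

def synthesize_blur(frames: np.ndarray, center: int, window: int) -> np.ndarray:
    """Average `window` consecutive frames centered on `center` to mimic motion blur.

    `frames` has shape (T, H, W); `window` is assumed odd (e.g. 33 or 65).
    """
    half = window // 2
    clip = frames[center - half : center + half + 1]
    return clip.mean(axis=0)

# 100 random 32x32 grayscale frames stand in for a high-speed sequence.
frames = np.random.rand(100, 32, 32).astype(np.float32)
blurry = synthesize_blur(frames, center=50, window=33)
```

A larger `window` averages over more motion and therefore produces a blurrier image, which is how the different motion magnitudes are controlled.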



Blinking Beyond EAR: A Stable Eyelid Angle Metric for Driver Drowsiness Detection and Data Augmentation

Wolter, Mathis, Perez, Julie Stephany Berrio, Shan, Mao

arXiv.org Artificial Intelligence

Abstract-- Detecting driver drowsiness reliably is crucial for enhancing road safety and supporting advanced driver assistance systems (ADAS). We introduce the Eyelid Angle (ELA), a novel, reproducible metric of eye openness derived from 3D facial landmarks. Unlike conventional binary eye state estimators or 2D measures, such as the Eye Aspect Ratio (EAR), the ELA provides a stable geometric description of eyelid motion that is robust to variations in camera angle. Using the ELA, we design a blink detection framework that extracts temporal characteristics, including the closing, closed, and reopening durations, which are shown to correlate with drowsiness levels. To address the scarcity and risk of collecting natural drowsiness data, we further leverage ELA signals to animate rigged avatars in Blender 3D, enabling the creation of realistic synthetic datasets with controllable noise, camera viewpoints, and blink dynamics. Experimental results on public driver monitoring datasets demonstrate that the ELA offers lower variance under viewpoint changes than EAR and achieves accurate blink detection, while synthetic augmentation expands the diversity of training data for drowsiness recognition. Our findings highlight the ELA as both a reliable biometric measure and a powerful tool for generating scalable datasets in driver state monitoring. URL: The code will be made publicly available upon acceptance.
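A minimal sketch of an eyelid-angle-style measure from 3D landmarks, assuming three hypothetical points (a point on the upper lid, one on the lower lid, and the eye corner as vertex); the paper's actual landmark selection and normalization may differ:

```python
import numpy as np

def eyelid_angle(upper_lid, lower_lid, eye_corner) -> float:
    """Angle (radians) at the eye corner between the directions toward
    the upper and lower eyelid landmarks, computed from 3D positions."""
    u = np.asarray(upper_lid, dtype=float) - np.asarray(eye_corner, dtype=float)
    v = np.asarray(lower_lid, dtype=float) - np.asarray(eye_corner, dtype=float)
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Open eye: lids well separated vertically relative to the corner.
open_ela = eyelid_angle([1.0, 0.5, 0.0], [1.0, -0.5, 0.0], [0.0, 0.0, 0.0])
# Nearly closed eye: lid landmarks almost collinear with the corner.
closed_ela = eyelid_angle([1.0, 0.05, 0.0], [1.0, -0.05, 0.0], [0.0, 0.0, 0.0])
```

Because the angle is computed from 3D geometry rather than 2D pixel distances, it is less sensitive to camera viewpoint than ratio-based measures like EAR, which is the property the abstract emphasizes.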


a9ad5f2808f68eea468621a04c49efe1-AuthorFeedback.pdf

Neural Information Processing Systems

This is used to estimate the model capabilities over the range of wind speeds. In response to the reviewer's comments, we have significantly expanded our training and



Apple MacBook Pro (M5, 14-Inch) Review: More of the Same

WIRED

It's not as exciting as it once was, but the M5 MacBook Pro offers yet another breakthrough in performance. All products featured on WIRED are independently selected by our editors. However, when you buy something through our retail links, we may earn an affiliate commission. M5 is a beast, especially in graphics and AI. New GPU is surprisingly good at gaming. Battery life, speakers, build quality, keyboard, and trackpad are all still world-class.


U-Codec: Ultra Low Frame-rate Neural Speech Codec for Fast High-fidelity Speech Generation

Yang, Xusheng, Zhou, Long, Wang, Wenfu, Hu, Kai, Feng, Shulin, Li, Chenxing, Yu, Meng, Yu, Dong, Zou, Yuexian

arXiv.org Artificial Intelligence

We propose \textbf{U-Codec}, an \textbf{U}ltra low frame-rate neural speech \textbf{Codec} that achieves high-fidelity reconstruction and fast speech generation at an extremely low frame rate of 5Hz (5 frames per second). Since extreme compression at 5Hz typically leads to severe loss of intelligibility and spectral detail, we introduce a Transformer-based inter-frame long-term dependency module and systematically explore residual vector quantization (RVQ) depth and codebook size to identify optimal configurations. Moreover, we integrate U-Codec into a large language model (LLM)-based auto-regressive TTS model, which leverages a global and local hierarchical architecture to effectively capture dependencies across multi-layer tokens. We extend LLM-based TTS from 3-layer RVQ at 50Hz to 32-layer RVQ at 5Hz. Experimental results demonstrate that U-Codec improves LLM-based TTS inference speed by around 3$\times$ over high-frame-rate codecs while maintaining similarity and naturalness. These results validate the feasibility of using highly compressed 5Hz discrete tokens for fast and high-fidelity speech synthesis.
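Residual vector quantization, the mechanism whose depth and codebook size the abstract says are swept, can be sketched as follows; the random codebooks and dimensions here are purely illustrative, not the paper's trained ones:

```python
import numpy as np

def rvq_encode(x: np.ndarray, codebooks: list) -> tuple:
    """Residual vector quantization: each stage quantizes the residual
    left by the previous stages, so depth trades bitrate for fidelity."""
    codes = []
    recon = np.zeros_like(x)
    residual = x.copy()
    for cb in codebooks:
        idx = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        codes.append(idx)
        recon += cb[idx]          # accumulate the chosen codeword
        residual = x - recon      # next stage sees what is still unexplained
    return codes, recon

rng = np.random.default_rng(0)
x = rng.normal(size=8)                                   # one latent frame
codebooks = [rng.normal(size=(256, 8)) for _ in range(4)]  # 4 stages, 256 entries each
codes, recon = rvq_encode(x, codebooks)
```

At a 5Hz frame rate with 32 RVQ layers, each second of speech is represented by only 5 frames of 32 discrete indices each, which is what makes the auto-regressive LLM decoding so much faster than with 50Hz codecs.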
