AITopics

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Neural Information Processing Systems, Feb-18-2026, 16:45:32 GMT

A Benchmark Dataset for Event-Guided Human Pose Estimation and Tracking in Extreme Conditions

artificial intelligence, dataset, machine learning, (12 more...)

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.68)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision > Video Understanding (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Neural Information Processing Systems, Feb-15-2026, 22:13:27 GMT

Supplementary Material for "Hierarchical Adaptive Value Estimation for Multi-modal Visual Reinforcement Learning"

Section C describes the details of the experimental setup, including network architectures, hyperparameters, and hardware details. This outcome emphasizes the necessity of feature interaction or feature fusion to tackle intricate situations. Furthermore, an amalgamation of feature fusion and value fusion can offer better performance. This adjustment allows us to evaluate the robustness and adaptability of our approach in handling a larger number of vehicles in the environment. As we increase the number of vehicles on the road, Fig. A2 (a) clearly indicates that HAVE consistently delivers the highest performance. The training and testing curves of HAVE and other comparable methods are given in A4.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Industry: Transportation > Ground > Road (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Neural Information Processing Systems, Feb-7-2026, 10:26:22 GMT

0a73de68f10e15626eb98701ecf03adb-Paper.pdf

ab ba, adversarial example, motion blur, (11 more...)

Country:

Asia > Singapore (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū (0.04)
North America > United States (0.04)
(2 more...)

Genre: Workflow (0.48)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Neural Information Processing Systems, Feb-7-2026, 10:26:11 GMT

0a73de68f10e15626eb98701ecf03adb-AuthorFeedback.pdf

ab ba, blur, motion blur, (13 more...)

Technology: Information Technology > Artificial Intelligence (0.31)

Cheng, Shihan, Kulkarni, Nilesh, Hyde, David, Smirnov, Dmitriy

Less is More: Data-Efficient Adaptation for Controllable Text-to-Video Generation

arXiv.org Artificial Intelligence, Dec-12-2025

Fine-tuning large-scale text-to-video diffusion models to add new generative controls, such as those over physical camera parameters (e.g., shutter speed or aperture), typically requires vast, high-fidelity datasets that are difficult to acquire. In this work, we propose a data-efficient fine-tuning strategy that learns these controls from sparse, low-quality synthetic data. We show that not only does fine-tuning on such simple data enable the desired controls, it actually yields superior results to models fine-tuned on photorealistic "real" data. Beyond demonstrating these results, we provide a framework that justifies this phenomenon both intuitively and quantitatively.

large language model, machine learning, natural language, (18 more...)

2511.17844

Genre: Research Report > New Finding (0.93)

Industry: Media > Photography (0.90)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Youk, Geunhyuk, Oh, Jihyong, Kim, Munchurl

FMA-Net++: Motion- and Exposure-Aware Real-World Joint Video Super-Resolution and Deblurring

arXiv.org Artificial Intelligence, Dec-5-2025

Real-world video restoration is plagued by complex degradations from motion coupled with dynamically varying exposure - a key challenge largely overlooked by prior works and a common artifact of auto-exposure or low-light capture. We present FMA-Net++, a framework for joint video super-resolution and deblurring that explicitly models this coupled effect of motion and dynamically varying exposure. FMA-Net++ adopts a sequence-level architecture built from Hierarchical Refinement with Bidirectional Propagation blocks, enabling parallel, long-range temporal modeling. Within each block, an Exposure Time-aware Modulation layer conditions features on per-frame exposure, which in turn drives an exposure-aware Flow-Guided Dynamic Filtering module to infer motion- and exposure-aware degradation kernels. FMA-Net++ decouples degradation learning from restoration: the former predicts exposure- and motion-aware priors to guide the latter, improving both accuracy and efficiency. To evaluate under realistic capture conditions, we introduce REDS-ME (multi-exposure) and REDS-RE (random-exposure) benchmarks. Trained solely on synthetic data, FMA-Net++ achieves state-of-the-art accuracy and temporal consistency on our new benchmarks and GoPro, outperforming recent methods in both restoration quality and inference speed, and generalizes well to challenging real-world videos.

artificial intelligence, computer vision, machine learning, (16 more...)

2512.0439

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.97)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Nov-19-2025

DINO-Detect: A Simple yet Effective Framework for Blur-Robust AI-Generated Image Detection

Shen, Jialiang, Zheng, Jiyang, Xue, Yunqi, Chen, Huajie, Yao, Yu, Kang, Hui, Liu, Ruiqi, Gong, Helin, Yang, Yang, Wang, Dadong, Liu, Tongliang

With growing concerns over image authenticity and digital safety, the field of AI-generated image (AIGI) detection has progressed rapidly. Yet, most AIGI detectors still struggle under real-world degradations, particularly motion blur, which frequently occurs in handheld photography, fast motion, and compressed video. Such blur distorts fine textures and suppresses high-frequency artifacts, causing severe performance drops in real-world settings. We address this limitation with a blur-robust AIGI detection framework based on teacher-student knowledge distillation. A high-capacity teacher (DINOv3), trained on clean (i.e., sharp) images, provides stable and semantically rich representations that serve as a reference for learning. By freezing the teacher to maintain its generalization ability, we distill its feature and logit responses from sharp images to a student trained on blurred counterparts, enabling the student to produce consistent representations under motion degradation. Extensive experiments on benchmarks show that our method achieves state-of-the-art performance under both motion-blurred and clean conditions, demonstrating improved generalization and real-world applicability. Source code will be released on the project page.
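As a rough illustration of distilling both feature and logit responses from a frozen teacher to a student, the sketch below combines a feature-matching term with a temperature-softened KL term. It is not the authors' implementation: the abstract does not state which losses are used, so the choice of mean squared error and KL divergence, the function names, the temperature, and the two weights are all assumptions made for this example.

```python
import numpy as np


def softmax(logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(
    teacher_feat: np.ndarray,    # teacher features computed on sharp images
    student_feat: np.ndarray,    # student features computed on blurred images
    teacher_logits: np.ndarray,
    student_logits: np.ndarray,
    temperature: float = 2.0,    # hypothetical value
    feat_weight: float = 1.0,    # hypothetical value
    logit_weight: float = 1.0,   # hypothetical value
) -> float:
    """Weighted sum of a feature MSE term and a logit KL term (assumed form)."""
    # Feature term: pull student features toward the frozen teacher's.
    feat_loss = np.mean((teacher_feat - student_feat) ** 2)

    # Logit term: KL(teacher || student) on softened distributions,
    # averaged over the batch.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    eps = 1e-12
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1).mean()

    return float(feat_weight * feat_loss + logit_weight * kl)
```

In this form the teacher's outputs are plain arrays, which mirrors the frozen-teacher setup: only the student's features and logits would receive gradients in a real training loop.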

detection, machine learning, natural language, (14 more...)

2511.12511

Country: Asia (0.46)

Genre:

Research Report (0.50)
Overview (0.48)

Industry:

Education (1.00)
Media > Photography (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Rashid, Umar, Arshad, Muhammad Arslan, Ahmad, Ghulam, Anjum, Muhammad Zeeshan, Khan, Rizwan, Akmal, Muhammad

Hybrid CNN-ViT Framework for Motion-Blurred Scene Text Restoration

arXiv.org Artificial Intelligence, Nov-11-2025

Motion blur in scene text images severely impairs readability and hinders the reliability of computer vision tasks, including autonomous driving, document digitization, and visual information retrieval. Conventional deblurring approaches are often inadequate in handling spatially varying blur and typically fall short in modeling the long-range dependencies necessary for restoring textual clarity. To overcome these limitations, we introduce a hybrid deep learning framework that combines convolutional neural networks (CNNs) with vision transformers (ViTs), thereby leveraging both local feature extraction and global contextual reasoning. The architecture employs a CNN-based encoder-decoder to preserve structural details, while a transformer module enhances global awareness through self-attention. Training is conducted on a curated dataset derived from TextOCR, where sharp scene-text samples are paired with synthetically blurred versions generated using realistic motion-blur kernels of multiple sizes and orientations. Model optimization is guided by a composite loss that incorporates mean absolute error (MAE), mean squared error (MSE), perceptual similarity, and structural similarity (SSIM). Quantitative evaluations show that the proposed method attains 32.20 dB in PSNR and 0.934 in SSIM, while remaining lightweight with 2.83 million parameters and an average inference time of 61 ms. These results highlight the effectiveness and computational efficiency of the CNN-ViT hybrid design, establishing its practicality for real-world motion-blurred scene-text restoration.
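To make the data-generation and evaluation steps concrete, here is a minimal sketch of pairing a sharp image with a synthetically blurred copy and scoring a result with PSNR. The abstract does not describe how its kernels are built, so the straight-line kernel below, the edge padding, and every function name are assumptions for illustration only; PSNR is computed with its standard definition, 10 log10(peak^2 / MSE).

```python
import numpy as np


def linear_motion_kernel(size: int, angle_deg: float) -> np.ndarray:
    """Kernel with a straight line through the centre, normalised to sum to 1."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    kernel = np.zeros((size, size), dtype=np.float64)
    centre = size // 2
    theta = np.deg2rad(angle_deg)
    # Walk along the line and switch on the nearest pixel at each step.
    for t in np.linspace(-centre, centre, 4 * size):
        col = int(round(centre + t * np.cos(theta)))
        row = int(round(centre - t * np.sin(theta)))
        kernel[row, col] = 1.0
    return kernel / kernel.sum()


def blur(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Apply the kernel to a 2-D image by correlation with edge padding."""
    k = kernel.shape[0]
    padded = np.pad(image, k // 2, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(k):
        for j in range(k):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out


def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sharp = rng.random((32, 32))
    # One (sharp, blurred) pair per kernel size and orientation.
    for size in (5, 9):
        for angle in (0.0, 45.0, 90.0):
            blurred = blur(sharp, linear_motion_kernel(size, angle))
            print(size, angle, round(psnr(sharp, blurred), 2))
```

Varying `size` and `angle_deg` gives the "multiple sizes and orientations" the abstract mentions; a spatially varying or curved blur would need a different kernel model.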

artificial intelligence, machine learning, natural language, (17 more...)