Collaborating Authors

 Chen, Linwei


Frequency Dynamic Convolution for Dense Image Prediction

arXiv.org Artificial Intelligence

While Dynamic Convolution (DY-Conv) has shown promising performance by enabling adaptive weight selection through multiple parallel weights combined with an attention mechanism, the frequency responses of these weights tend to exhibit high similarity, resulting in high parameter costs but limited adaptability. In this work, we introduce Frequency Dynamic Convolution (FDConv), a novel approach that mitigates these limitations by learning a fixed parameter budget in the Fourier domain. FDConv divides this budget into frequency-based groups with disjoint Fourier indices, enabling the construction of frequency-diverse weights without increasing the parameter cost. To further enhance adaptability, we propose Kernel Spatial Modulation (KSM) and Frequency Band Modulation (FBM). KSM dynamically adjusts the frequency response of each filter at the spatial level, while FBM decomposes weights into distinct frequency bands in the frequency domain and modulates them dynamically based on local content. Extensive experiments on object detection, segmentation, and classification validate the effectiveness of FDConv. We demonstrate that when applied to ResNet-50, FDConv achieves superior performance with a modest increase of +3.6M parameters, outperforming previous methods that require substantial increases in parameter budgets (e.g., CondConv +90M, KW +76.5M). Moreover, FDConv seamlessly integrates into a variety of architectures, including ConvNeXt and Swin Transformer, offering a flexible and efficient solution for modern vision tasks. The code is made publicly available at https://github.com/Linwei-Chen/FDConv.
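The Fourier-domain grouping can be illustrated with a small sketch. This is a hypothetical simplification, not the paper's implementation: the function name, the 3x3 kernel size, and the modulo-based index partition are all assumptions. It only shows the core idea that disjoint Fourier indices yield frequency-diverse spatial kernels from one shared coefficient budget.

```python
import numpy as np

def frequency_diverse_kernels(coeffs, n_groups):
    """Hypothetical sketch of FDConv's grouping idea: split one set of
    learned Fourier coefficients into groups with disjoint frequency
    indices, then inverse-FFT each group to obtain spatial kernels
    with distinct frequency responses at no extra parameter cost."""
    k = coeffs.shape[0]
    idx = np.arange(k * k).reshape(k, k)
    kernels = []
    for g in range(n_groups):
        # Disjoint index partition (assumed: simple modulo assignment).
        mask = (idx % n_groups) == g
        grouped = np.where(mask, coeffs, 0)
        kernels.append(np.real(np.fft.ifft2(grouped)))
    return kernels

rng = np.random.default_rng(0)
# One shared 3x3 complex coefficient budget (stand-in for learned weights).
coeffs = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
kernels = frequency_diverse_kernels(coeffs, n_groups=3)
```

Because the index groups partition the budget, the groups together use exactly the original number of coefficients, and by linearity of the inverse FFT the kernels sum back to the transform of the full budget.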


Instance Segmentation in the Dark

arXiv.org Artificial Intelligence

Abstract Existing instance segmentation techniques are primarily tailored for high-visibility inputs, but their performance significantly deteriorates in extremely low-light environments. In this work, we introduce several techniques that substantially boost low-light inference accuracy. To suppress "feature noise", we propose a novel learning method that relies on an adaptive weighted downsampling layer, a smooth-oriented convolutional block, and disturbance suppression learning. These components are model-agnostic and lightweight, or even cost-free. The adaptive weighted downsampling layer aggregates local features adaptively and suppresses the high-frequency disturbance caused by noise while keeping the details in deep features; the smooth-oriented convolutional block enhances ordinary convolutional layers by adding a smooth-oriented convolution branch. Together they substantially improve the capability of models to learn noise-resistant features and thus boost low-light segmentation accuracy appreciably. Furthermore, we discover that high-bit-depth RAW images can better preserve richer scene information in low-light conditions, where details are easily "buried" by severe noise caused by the limited photon count, compared to typical camera sRGB outputs, thus supporting the use of RAW-input algorithms. Moreover, we notice that high bit depth can be critical for low-light instance segmentation. To mitigate the scarcity of annotated RAW datasets, and to facilitate further research in this direction, we capture a real-world low-light instance segmentation dataset comprising over two thousand paired low/normal-light images with instance-level pixel-wise annotations. Our method achieves superior performance in very low light (4% AP higher than state-of-the-art competitors), meanwhile opening new opportunities for future research. Our code and dataset are publicly available.
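The smooth-oriented convolution branch can be sketched as follows. This is a hypothetical illustration, not the paper's actual block: the naive `conv2d` helper, the 3x3 box filter used for smoothing, and the way the two branches are summed are all assumptions. It only shows the general pattern of augmenting an ordinary convolution with a smoothed branch that damps high-frequency noise in features.

```python
import numpy as np

def conv2d(x, w):
    """Naive 'valid' 2-D cross-correlation, for illustration only."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def smooth_oriented_conv(x, weight):
    """Hypothetical sketch of a smooth-oriented convolutional block:
    an ordinary conv branch plus a branch whose kernel is first
    smoothed (assumed: a fixed 3x3 box filter), damping the
    high-frequency disturbance that low-light noise injects."""
    box = np.full((3, 3), 1.0 / 9.0)          # fixed smoothing kernel
    # Smooth the learned kernel itself; edge padding keeps its size.
    padded = np.pad(weight, 1, mode="edge")
    smooth_w = conv2d(padded, box)
    return conv2d(x, weight) + conv2d(x, smooth_w)

feat = np.ones((8, 8))                        # toy feature map
kernel = np.ones((3, 3))                      # toy learned kernel
out = smooth_oriented_conv(feat, kernel)
```

A real implementation would run per-channel inside a network; here the smoothing branch acts as a learned-kernel low-pass companion, which is one plausible reading of "adding a smooth-oriented convolution branch" to an ordinary layer.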