Supplementary Material: Cross Aggregation Transformer for Image Restoration
We provide two variant models for image SR, called CAT-R-2 and CAT-A-2. The MLP expansion ratio is set to 4. We use the self-ensemble strategy and mark the corresponding models with "+". The results are shown in Table 1. FLOPs are calculated for an output size of 3×512×512. CAT-R-2 achieves a 0.22 dB gain on Urban100. All these results further indicate the effectiveness of our method.
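For context, the "+" self-ensemble strategy mentioned above typically averages the model's predictions over the eight flip/rotation variants of the input at test time. Below is a minimal sketch of that idea, assuming a PyTorch super-resolution model; the `self_ensemble` helper and the x8 transform set are illustrative, not code from the paper.

```python
import torch

def self_ensemble(model, lr_image):
    """Average SR predictions over the 8 flip/rotation variants (x8 self-ensemble).

    lr_image: low-resolution input tensor of shape (1, C, H, W).
    The model is assumed to map (1, C, H, W) -> (1, C, sH, sW).
    """
    outputs = []
    for k in range(4):                      # 0, 90, 180, 270 degree rotations
        for flip in (False, True):          # with and without a horizontal flip
            x = torch.rot90(lr_image, k, dims=(-2, -1))
            if flip:
                x = torch.flip(x, dims=(-1,))
            with torch.no_grad():
                y = model(x)
            # Undo the transform on the output before averaging.
            if flip:
                y = torch.flip(y, dims=(-1,))
            y = torch.rot90(y, -k, dims=(-2, -1))
            outputs.append(y)
    return torch.stack(outputs).mean(dim=0)
```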
Scaling Laws For Deep Learning Based Image Reconstruction
Deep neural networks trained end-to-end to map a measurement of a (noisy) image to a clean image perform excellently for a variety of linear inverse problems. Current methods are only trained on a few hundred or a few thousand images, as opposed to the millions of examples deep networks are trained on in other domains. In this work, we study whether major performance gains are expected from scaling up the training set size. We consider image denoising, accelerated magnetic resonance imaging, and super-resolution, and empirically determine the reconstruction quality as a function of training set size while simultaneously scaling the network size. For all three tasks we find that an initially steep power-law scaling slows significantly already at moderate training set sizes. Extrapolating those scaling laws suggests that even training on millions of images would not significantly improve performance. To understand the expected behavior, we analytically characterize the performance of a linear estimator learned with early-stopped gradient descent. The result formalizes the intuition that once the error induced by learning the signal model is small relative to the error floor, more training examples do not improve performance.
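As a rough illustration of the extrapolation described above, one can fit a saturating power law E(n) ≈ a·n^(−b) + c to reconstruction errors measured at several training-set sizes and read off the predicted error at much larger n. The numbers below are made-up placeholders, not results from the paper; the fitting routine is a generic sketch rather than the authors' analysis code.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical (training set size, reconstruction error) pairs -- placeholders only.
n = np.array([100, 300, 1_000, 3_000, 10_000, 30_000], dtype=float)
err = np.array([0.120, 0.095, 0.080, 0.072, 0.068, 0.066])

def power_law(n, a, b, c):
    """Saturating power law: error = a * n**(-b) + c, with error floor c."""
    return a * n ** (-b) + c

params, _ = curve_fit(power_law, n, err, p0=(1.0, 0.5, 0.05), maxfev=10_000)
a, b, c = params
print(f"fit: a={a:.3f}, b={b:.3f}, error floor c={c:.4f}")

# Extrapolate to a million training images.
print(f"predicted error at n=1e6: {power_law(1e6, *params):.4f}")
```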
Masked Autoencoders for Low-Dose CT Denoising
Dayang Wang, Yongshun Xu, Shuo Han, Hengyong Yu
Low-dose computed tomography (LDCT) reduces the X-ray radiation dose but compromises image quality with more noise and artifacts. A plethora of transformer models have been developed recently to improve LDCT image quality. However, the success of a transformer model relies on a large amount of paired noisy and clean data, which is often unavailable in clinical applications. In the computer vision and natural language processing fields, masked autoencoders (MAE) have been proposed as an effective label-free self-pretraining method for transformers, due to their excellent feature representation ability. Here, we redesign the classical encoder-decoder learning model to match the denoising task and apply it to the LDCT denoising problem. The MAE can leverage unlabeled data and facilitate structural preservation for the LDCT denoising model when ground-truth data are missing. Experiments on the Mayo dataset validate that the MAE can boost the transformer's denoising performance and relieve the dependence on ground-truth data.
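To make the masking idea concrete, the sketch below shows how an MAE-style pretraining step might hide a random subset of image patches and train an encoder-decoder to reconstruct the full image from the masked input. The patch size, the 60% mask ratio, and the stand-in convolutional autoencoder are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

def random_patch_mask(images, patch=16, mask_ratio=0.6):
    """Zero out a random subset of non-overlapping patches in each image.

    images: (B, C, H, W) with H and W divisible by `patch`.
    Returns the masked images and a boolean patch-level mask of shape
    (B, H//patch, W//patch), where True marks a hidden patch.
    """
    b, c, h, w = images.shape
    gh, gw = h // patch, w // patch
    mask = torch.rand(b, gh, gw, device=images.device) < mask_ratio
    # Upsample the patch-level mask to pixel resolution and zero masked pixels.
    pixel_mask = mask.repeat_interleave(patch, dim=1).repeat_interleave(patch, dim=2)
    masked = images * (~pixel_mask).unsqueeze(1)
    return masked, mask

# Illustrative self-pretraining step with a stand-in autoencoder (not the paper's model).
autoencoder = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-4)

ldct = torch.rand(4, 1, 64, 64)             # unlabeled low-dose CT patches
masked, _ = random_patch_mask(ldct)
recon = autoencoder(masked)
loss = nn.functional.mse_loss(recon, ldct)  # reconstruct the full image from the masked one
loss.backward()
opt.step()
```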
This AI makes blurry faces look 8 times sharper! SwinIR: Photo Upsampling
Originally published on Towards AI, the world's leading AI and technology news and media company.