Collaborating Authors

A Comparison of Super-Resolution and Nearest Neighbors Interpolation Applied to Object Detection on Satellite Data Machine Learning

As Super-Resolution (SR) has matured as a research topic, it has been applied to additional tasks beyond image reconstruction. In particular, combining classification or object detection with a super-resolution preprocessing stage has yielded improvements in accuracy, especially for objects that are small relative to the scene. While SR has shown promise, no study has compared SR against naive upscaling methods such as Nearest Neighbors (NN) interpolation when applied as a preprocessing step for object detection. We perform this comparison on satellite data, evaluating the Multi-scale Deep Super-Resolution (MDSR) system against NN on the xView challenge dataset. To do so, we propose a pipeline for processing satellite data that combines multi-stage image tiling and upscaling, the YOLOv2 object detection architecture, and label stitching. We compare the effects of training models with an upscaling factor of 4, which takes images from 30cm Ground Sample Distance (GSD) to an effective GSD of 7.5cm. Upscaling by this factor significantly improves detection results, increasing Average Precision (AP) of a generalized vehicle class by 23 percent. We demonstrate that while SR produces upscaled images that are more visually pleasing than their NN counterparts, object detection networks see little difference in accuracy: images upsampled with NN obtain nearly identical results to the MDSRx4-enhanced images, with a difference of only 0.0002 AP between the two methods.
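The abstract's naive baseline and its GSD arithmetic can be illustrated with a minimal sketch. This is not the paper's pipeline; `nn_upscale` is a hypothetical helper showing only what nearest-neighbors upscaling by a factor of 4 does to a tile and to the effective ground sample distance.

```python
import numpy as np

def nn_upscale(img: np.ndarray, factor: int = 4) -> np.ndarray:
    """Nearest-neighbors upscaling: repeat each pixel `factor` times
    along both spatial axes (hypothetical helper, not from the paper)."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

# A 2x2 tile at 30cm GSD, upscaled by 4, becomes an 8x8 tile whose
# effective GSD is 30 / 4 = 7.5cm, matching the abstract's numbers.
tile = np.array([[10, 20],
                 [30, 40]], dtype=np.uint8)
upscaled = nn_upscale(tile, factor=4)
print(upscaled.shape)   # (8, 8)
print(30.0 / 4)         # 7.5 (effective GSD in cm)
```

Each source pixel simply becomes a 4x4 block of identical values, which is why NN-upscaled images look blocky compared to SR output even when detection accuracy is nearly the same.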

Google's New AI Photo Upscaling Tech is Jaw-Dropping


Photo enhancing in movies and TV shows is often ridiculed for being unbelievable, but research in real photo enhancing is actually creeping more and more into the realm of science fiction. Just take a look at Google's latest AI photo upscaling tech. In a post titled "High Fidelity Image Generation Using Diffusion Models" published on the Google AI Blog (and spotted by DPR), Google researchers in the company's Brain Team share new breakthroughs they've made in image super-resolution. In image super-resolution, a machine learning model is trained to turn a low-res photo into a detailed high-res photo, and potential applications range from restoring old family photos to improving medical imaging. Google has been exploring a concept called "diffusion models," which was first proposed in 2015 but which has, until recently, taken a backseat to a family of deep learning methods called "deep generative models."

The Illustrated Self-Supervised Learning


Yann LeCun, in his talk, introduced the "cake analogy" to illustrate the importance of self-supervised learning. Though the analogy is debated (see Deep Learning for Robotics, slide 96, Pieter Abbeel), we have seen the impact of self-supervised learning in the Natural Language Processing field, where recent developments (Word2Vec, GloVe, ELMo, BERT) have embraced self-supervision and achieved state-of-the-art results. "If intelligence is a cake, the bulk of the cake is self-supervised learning, the icing on the cake is supervised learning, and the cherry on the cake is reinforcement learning (RL)." Curious to know how self-supervised learning has been applied in the computer vision field, I read up on the existing literature on self-supervised learning applied to computer vision through a recent survey paper by Jing et al. This post is my attempt to provide an intuitive visual summary of the patterns of problem formulation in self-supervised learning.

Super-resolution using Deep Learning methods: A Survey


Image super-resolution refers to the process of increasing the resolution of digital images. While super-resolution can be achieved by passing multiple low-resolution images (reference images) to an algorithm, this article mainly focuses on single-image supervised super-resolution (SR) techniques. While this problem has been tackled using analytical methods in the Computer Vision community [1][2], recent literature shows an upsurge in the usage of deep learning techniques to perform super-resolution [3][4]. In the field of medical imaging, image resolution is often limited by constraints on acquisition time, radiation level, and hardware costs. Hence, super-resolution techniques come to the rescue to achieve desirable perceptual quality in images acquired under such constrained environments.

Google creates tech that lets you enhance zoomed-in images


The system was developed by researchers working on Google Brain. It's based on a pixel recursive super-resolution model that allows pixelated, low-resolution images to be dynamically enhanced. It reduces blur, fills in details, and eventually pieces together a high-resolution copy. Google Brain uses two neural networks to create the output images. Working with an 8x8 pixel input image, it attempts to match the low-resolution source with an existing high-resolution image.