zoom level
Visual place recognition for aerial imagery: A survey
Moskalenko, Ivan, Kornilova, Anastasiia, Ferrer, Gonzalo
Aerial imagery and its direct application to visual localization are central to many Robotics and Computer Vision tasks. While Global Navigation Satellite Systems (GNSS) are the default solution to the aerial localization problem, they are subject to a number of limitations, such as signal instability and unreliable solutions, that make this option less desirable. Consequently, visual geolocalization is emerging as a viable alternative. However, adapting the Visual Place Recognition (VPR) task to aerial imagery presents significant challenges, including weather variations and repetitive patterns. Current VPR reviews largely neglect the specific context of aerial data. This paper introduces a methodology tailored for evaluating VPR techniques specifically in the domain of aerial imagery, providing a comprehensive assessment of various methods and their performance. Beyond comparing VPR methods, we also demonstrate the importance of selecting appropriate zoom and overlap levels when constructing map tiles to achieve maximum efficiency of VPR algorithms on aerial imagery. The code is available on our GitHub repository: https://github.com/prime-slam/aero-vloc.
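To make the notions of zoom level and tile overlap concrete, here is a minimal sketch. It assumes the common Web Mercator XYZ ("slippy map") tiling scheme, which the abstract does not name, and the function names `latlon_to_tile` and `tile_origins` are hypothetical. It is not taken from the aero-vloc repository.

```python
import math


def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Return (x, y) indices of the Web Mercator XYZ tile containing a point.

    Assumption: standard slippy-map tiling, with 2**zoom tiles per axis.
    """
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still map to a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_origins(extent_px, tile_px, overlap):
    """Return pixel offsets of tiles covering one axis of a map.

    `overlap` is the fraction of a tile shared with its neighbour (0 <= overlap < 1).
    A final tile is added flush with the far edge if the stride leaves a gap.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    stride = max(1, int(round(tile_px * (1.0 - overlap))))
    last = max(extent_px - tile_px, 0)
    origins = list(range(0, last + 1, stride))
    if origins[-1] != last:
        origins.append(last)
    return origins
```

Each increment of the zoom level doubles the number of tiles along each axis. A larger overlap shrinks the stride, so the same area is covered by more map tiles.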
Mitigating annotation shift in cancer classification using single image generative models
Arcas, Marta Buetas, Osuala, Richard, Lekadir, Karim, Díaz, Oliver
Artificial Intelligence (AI) has emerged as a valuable tool for assisting radiologists in breast cancer detection and diagnosis. However, the success of AI applications in this domain is restricted by the quantity and quality of available data, posing challenges due to limited and costly data annotation procedures that often lead to annotation shifts. This study simulates, analyses and mitigates annotation shifts in cancer classification in the breast mammography domain. First, a high-accuracy cancer risk prediction model is developed, which effectively distinguishes benign from malignant lesions. Next, model performance is used to quantify the impact of annotation shift. We uncover a substantial impact of annotation shift on multiclass classification performance, particularly for malignant lesions. We thus propose a training data augmentation approach based on single-image generative models for the affected class, requiring as few as four in-domain annotations to considerably mitigate annotation shift, while also addressing dataset imbalance. Lastly, we further increase performance by proposing and validating an ensemble architecture based on multiple models trained under different data augmentation regimes. Our study offers key insights into annotation shift in deep learning breast cancer classification and explores the potential of single-image generative models to overcome domain shift challenges.
Generative Powers of Ten
Wang, Xiaojuan, Kontkanen, Janne, Curless, Brian, Seitz, Steve, Kemelmacher, Ira, Mildenhall, Ben, Srinivasan, Pratul, Verbin, Dor, Holynski, Aleksander
We present a method that uses a text-to-image model to generate consistent content across multiple image scales, enabling extreme semantic zooms into a scene, e.g., ranging from a wide-angle landscape view of a forest to a macro shot of an insect sitting on one of the tree branches. We achieve this through a joint multi-scale diffusion sampling approach that encourages consistency across different scales while preserving the integrity of each individual sampling process. Since each generated scale is guided by a different text prompt, our method enables deeper levels of zoom than traditional super-resolution methods that may struggle to create new contextual structure at vastly different scales. We compare our method qualitatively with alternative techniques in image super-resolution and outpainting, and show that our method is most effective at generating consistent multi-scale content.
How I built Supervised Skin Lesion Segmentation on HAM10000 Dataset – Towards AI
Originally published on Towards AI. Skin cancer is one of the most common types of cancer in the world. Its early diagnosis is pivotal for eliminating malignant tumors from the human body. There is a lot of ongoing research on skin cancer detection, localization, and classification. Segmentation is an essential step in the localization of skin cancer.
The many layers of data lineage. What can we learn from google maps to…
Having a map showing how data evolves from its sources to its destination is the dream of any organisation. Like the gold rush, everyone is after that tool connecting together columns, tables and dashboards within the warehouse. But like gold, this visualisation has been always considered a privilege in the data ecosystem. Defining the lineage has been a manual task not accessible to everyone. Usually, only the ones working daily with the data transformation processes are aware of the actual flow of data -- and typically this lineage is a mix between what's in their minds, documented information and digging into different tools' metadata.
GANmapper: geographical content filling
Wu, Abraham Noah, Biljecki, Filip
We present a new method to create spatial data using a generative adversarial network (GAN). Our contribution uses coarse and widely available geospatial data to create maps of less available features at the finer scale in the built environment, bypassing their traditional acquisition techniques (e.g. satellite imagery or land surveying). In the work, we employ land use data and road networks as input to generate building footprints, and conduct experiments in 9 cities around the world. The method, which we implement in a tool we release openly, enables generating approximate maps of the urban form, and it is generalisable to augment other types of geoinformation, enhancing the completeness and quality of spatial data infrastructure. It may be especially useful in locations missing detailed and high-resolution data and those that are mapped with uncertain or heterogeneous quality, such as much of OpenStreetMap. The quality of the results is influenced by the urban form and scale. In most cases, experiments suggest promising performance as the method tends to truthfully indicate the locations, amount, and shape of buildings. The work has the potential to support several applications, such as energy, climate, and urban morphology studies in areas previously lacking required data.
Small and large scale critical infrastructures detection based on deep learning using high resolution orthogonal images
Pérez-Hernández, Francisco, Rodríguez-Ortega, José, Benhammou, Yassir, Herrera, Francisco, Tabik, Siham
The detection of critical infrastructures is of high importance in several fields such as security, anomaly detection, land use planning and land use change detection. However, detecting critical infrastructures in aerial and satellite images is still a challenge, as each type has a completely different size and requires a different spatial resolution to be identified correctly. To date, there are no dedicated datasets for training critical infrastructure detectors. This paper presents a smart dataset as well as a resolution-independent critical infrastructure detection system. In particular, guided by the performance of the detection model, we built a dataset organized into two scales, small and large, and designed a two-stage deep learning detection of different scale critical infrastructures (DetDSCI) methodology in ortho-images. The DetDSCI methodology first determines the input image zoom level using a classification model, then analyses the input image with the detection model for the appropriate scale. Our experiments show that the DetDSCI methodology achieves up to 37.53% F1 improvement with respect to the baseline detector.
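The two-stage idea described in this abstract (classify the zoom level first, then dispatch to a scale-specific detector) can be sketched as follows. This is only an illustration of the control flow. The function `two_stage_detect`, the scale labels, and the stub models are all hypothetical and are not the authors' implementation.

```python
from typing import Any, Callable, Dict, List, Tuple


def two_stage_detect(
    image: Any,
    classify_scale: Callable[[Any], str],
    detectors: Dict[str, Callable[[Any], List[Any]]],
) -> Tuple[str, List[Any]]:
    """Stage 1: predict the scale label; stage 2: run the matching detector."""
    scale = classify_scale(image)
    if scale not in detectors:
        raise KeyError(f"no detector registered for scale {scale!r}")
    return scale, detectors[scale](image)


# Stub models standing in for trained networks (assumptions for illustration).
def stub_classifier(image: dict) -> str:
    # Pretend that high zoom levels correspond to small infrastructures.
    return "small" if image["zoom"] >= 18 else "large"


stub_detectors = {
    "small": lambda image: ["small-scale detection"],
    "large": lambda image: ["large-scale detection"],
}

result_small = two_stage_detect({"zoom": 19}, stub_classifier, stub_detectors)
result_large = two_stage_detect({"zoom": 14}, stub_classifier, stub_detectors)
```

Keeping the detectors in a dictionary keyed by scale label means a further scale can be added without changing the dispatch logic.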
House Price Prediction using Satellite Imagery
Semnani, Sina Jandaghi, Rezaei, Hoormazd
In this paper we show how using satellite images can improve the accuracy of housing price estimation models. Using Los Angeles County's property assessment dataset, by transferring learning from an Inception-v3 model pretrained on ImageNet, we could achieve an improvement of ~10% in R-squared score compared to two baseline models that only use non-image features of the house.
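For reference, the R-squared score mentioned above is the coefficient of determination, 1 - SS_res / SS_tot. The sketch below computes it from scratch. The function name `r_squared` is illustrative, and nothing here reproduces the paper's data or models.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: the error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: the error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

A model that always predicts the mean of the targets scores 0 and a perfect model scores 1, so a higher score means the model explains more of the variance in the targets.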