Schmitt, Michael
Distribution Shifts at Scale: Out-of-distribution Detection in Earth Observation
Ekim, Burak, Tadesse, Girmaw Abebe, Robinson, Caleb, Hacheme, Gilles, Schmitt, Michael, Dodhia, Rahul, Ferres, Juan M. Lavista
Training robust deep learning models is critical in Earth Observation, where globally deployed models often face distribution shifts that degrade performance, especially in low-data regions. Out-of-distribution (OOD) detection addresses this challenge by identifying inputs that differ from in-distribution (ID) data. However, existing methods either assume access to OOD data or compromise primary task performance, making them unsuitable for real-world deployment. We propose TARDIS, a post-hoc OOD detection method for scalable geospatial deployments. The core novelty lies in generating surrogate labels by integrating information from ID data and unknown distributions, enabling OOD detection at scale. Our method takes a pre-trained model, ID data, and WILD samples, disentangling the latter into surrogate ID and surrogate OOD labels based on internal activations, and fits a binary classifier as an OOD detector. We validate TARDIS on EuroSAT and xBD datasets, across 17 experimental setups covering covariate and semantic shifts, showing that it performs close to the theoretical upper bound in assigning surrogate ID and OOD samples in 13 cases. To demonstrate scalability, we deploy TARDIS on the Fields of the World dataset, offering actionable insights into pre-trained model behavior for large-scale deployments. The code is publicly available at https://github.com/microsoft/geospatial-ood-detection.
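The pipeline described in the abstract above (surrogate labels for WILD samples derived from internal activations, followed by a binary detector) can be sketched as follows. This is a minimal illustration under stated assumptions, not the TARDIS implementation: the abstract does not say how WILD samples are disentangled or which classifier is fitted, so the distance-to-centroid threshold, the nearest-centroid detector, all function names, and the toy activation vectors are hypothetical stand-ins.

```python
import math


def centroid(rows):
    """Component-wise mean of a list of equally long vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def surrogate_labels(id_acts, wild_acts, quantile=0.95):
    """Label WILD activations: 0 = surrogate ID, 1 = surrogate OOD.

    Assumption (not from the abstract): a WILD sample counts as surrogate
    OOD when it lies farther from the ID centroid than the chosen quantile
    of the ID samples' own distances.
    """
    c = centroid(id_acts)
    id_dists = sorted(math.dist(a, c) for a in id_acts)
    idx = min(len(id_dists) - 1, int(quantile * len(id_dists)))
    threshold = id_dists[idx]
    return [1 if math.dist(a, c) > threshold else 0 for a in wild_acts]


def fit_detector(id_acts, wild_acts, labels):
    """Fit a nearest-centroid binary classifier (a stand-in detector).

    Returns a function that maps an activation vector to True (OOD)
    or False (ID).
    """
    in_rows = list(id_acts) + [a for a, l in zip(wild_acts, labels) if l == 0]
    out_rows = [a for a, l in zip(wild_acts, labels) if l == 1]
    if not out_rows:
        raise ValueError("no surrogate OOD samples to fit on")
    c_in, c_out = centroid(in_rows), centroid(out_rows)
    return lambda a: math.dist(a, c_out) < math.dist(a, c_in)


# Toy activations standing in for a pre-trained model's internal features.
id_acts = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [0.0, -0.1]]
wild_acts = [[0.05, 0.05], [5.0, 5.0], [5.2, 4.9]]

labels = surrogate_labels(id_acts, wild_acts)
detector = fit_detector(id_acts, wild_acts, labels)
```

In this sketch the pre-trained model is never retrained: only its activations are consumed, which is what makes the approach post-hoc.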
There Are No Data Like More Data: Datasets for Deep Learning in Earth Observation
Schmitt, Michael, Ahmadi, Seyed Ali, Xu, Yonghao, Taskin, Gulsen, Verma, Ujjwal, Sica, Francescopaolo, Hansch, Ronny
Carefully curated and annotated datasets are the foundation of machine learning, with particularly data-hungry deep neural networks forming the core of what is often called Artificial Intelligence (AI). Due to the massive success of deep learning applied to Earth Observation (EO) problems, the focus of the community has been largely on the development of ever-more sophisticated deep neural network architectures and training strategies, largely ignoring the overall importance of datasets. For that purpose, numerous task-specific datasets have been created that were largely ignored by previously published review articles on AI for Earth observation. With this article, we want to change the perspective and put machine learning datasets dedicated to Earth observation data and applications into the spotlight. Based on a review of the historical developments, currently available resources are described and a perspective for future developments is formed. We hope to contribute to an understanding that the nature of our data is what distinguishes the Earth observation community from many other communities that apply deep learning techniques to image data, and that a detailed understanding of EO data peculiarities is among the core competencies of our discipline.
Explaining Multimodal Data Fusion: Occlusion Analysis for Wilderness Mapping
Ekim, Burak, Schmitt, Michael
Jointly harnessing complementary features of multimodal input data in a common latent space has long been known to be beneficial. However, the influence of each modality on the model's decision remains a puzzle. This study proposes a deep learning framework for the modality-level interpretation of multimodal Earth observation data in an end-to-end fashion. Leveraging an explainable machine learning method, namely Occlusion Sensitivity, the proposed framework investigates the influence of modalities under an early-fusion scenario in which the modalities are fused before the learning process. We show that the task of wilderness mapping largely benefits from auxiliary data such as land cover and night-time light data.
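The modality-level occlusion analysis described above can be illustrated with a short sketch. It assumes an early-fusion input represented as one flat list of channel values, a scalar model output, and occlusion by overwriting a modality's channels with a constant baseline. The function name, the baseline value, the channel layout, and the toy linear model are illustrative assumptions; the abstract does not specify how the study implements occlusion.

```python
def occlusion_sensitivity(model, sample, modality_channels, baseline=0.0):
    """Score each modality by the output drop when its channels are occluded.

    model:             callable mapping a fused channel vector to a scalar score
    sample:            list of channel values (modalities stacked, early fusion)
    modality_channels: dict mapping modality name -> list of channel indices
    baseline:          value written over occluded channels (an assumption)
    """
    reference = model(sample)
    drops = {}
    for name, channels in modality_channels.items():
        occluded = list(sample)  # copy so the original input is untouched
        for c in channels:
            occluded[c] = baseline
        drops[name] = reference - model(occluded)
    return drops


# Hypothetical stand-in for a trained network: a fixed weighted sum.
weights = [0.5, 0.5, 2.0, 1.0]


def toy_model(x):
    return sum(w * v for w, v in zip(weights, x))


# Assumed layout: two optical channels, one land-cover, one night-lights.
layout = {"optical": [0, 1], "land_cover": [2], "night_lights": [3]}
scores = occlusion_sensitivity(toy_model, [1.0, 1.0, 1.0, 1.0], layout)
```

A larger drop means the model relied more heavily on that modality for this input; here the toy weights make the land-cover channel the most influential.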
MapInWild: A Remote Sensing Dataset to Address the Question What Makes Nature Wild
Ekim, Burak, Stomberg, Timo T., Roscher, Ribana, Schmitt, Michael
I. INTRODUCTION The advancement in deep learning (DL) techniques has led to a notable increase in the number and size of annotated datasets in a variety of domains, with remote sensing (RS) being no exception [1]. Also, an increase in Earth observation (EO) missions and easy access to globally available and free geodata have opened up new research opportunities. Although numerous RS datasets have been published in the past years [2]-[6], most of them addressed tasks concerning man-made environments such as building footprint extraction and road network classification, leaving the environmental and ecology-related sub-areas of remote sensing underrepresented. In this community, the classification task can be [...] (usually called semantic segmentation by the computer vision community) the task outputs densely annotated prediction maps on a pixel scale by separating the input into distinct and semantically coherent segments. [...] machine learning model in the form of deep neural networks. While some methods frame the RS-related classification tasks within the context of perturbation-seeking generative adversarial networks [14], some others made use of uncertainty estimation applied to deep ensembles [15] and self-attention [...] [Figure caption: The ESA WorldCover map legend is given below the figure.]
On the Accuracy of Bounded Rationality: How Far from Optimal Is Fast and Frugal?
Schmitt, Michael, Martignon, Laura
Fast and frugal heuristics are well-studied models of bounded rationality. Psychological research has proposed the take-the-best heuristic as a successful strategy in decision making with limited resources. Take-the-best searches for a sufficiently good ordering of cues (features) in a task where objects are to be compared lexicographically. We investigate the complexity of the problem of approximating optimal cue permutations for lexicographic strategies. We show that no efficient algorithm can approximate the optimum to within any constant factor, if P ≠ NP. We further consider a greedy approach for building lexicographic strategies and derive tight bounds for the performance ratio of a new and simple algorithm. This algorithm is proven to perform better than take-the-best.
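To make the notion of a lexicographic strategy concrete, the sketch below compares two objects cue by cue under a given cue permutation and scores a permutation by the fraction of pairs it ranks correctly. The cue ordering by validity (share of correct inferences among the pairs a cue discriminates) follows the usual description of take-the-best and is an assumption here, since the abstract does not define it. The greedy algorithm from the paper is not reproduced, and all names and the toy data are illustrative.

```python
from itertools import combinations


def lexicographic_choice(a, b, order):
    """Return 0 if a is inferred larger, 1 if b, None if no cue decides."""
    for i in order:
        if a[i] != b[i]:  # the first discriminating cue decides
            return 0 if a[i] > b[i] else 1
    return None


def cue_validity(objects, criterion, i):
    """Share of correct inferences among pairs that cue i discriminates."""
    right = wrong = 0
    for (x, cx), (y, cy) in combinations(list(zip(objects, criterion)), 2):
        if x[i] == y[i] or cx == cy:
            continue
        if (x[i] > y[i]) == (cx > cy):
            right += 1
        else:
            wrong += 1
    return right / (right + wrong) if right + wrong else 0.0


def take_the_best_order(objects, criterion):
    """Cue permutation sorted by decreasing validity (ties keep index order)."""
    n_cues = len(objects[0])
    return sorted(range(n_cues), key=lambda i: -cue_validity(objects, criterion, i))


def pairwise_accuracy(objects, criterion, order):
    """Fraction of pairs with distinct criterion values ranked correctly."""
    correct = total = 0
    for (x, cx), (y, cy) in combinations(list(zip(objects, criterion)), 2):
        if cx == cy:
            continue
        total += 1
        if lexicographic_choice(x, y, order) == (0 if cx > cy else 1):
            correct += 1
    return correct / total if total else 0.0


# Toy data: binary cue profiles and the criterion value to be inferred.
objects = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 0)]
criterion = [4, 3, 2, 1]
order = take_the_best_order(objects, criterion)
```

Finding the permutation that maximizes `pairwise_accuracy` is the optimization problem whose approximability the abstract addresses; the validity ordering is only one heuristic for it.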
Lower Bounds on the Complexity of Approximating Continuous Functions by Sigmoidal Neural Networks
Schmitt, Michael
This is one of the theoretical results most frequently cited to justify the use of sigmoidal neural networks in applications. By this statement one refers to the fact that sigmoidal neural networks have been shown to be able to approximate any continuous function arbitrarily well. Numerous results in the literature have established variants of this universal approximation property by considering distinct function classes to be approximated by network architectures using different types of neural activation functions with respect to various approximation criteria, see for instance [1, 2, 3, 5, 6, 11, 12, 14, 15].