Daudt, Rodrigo Caye
Recognition of Unseen Bird Species by Learning from Field Guides
Rodríguez, Andrés C., D'Aronco, Stefano, Daudt, Rodrigo Caye, Wegner, Jan D., Schindler, Konrad
We exploit field guides to learn bird species recognition, in particular zero-shot recognition of unseen species. Illustrations contained in field guides deliberately focus on discriminative properties of each species, and can serve as side information to transfer knowledge from seen to unseen bird species. We study two approaches: (1) a contrastive encoding of illustrations, which can be fed into standard zero-shot learning schemes; and (2) a novel method that leverages the fact that illustrations are also images and as such structurally more similar to photographs than other kinds of side information. Our results show that illustrations from field guides, which are readily available for a wide range of species, are indeed a competitive source of side information for zero-shot learning. On a subset of the iNaturalist2021 dataset with 749 seen and 739 unseen species, we obtain a classification accuracy on unseen bird species of 12% top-1 and 38% top-10, which shows the potential of field guides for challenging real-world scenarios with many species. Our code is available at https://github.com/ac-rodriguez/zsl_billow
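As a minimal sketch of how illustrations can act as class-level side information for zero-shot recognition (this is not the paper's actual pipeline; encoder, variable, and function names below are hypothetical), one can average the embeddings of each species' illustrations into a prototype and classify photographs by nearest prototype, so that unseen species only need illustrations:

```python
import torch
import torch.nn.functional as F

def build_class_prototypes(illustration_embeddings, species_ids, num_species):
    """Average all illustration embeddings of each species into one prototype.
    Prototypes for unseen species require only their field-guide illustrations."""
    dim = illustration_embeddings.shape[1]
    prototypes = torch.zeros(num_species, dim)
    counts = torch.zeros(num_species, 1)
    prototypes.index_add_(0, species_ids, illustration_embeddings)
    counts.index_add_(0, species_ids, torch.ones(len(species_ids), 1))
    return F.normalize(prototypes / counts.clamp(min=1), dim=1)

def classify_photos(photo_embeddings, prototypes):
    """Nearest-prototype classification by cosine similarity."""
    photo_embeddings = F.normalize(photo_embeddings, dim=1)
    return (photo_embeddings @ prototypes.t()).argmax(dim=1)

# Usage sketch: embeddings would come from an image encoder shared (or aligned)
# between photographs and illustrations; here they are random placeholders.
illus = torch.randn(20, 128)                       # 20 illustration embeddings
ids = torch.randint(0, 5, (20,))                   # species id of each illustration
protos = build_class_prototypes(illus, ids, num_species=5)
pred = classify_photos(torch.randn(8, 128), protos)  # predicted species per photo
```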
Satellite-based high-resolution maps of cocoa planted area for Côte d'Ivoire and Ghana
Kalischek, Nikolai, Lang, Nico, Renier, Cécile, Daudt, Rodrigo Caye, Addoah, Thomas, Thompson, William, Blaser-Hart, Wilma J., Garrett, Rachael, Schindler, Konrad, Wegner, Jan D.
Cocoa is the primary perennial crop in both Côte d'Ivoire and Ghana, providing income to almost two million farmers. Yet precise maps of cocoa planted area are missing, hindering accurate quantification of expansion in protected areas, of production and yields, and limiting the information available for improved sustainability governance. Here, we combine cocoa plantation data with publicly available satellite imagery in a deep learning framework and create high-resolution maps of cocoa plantations for both countries, validated in situ. Our results suggest that cocoa cultivation is an underlying driver of over 37% and 13% of forest loss in protected areas in Côte d'Ivoire and Ghana, respectively, and that official reports substantially underestimate the planted area, by up to 40% in Ghana. These maps serve as a crucial building block to advance understanding of conservation and economic development in cocoa-producing regions.
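For illustration only, the mapping can be thought of as per-pixel binary classification of satellite patches; the sketch below assumes a toy fully convolutional network and 10-band imagery (e.g., Sentinel-2) and does not reflect the actual architecture, bands, or training data used in the paper:

```python
import torch
import torch.nn as nn

class CocoaSegmenter(nn.Module):
    """Toy fully convolutional network: satellite patch -> per-pixel cocoa probability."""
    def __init__(self, in_bands=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_bands, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, x):
        return torch.sigmoid(self.net(x))        # probability of cocoa per pixel

model = CocoaSegmenter()
patch = torch.randn(1, 10, 256, 256)             # one multispectral patch (B, bands, H, W)
cocoa_prob = model(patch)                         # (1, 1, 256, 256) cocoa probability map
```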
Guided Depth Super-Resolution by Deep Anisotropic Diffusion
Metzger, Nando, Daudt, Rodrigo Caye, Schindler, Konrad
Performing super-resolution of a depth image guided by an RGB image is a problem that arises in several fields, such as robotics, medical imaging, and remote sensing. While deep learning methods have achieved good results on this problem, recent work has highlighted the value of combining modern methods with more formal frameworks. In this work, we propose a novel approach that combines guided anisotropic diffusion with a deep convolutional network and advances the state of the art for guided depth super-resolution. The edge-transferring and edge-enhancing properties of the diffusion are boosted by the contextual reasoning capabilities of modern networks, and a strict adjustment step guarantees perfect adherence to the source image. We achieve unprecedented results on three commonly used benchmarks for guided depth super-resolution. The performance gain compared to other methods is largest at large upsampling factors, such as x32 scaling. Code (https://github.com/prs-eth/Diffusion-Super-Resolution) for the proposed method is available to promote reproducibility of our results.
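To make the two ingredients concrete, here is a minimal sketch of one guided anisotropic diffusion step and one possible adjustment scheme; it assumes an exponential edge-stopping function, wrap-around boundaries, and an additive block correction, none of which are claimed to match the released implementation:

```python
import torch
import torch.nn.functional as F

def guided_diffusion_step(depth, guide, lam=0.2, sigma=0.1):
    """One explicit anisotropic diffusion step on depth (B,1,H,W), with
    edge-stopping coefficients derived from the RGB guide (B,3,H,W).
    Boundaries wrap around via torch.roll (simplification)."""
    out = depth.clone()
    for dy, dx in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
        shifted_d = torch.roll(depth, shifts=(dy, dx), dims=(2, 3))
        shifted_g = torch.roll(guide, shifts=(dy, dx), dims=(2, 3))
        edge = (guide - shifted_g).abs().mean(dim=1, keepdim=True)
        coeff = torch.exp(-edge / sigma)          # diffuse less across guide edges
        out = out + lam * coeff * (shifted_d - depth)
    return out

def adjust_to_source(depth_hr, depth_lr, scale):
    """Additive correction so that the block average of the high-resolution
    prediction reproduces the low-resolution source exactly (one possible scheme)."""
    block_mean = F.avg_pool2d(depth_hr, scale)
    residual = depth_lr - block_mean
    return depth_hr + F.interpolate(residual, scale_factor=scale, mode="nearest")
```

In the hybrid setting, the per-edge coefficients (here a fixed exponential of guide differences) would instead be predicted by a convolutional network, and diffusion plus adjustment would be iterated.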
FiLM-Ensemble: Probabilistic Deep Learning via Feature-wise Linear Modulation
Turkoglu, Mehmet Ozgur, Becker, Alexander, Gündüz, Hüseyin Anil, Rezaei, Mina, Bischl, Bernd, Daudt, Rodrigo Caye, D'Aronco, Stefano, Wegner, Jan Dirk, Schindler, Konrad
The ability to estimate epistemic uncertainty is often crucial when deploying machine learning in the real world, but modern methods often produce overconfident, uncalibrated uncertainty predictions. A common approach to quantifying epistemic uncertainty, usable across a wide class of prediction models, is to train a model ensemble. In a naive implementation, the ensemble approach has high computational cost and high memory demand. This is particularly challenging for modern deep learning, where even a single deep network is already demanding in terms of compute and memory, and it has given rise to a number of attempts to emulate the model ensemble without actually instantiating separate ensemble members. We introduce FiLM-Ensemble, a deep, implicit ensemble method based on the concept of Feature-wise Linear Modulation (FiLM). That technique was originally developed for multi-task learning, with the aim of decoupling different tasks. We show that the idea can be extended to uncertainty quantification: by modulating the network activations of a single deep network with FiLM, one obtains a model ensemble with high diversity, and consequently well-calibrated estimates of epistemic uncertainty, at comparatively low computational overhead. Empirically, FiLM-Ensemble outperforms other implicit ensemble methods and comes very close to the upper bound of an explicit ensemble of networks (sometimes even beating it), at a fraction of the memory cost.
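The core idea, per-member feature-wise modulation of a single shared network, can be sketched as follows; this is an illustrative toy model, not the architecture or hyperparameters used in the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FiLM(nn.Module):
    """One affine modulation (gamma, beta) per implicit ensemble member and channel."""
    def __init__(self, num_members, num_channels):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(num_members, num_channels))
        self.beta = nn.Parameter(torch.zeros(num_members, num_channels))

    def forward(self, x, member):
        # x: (B, C, H, W); scale and shift the shared activations for this member.
        return self.gamma[member].view(1, -1, 1, 1) * x + self.beta[member].view(1, -1, 1, 1)

class TinyFiLMEnsemble(nn.Module):
    """Shared convolutional weights, member-specific FiLM parameters (toy example)."""
    def __init__(self, num_members=4, num_classes=10):
        super().__init__()
        self.num_members = num_members
        self.conv1, self.film1 = nn.Conv2d(3, 32, 3, padding=1), FiLM(num_members, 32)
        self.conv2, self.film2 = nn.Conv2d(32, 64, 3, padding=1), FiLM(num_members, 64)
        self.head = nn.Linear(64, num_classes)

    def forward(self, x, member):
        x = F.relu(self.film1(self.conv1(x), member))
        x = F.relu(self.film2(self.conv2(x), member))
        return self.head(x.mean(dim=(2, 3)))      # global average pooling, then logits

    def predict(self, x):
        # Average softmax probabilities over all implicit members; the spread
        # across members can serve as an estimate of epistemic uncertainty.
        probs = [F.softmax(self.forward(x, m), dim=1) for m in range(self.num_members)]
        return torch.stack(probs).mean(dim=0)
```

Only the FiLM parameters are duplicated per member, which is why the memory overhead stays small compared to an explicit ensemble of full networks.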