
Collaborating Authors

 Oehmcke, Stefan


Nacala-Roof-Material: Drone Imagery for Roof Detection, Classification, and Segmentation to Support Mosquito-borne Disease Risk Assessment

arXiv.org Artificial Intelligence

As low-quality housing and, in particular, certain roof characteristics are associated with an increased risk of malaria, classification of roof types based on remote sensing imagery can support the assessment of malaria risk and thereby help prevent the disease. To support research in this area, we release the Nacala-Roof-Material dataset, which contains high-resolution drone images from Mozambique with corresponding labels delineating houses and specifying their roof types. The dataset defines a multi-task computer vision problem, comprising object detection, classification, and segmentation. In addition, we benchmark various state-of-the-art approaches on the dataset. Canonical U-Nets, YOLOv8, and a custom decoder on pretrained DINOv2 serve as baselines. We show that each of the methods has its advantages but none is superior on all tasks, which highlights the potential of our dataset for future research in multi-task learning. While the tasks are closely related, accurate segmentation of objects does not necessarily imply accurate instance separation, and vice versa. We address this general issue by introducing a variant of the deep ordinal watershed (DOW) approach that additionally separates the interior of objects, allowing for improved object delineation and separation. We show that our DOW variant is a generic approach that improves the performance of both U-Net and DINOv2 backbones, leading to a better trade-off between semantic segmentation and instance segmentation.
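The DOW variant is only described at a high level in the abstract. As a rough, hypothetical illustration of the underlying idea (the function name, thresholds, and random inputs below are assumptions, not the authors' implementation), a network that predicts both a full object mask and a nested, eroded interior mask allows touching buildings to be separated by growing a watershed from the interior components:

```python
# Hypothetical post-processing sketch of the nested-mask idea behind DOW-style
# instance separation: interiors act as seeds, the full mask bounds the growth.
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed

def separate_instances(object_prob, interior_prob, threshold=0.5):
    """Combine two nested probability maps (H, W) into an instance label map."""
    object_mask = object_prob > threshold        # full building footprints
    interior_mask = interior_prob > threshold    # eroded interiors, ideally one blob per building
    seeds, _ = ndimage.label(interior_mask)      # connected interior components become markers
    # Flood from the markers over the footprint mask; -object_prob lets
    # high-confidence pixels be claimed first.
    return watershed(-object_prob, markers=seeds, mask=object_mask)

# Toy usage with random stand-ins for the two network output channels.
rng = np.random.default_rng(0)
object_prob = ndimage.gaussian_filter(rng.random((64, 64)), sigma=3)
interior_prob = np.clip(object_prob - 0.1, 0.0, 1.0)
instances = separate_instances(object_prob, interior_prob, threshold=object_prob.mean())
print("instances found:", instances.max())
```

In the paper's setting, the nested output channels would come from a U-Net or DINOv2-based decoder trained with ordinal targets; here they are random stand-ins used only to make the sketch runnable.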


MMEarth: Exploring Multi-Modal Pretext Tasks For Geospatial Representation Learning

arXiv.org Artificial Intelligence

The volume of unlabelled Earth observation (EO) data is huge, but many important applications lack labelled training data. However, EO data offers the unique opportunity to pair data from different modalities and sensors automatically based on geographic location and time, at virtually no human labor cost. We seize this opportunity to create a diverse multi-modal pretraining dataset at global scale. Using this new corpus of 1.2 million locations, we propose a Multi-Pretext Masked Autoencoder (MP-MAE) approach to learn general-purpose representations for optical satellite images. Our approach builds on the ConvNeXt V2 architecture, a fully convolutional masked autoencoder (MAE). Drawing upon a suite of multi-modal pretext tasks, we demonstrate that our MP-MAE approach outperforms both MAEs pretrained on ImageNet and MAEs pretrained on domain-specific satellite images. This is shown on several downstream tasks, including image classification and semantic segmentation. We find that multi-modal pretraining notably improves the linear probing performance, e.g., by 4 percentage points on BigEarthNet and 16 percentage points on So2Sat, compared to pretraining on optical satellite images only. We show that this also leads to better label and parameter efficiency, which are crucial aspects in global-scale applications.
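As a very rough illustration of the multi-pretext setup (a minimal sketch with assumed module names, a toy convolutional encoder, and invented modalities; the actual MP-MAE builds on a ConvNeXt V2 masked autoencoder), a shared encoder sees the masked optical input and one lightweight head per pretext task reconstructs the corresponding paired modality, with the per-task losses summed:

```python
# Minimal, hypothetical sketch of a multi-pretext masked autoencoder: one shared
# encoder, one reconstruction head per paired modality (names and sizes invented).
import torch
import torch.nn as nn

class MultiPretextMAE(nn.Module):
    def __init__(self, in_channels=4, dim=128,
                 pretext_channels=(("sar", 2), ("elevation", 1))):
        super().__init__()
        self.encoder = nn.Sequential(                      # stand-in for a ConvNeXt V2 encoder
            nn.Conv2d(in_channels, dim, kernel_size=4, stride=4), nn.GELU(),
            nn.Conv2d(dim, dim, kernel_size=3, padding=1), nn.GELU(),
        )
        self.heads = nn.ModuleDict(                        # one head per pretext task
            {name: nn.Conv2d(dim, c, kernel_size=1) for name, c in pretext_channels}
        )

    def forward(self, optical, mask):
        # mask: (B, 1, H, W) with 1 = visible pixel, 0 = masked-out pixel
        features = self.encoder(optical * mask)
        return {name: head(features) for name, head in self.heads.items()}

model = MultiPretextMAE()
optical = torch.randn(2, 4, 64, 64)                        # toy multispectral batch
mask = (torch.rand(2, 1, 64, 64) > 0.75).float()           # keep roughly 25% of pixels
targets = {"sar": torch.randn(2, 2, 16, 16), "elevation": torch.randn(2, 1, 16, 16)}
predictions = model(optical, mask)
loss = sum(nn.functional.mse_loss(predictions[k], targets[k]) for k in predictions)
loss.backward()                                            # summed pretext losses drive the encoder
```

The point of the design is that only the shared encoder is kept for downstream tasks; the per-modality heads exist solely to force the optical representations to also predict the paired modalities.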


Deep Learning Based 3D Point Cloud Regression for Estimating Forest Biomass

arXiv.org Artificial Intelligence

Robust quantification of forest carbon stocks and their dynamics is important for climate change mitigation and adaptation strategies [FAO and UNEP, 2020]. The Paris Agreement [United Nations Framework Convention on Climate Change, 2015] and the IPCC [Shukla et al., 2019] acknowledge that climate change mitigation goals cannot be achieved without a substantial contribution from forests. Spatial details in the carbon budget of forests are necessary to encourage transformational actions towards a sustainable forest sector [Harris et al., 2021, 2012]. Currently, many countries do not have nationally specific forest carbon accumulation rates but rather rely on default rates from the IPCC 2018 [Masson-Delmotte et al., 2019, Requena Suarez et al., 2019], without accounting for finer-scale variations of carbon stocks [Cook-Patton et al., 2020]. Precise spatio-temporal monitoring of forest carbon dynamics at large scales has proven to be challenging [Erb et al., 2018, Griscom et al., 2017]. This is due to the complex structure of forests, topographic features, and land management practices [Tubiello et al., 2021, Lewis et al., 2019]. Technological developments in remote sensing and the concurrent increased availability of field-based measurements have led to an improvement in estimating carbon stocks using remote sensing observations of forest attributes that serve as proxies for above-ground biomass (AGB) [Knapp et al., 2018, Bouvier et al., 2015, Pan et al., 2013]. Currently, three remote sensing techniques are applied to collect data for AGB estimates: i) passive optical imagery, ii) synthetic aperture radar (SAR), and iii) light detection and ranging (LiDAR).


Learning Selection Masks for Deep Neural Networks

arXiv.org Machine Learning

Data often have to be moved between servers and clients during the inference phase. For instance, modern virtual assistants collect data on mobile devices and send the data to remote servers for analysis. A related scenario is that clients have to access and download large amounts of data stored on servers in order to apply machine learning models. Depending on the available bandwidth, this data transfer can be a serious bottleneck, which can significantly limit the application of machine learning models. In this work, we propose a simple yet effective framework for selecting certain parts of the input data needed for the subsequent application of a given neural network. Both the masks and the neural network are trained simultaneously such that good model performance is achieved while, at the same time, only a minimal amount of data is selected by the masks. During the inference phase, only the parts selected by the masks have to be transferred between the server and the client. Our experimental evaluation indicates that it is, for certain learning tasks, possible to significantly reduce the amount of data that needs to be transferred without affecting the model performance much.
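One common way to realize this joint training of selection masks and network, shown below as a hypothetical sketch (the sigmoid-gate parameterization, penalty weight, and toy data are assumptions and may differ from the paper's formulation), is to attach a learnable gate to every input feature and add a sparsity penalty so that the task loss and the amount of selected data are minimized together:

```python
# Hypothetical sketch: learnable input-selection gates trained jointly with a classifier.
# Only features whose gates end up near 1 would need to be transferred at inference time.
import torch
import torch.nn as nn

class MaskedModel(nn.Module):
    def __init__(self, n_features=784, n_classes=10, hidden=256):
        super().__init__()
        self.mask_logits = nn.Parameter(torch.zeros(n_features))   # one gate per input feature
        self.net = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_classes))

    def forward(self, x):
        gates = torch.sigmoid(self.mask_logits)      # soft selection mask in [0, 1]
        return self.net(x * gates), gates

model = MaskedModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))           # toy batch
logits, gates = model(x)
loss = nn.functional.cross_entropy(logits, y) + 1e-2 * gates.sum() # task loss + sparsity penalty
optimizer.zero_grad(); loss.backward(); optimizer.step()
# After training, a hard mask (gates > 0.5) determines which inputs the client sends.
```

The sparsity weight trades off transferred data volume against accuracy; in practice it would be tuned per task, and the soft gates would be binarized once training has converged.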