 deforestation


FINDER: Feature Inference on Noisy Datasets using Eigenspace Residuals

Murphy, Trajan, Dogra, Akshunna S., Gu, Hanfeng, Meredith, Caleb, Kon, Mark, Castrillon-Candas, Julio Enrique

arXiv.org Artificial Intelligence

"Noisy" datasets (regimes with low signal-to-noise ratios, small sample sizes, faulty data collection, etc.) remain a key research frontier for classification methods, with both theoretical and practical implications. We introduce FINDER, a rigorous framework for analyzing generic classification problems, with tailored algorithms for noisy datasets. FINDER incorporates fundamental stochastic analysis ideas into the feature learning and inference stages to optimally account for the randomness inherent to all empirical datasets. We construct "stochastic features" by first viewing empirical datasets as realizations from an underlying random field (without assumptions on its exact distribution) and then mapping them to appropriate Hilbert spaces. The Kosambi-Karhunen-Loève expansion (KLE) breaks these stochastic features into computable irreducible components, which allow classification over noisy datasets via an eigen-decomposition: data from different classes reside in distinct regions, identified by analyzing the spectrum of the associated operators. We validate FINDER on several challenging, data-deficient scientific domains, producing state-of-the-art breakthroughs in (i) Alzheimer's Disease stage classification and (ii) remote sensing detection of deforestation. We end with a discussion on when FINDER is expected to outperform existing methods, its failure modes, and other limitations.


deFOREST: Fusing Optical and Radar satellite data for Enhanced Sensing of Tree-loss

Castrillon-Candas, Julio Enrique, Gu, Hanfeng, Meredith, Caleb, Li, Yulin, Tang, Xiaojing, Olofsson, Pontus, Kon, Mark

arXiv.org Machine Learning

In this paper we develop a deforestation detection pipeline that incorporates optical and Synthetic Aperture Radar (SAR) data. A crucial component of the pipeline is the construction of anomaly maps of the optical data, which is done using the residual space of a discrete Karhunen-Loève (KL) expansion. Anomalies are quantified using a concentration bound on the distribution of the residual components for the nominal state of the forest. This bound does not require prior knowledge of the distribution of the data. This is in contrast to statistical parametric methods that assume knowledge of the data distribution, an impractical assumption that is especially infeasible for high dimensional data such as ours. Once the optical anomaly maps are computed, they are combined with SAR data, and the state of the forest is classified using a Hidden Markov Model (HMM). We test our approach with Sentinel-1 (SAR) and Sentinel-2 (optical) data on a $92.19\,\mathrm{km} \times 91.80\,\mathrm{km}$ region in the Amazon forest. The results show that both the hybrid optical-radar and optical-only methods achieve high accuracy that is superior to the recent state-of-the-art hybrid method. Moreover, the hybrid method is significantly more robust in the case of sparse optical data, which are common in highly cloudy regions.
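The residual-space idea in this abstract can be illustrated with a small sketch. This is not the authors' pipeline: the toy data, the rank-1 truncation, the names `residual_energy` and `is_anomalous`, and the use of Markov's inequality as the distribution-free bound are all assumptions made for illustration. The paper's own concentration bound is not given in the abstract.

```python
import math
import random

random.seed(0)  # reproducible toy data

# Hypothetical "nominal" samples: 3-D points concentrated along one direction.
DIRECTION = (1.0, 2.0, 0.5)
nominal = []
for _ in range(200):
    t = random.gauss(0.0, 1.0)
    nominal.append([t * d + random.gauss(0.0, 0.05) for d in DIRECTION])

dim = len(DIRECTION)
n = len(nominal)

# Empirical mean and covariance of the nominal data.
mu = [sum(col) / n for col in zip(*nominal)]
cov = [[0.0] * dim for _ in range(dim)]
for row in nominal:
    c = [x - m for x, m in zip(row, mu)]
    for i in range(dim):
        for j in range(dim):
            cov[i][j] += c[i] * c[j] / n

# Leading eigenvector via power iteration (rank-1 discrete KL truncation).
v = [1.0 / math.sqrt(dim)] * dim
for _ in range(200):
    w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]

def residual_energy(x):
    """Squared norm of the part of x - mu outside the retained KL mode."""
    c = [a - m for a, m in zip(x, mu)]
    coeff = sum(a * b for a, b in zip(c, v))
    return sum((a - coeff * b) ** 2 for a, b in zip(c, v))

# Markov's inequality: P(R >= t) <= E[R] / t for any nonnegative R,
# so t = E[R] / alpha bounds the nominal false-alarm rate by alpha
# without assuming a distribution (assumed stand-in for the paper's bound).
ALPHA = 0.05
mean_residual = sum(residual_energy(x) for x in nominal) / n
threshold = mean_residual / ALPHA

def is_anomalous(x):
    """Flag x when its residual energy exceeds the distribution-free threshold."""
    return residual_energy(x) > threshold
```

A point lying along the nominal direction, such as `[1.5, 3.0, 0.75]`, leaves almost no energy in the residual space, while a point such as `[0.0, 0.0, 3.0]` leaves most of its energy there and is flagged.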


Annotating Satellite Images of Forests with Keywords from a Specialized Corpus in the Context of Change Detection

Neptune, Nathalie, Mothe, Josiane

arXiv.org Artificial Intelligence

The Amazon rain forest is a vital ecosystem that plays a crucial role in regulating the Earth's climate and providing habitat for countless species. Deforestation in the Amazon is a major concern as it has a significant impact on global carbon emissions and biodiversity. In this paper, we present a method for detecting deforestation in the Amazon using image pairs from Earth observation satellites. Our method leverages deep learning techniques to compare the images of the same area at different dates and identify changes in the forest cover. We also propose a visual semantic model that automatically annotates the detected changes with relevant keywords. The candidate annotations for images are extracted from scientific documents related to the Amazon region. We evaluate our approach on a dataset of Amazon image pairs and demonstrate its effectiveness in detecting deforestation and generating relevant annotations. Our method provides a useful tool for monitoring and studying the impact of deforestation in the Amazon. While we focus on environmental applications of our work by using images of deforestation in the Amazon rain forest to demonstrate the effectiveness of our proposed approach, it is generic enough to be applied to other domains.


Stochastic forest transition model dynamics and parameter estimation via deep learning

Kumabe, Satoshi, Song, Tianyu, Ta, Ton Viet

arXiv.org Machine Learning

Forest transitions, characterized by dynamic shifts between forest, agricultural, and abandoned lands, are complex phenomena. This study developed a stochastic differential equation model to capture the intricate dynamics of these transitions. We established the existence of global positive solutions for the model and conducted numerical analyses to assess the impact of model parameters on deforestation incentives. To address the challenge of parameter estimation, we proposed a novel deep learning approach that estimates all model parameters from a single sample containing time-series observations of forest and agricultural land proportions. This innovative approach enables us to understand forest transition dynamics and deforestation trends at any future time.


Scaling Deep Learning Research with Kubernetes on the NRP Nautilus HyperCluster

Hurt, J. Alex, Ouadou, Anes, Alshehri, Mariam, Scott, Grant J.

arXiv.org Artificial Intelligence

Throughout the scientific computing space, deep learning algorithms have shown excellent performance in a wide range of applications. As these deep neural networks (DNNs) continue to mature, the compute required to train them has continued to grow. Today, modern DNNs require millions of FLOPs and days to weeks of training to generate a well-trained model. The training times required for DNNs are oftentimes a bottleneck in DNN research for a variety of deep learning applications, and as such, accelerating and scaling DNN training enables more robust and accelerated research. To that end, in this work, we explore utilizing the NRP Nautilus HyperCluster to automate and scale deep learning model training for three separate applications of DNNs, including overhead object detection, burned area segmentation, and deforestation detection. In total, 234 deep neural models are trained on Nautilus, for a total time of 4,040 hours. Deep convolutional neural networks (DCNNs) have been established as the state of the art in computer vision (CV) and have shown superior performance in visual tasks for many domains, including remote sensing. With billions of pixels being collected by overhead sources like satellites, remote sensing (RS) is becoming ever more a big-data problem domain, with endless amounts of data available to enable CV applications. Due in part to this data availability, the training and optimization of deep networks for RS applications has been explored to great lengths in recent years. In 2017, researchers investigated utilizing DCNNs for land-cover classification in overhead imagery along with techniques such as transfer learning and data augmentation [1].
This work was then extended into multi-network fusion research, where multiple DCNNs trained on overhead satellite imagery were fused using simple fusion techniques such as voting and arrogance [2] and then compared to more complex fusion algorithms such as the Choquet and Sugeno Fuzzy Integral [3], [4]. While these studies explored utilizing DCNNs to perform classification on overhead RS imagery, further exploration was required in broad area search, in which DCNNs are trained and used not on clean pre-processed datasets, but instead applied to large swaths of overhead imagery with the goal of finding all instances of a given object or terrain.


Sampling Strategies based on Wisdom of Crowds for Amazon Deforestation Detection

Resende, Hugo, Neto, Eduardo B., Cappabianco, Fabio A. M., Fazenda, Alvaro L., Faria, Fabio A.

arXiv.org Artificial Intelligence

Conserving tropical forests is highly relevant socially and ecologically because of their critical role in the global ecosystem. However, the ongoing deforestation and degradation affect millions of hectares each year, necessitating government or private initiatives to ensure effective forest monitoring. In April 2019, a project based on Citizen Science and Machine Learning models called ForestEyes (FE) was launched with the aim of providing supplementary data to assist experts from government and non-profit organizations in their deforestation monitoring efforts. Recent research has shown that labeling by FE project volunteers/citizen scientists helps tailor machine learning models. In this sense, we adopt the FE project to create different sampling strategies based on the wisdom of crowds, selecting the most suitable samples from the training set to train an SVM and obtain better classification results in deforestation detection tasks. In our experiments, our strategy based on user entropy-increasing achieved the best classification results in the deforestation detection task when compared with random sampling strategies, while also reducing the convergence time of the SVM.
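One way to read the entropy-based sampling described here is sketched below. It is an interpretation and not the authors' implementation: it assumes "user entropy-increasing" means ordering training samples by the Shannon entropy of their volunteer votes, lowest first. The sample records, label names, and the functions `vote_entropy` and `order_by_entropy` are hypothetical.

```python
import math
from collections import Counter

def vote_entropy(votes):
    """Shannon entropy (in bits) of the label distribution among volunteer votes."""
    counts = Counter(votes)
    total = len(votes)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def order_by_entropy(samples, increasing=True):
    """Sort samples by how much the volunteers disagreed on each one."""
    return sorted(
        samples,
        key=lambda s: vote_entropy(s["votes"]),
        reverse=not increasing,
    )

# Hypothetical crowd-labeled segments (F = forest, D = deforested).
samples = [
    {"id": "a", "votes": ["F", "F", "D", "D"]},  # maximal disagreement
    {"id": "b", "votes": ["F", "F", "F", "F"]},  # unanimous
    {"id": "c", "votes": ["F", "F", "F", "D"]},  # mild disagreement
]

# Samples the crowd agreed on come first under the assumed increasing order.
ordered_ids = [s["id"] for s in order_by_entropy(samples)]
```

Under this reading, a classifier such as an SVM would be trained on a prefix of the ordered list, so that the samples the crowd labeled most consistently are used first.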


ForestEyes: Citizen Scientists and Machine Learning-Assisting Rainforest Conservation

Communications of the ACM

Citizen Science (CS) leverages the collective efforts of non-specialist/ordinary volunteers in different research tasks, such as collecting, analyzing, and classifying data to solve technical and scientific challenges. CS applications have attracted the attention of academic researchers due to the abundance of data created with high quality at low cost. According to an article in CERN Courier Magazine [3], CS is beneficial for the scientific community, the volunteers involved in the projects, and society as a whole. On the researcher's side, CS helps to achieve scientific data/metadata quickly, obtaining large amounts of valuable information that can contribute to advancing research [3]. On the other hand, volunteers become aware of a scientific methodology, are recognized for their contributions, and feel satisfied for being part of a project with scientific and social relevance [2].


Are seed-sowing drones the answer to global deforestation?

Al Jazeera

Santa Cruz Cabralia, Bahia, Brazil – With a loud whir, the drone takes flight. Minutes later, the humming sound gives way to a distinctive rattling as the machine, hovering about 20 metres above the ground, begins unloading its precious cargo and a cocktail of seeds rains down onto the land below. Given time, these seeds will grow into trees and, eventually, it is hoped, a thriving forest will stand where there was once just sparse vegetation. That is what the startup which operates this drone, a large contraption that looks a bit like a Pokemon ball with antennae, hopes. The 54 hectares (133 acres) here which have been badly degraded by agriculture and cattle farming in the Brazilian state of Bahia are just the start.


A community palm model

Clinton, Nicholas, Vollrath, Andreas, D'annunzio, Remi, Liu, Desheng, Glick, Henry B., Descals, Adrià, Sullivan, Alicia, Guinan, Oliver, Abramowitz, Jacob, Stolle, Fred, Goodman, Chris, Birch, Tanya, Quinn, David, Danylo, Olga, Lips, Tijs, Coelho, Daniel, Bihari, Enikoe, Cronkite-Ratcliff, Bryce, Poortinga, Ate, Haghighattalab, Atena, Notman, Evan, DeWitt, Michael, Yonas, Aaron, Donchyts, Gennadii, Shah, Devaja, Saah, David, Tenneson, Karis, Quyen, Nguyen Hanh, Verma, Megha, Wilcox, Andrew

arXiv.org Artificial Intelligence

Palm oil production has been identified as one of the major drivers of deforestation for tropical countries. To meet supply chain objectives, commodity producers and other stakeholders need timely information on land cover dynamics in their supply shed. However, such data are difficult to obtain from suppliers who may lack digital geographic representations of their supply sheds and production locations. Here we present a "community model," a machine learning model trained on pooled data sourced from many different stakeholders, to develop a specific land cover probability map, in this case a semi-global oil palm map. Advantages of this method are the inclusion of varied inputs and the ability to easily update the model as new training data become available and to run the model for any year in which input imagery is available. Inclusion of diverse data sources into one probability map can help establish a shared understanding across stakeholders on the presence and absence of a land cover or commodity (in this case oil palm). The model predictors are annual composites built from publicly available satellite imagery provided by Sentinel-1, Sentinel-2, and ALOS DSM. We provide map outputs as the probability of palm in a given pixel, to reflect the uncertainty of the underlying state (palm or not palm). The initial version of this model provides global accuracy estimated to be approximately 90% (at 0.5 probability threshold) from spatially partitioned test data. This model and the resulting oil palm probability map products are useful for accurately identifying the geographic footprint of palm cultivation. Used in conjunction with timely deforestation information, this palm model is useful for understanding the risk of continued oil palm plantation expansion in sensitive forest areas.
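To make the "accuracy at 0.5 probability threshold" statement concrete, the following minimal sketch converts a per-pixel probability map into a binary palm mask and scores it against reference labels. The probabilities and labels are invented toy values, the function names are hypothetical, and treating a probability equal to the threshold as palm is an assumption.

```python
def classify(prob_map, threshold=0.5):
    """Turn per-pixel palm probabilities into a binary palm / not-palm mask."""
    return [[p >= threshold for p in row] for row in prob_map]

def accuracy(predicted, reference):
    """Fraction of pixels where the predicted mask matches the reference labels."""
    pairs = [
        (p, r)
        for pred_row, ref_row in zip(predicted, reference)
        for p, r in zip(pred_row, ref_row)
    ]
    return sum(p == r for p, r in pairs) / len(pairs)

# Hypothetical 2 x 3 tile of model output and matching reference labels.
prob_map = [
    [0.92, 0.10, 0.55],
    [0.48, 0.71, 0.03],
]
reference = [
    [True, False, False],
    [False, True, False],
]

mask = classify(prob_map)            # one false positive at probability 0.55
tile_accuracy = accuracy(mask, reference)
```

Keeping the map as probabilities and thresholding only at evaluation time lets different users choose a stricter or looser cutoff for the same product.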


Assessing the Potential of AI for Spatially Sensitive Nature-Related Financial Risks

Reece, Steven, O'Donnell, Emma, Liu, Felicia, Wolstenholme, Joanna, Arriaga, Frida, Ascenzi, Giacomo, Pywell, Richard

arXiv.org Artificial Intelligence

There is growing recognition among financial institutions, financial regulators and policy makers of the importance of addressing nature-related risks and opportunities. Evaluating and assessing nature-related risks for financial institutions is challenging due to the large volume of heterogeneous data available on nature and the complexity of investment value chains and the various components' relationship to nature. The dual problem of scaling data analytics and analysing complex systems can be addressed using Artificial Intelligence (AI). We address issues such as plugging existing data gaps with discovered data, data estimation under uncertainty, time series analysis and (near) real-time updates. This report presents potential AI solutions for models of two distinct use cases, the Brazil Beef Supply Use Case and the Water Utility Use Case. Our two use cases cover a broad perspective within sustainable finance. The Brazilian cattle farming use case is an example of greening finance - integrating nature-related considerations into mainstream financial decision-making to transition investments away from sectors with poor historical track records and unsustainable operations. The deployment of nature-based solutions in the UK water utility use case is an example of financing green - driving investment to nature-positive outcomes. The two use cases also cover different sectors, geographies, financial assets and AI modelling techniques, providing an overview on how AI could be applied to different challenges relating to nature's integration into finance. This report is primarily aimed at financial institutions but is also of interest to ESG data providers, TNFD, systems modellers, and, of course, AI practitioners.