Goto

Collaborating Authors

 Marvasti-Zadeh, Seyed Mojtaba


Early Detection of Bark Beetle Attack Using Remote Sensing and Machine Learning: A Review

arXiv.org Artificial Intelligence

This paper provides a comprehensive review of past and current advances in the early detection of bark beetle-induced tree mortality from three primary perspectives: bark beetle & host interactions, RS, and ML/DL. In contrast to prior efforts, this review encompasses all RS systems and emphasizes ML/DL methods to investigate their strengths and weaknesses. We parse existing literature based on multi- or hyper-spectral analyses and distill their knowledge based on: bark beetle species & attack phases with a primary emphasis on early stages of attacks, host trees, study regions, RS platforms & sensors, spectral/spatial/temporal resolutions, spectral signatures, spectral vegetation indices (SVIs), ML approaches, learning schemes, task categories, models, algorithms, classes/clusters, features, and DL networks & architectures. Although DL-based methods and the random forest (RF) algorithm showed promising results, highlighting their potential to detect subtle changes across visible, thermal, and short-wave infrared (SWIR) spectral regions, they still have limited effectiveness and high uncertainties. To inspire novel solutions to these shortcomings, we delve into the principal challenges & opportunities from different perspectives, enabling a deeper understanding of the current state of research and guiding future research directions.


Training-based Model Refinement and Representation Disagreement for Semi-Supervised Object Detection

arXiv.org Artificial Intelligence

Semi-supervised object detection (SSOD) aims to improve the performance and generalization of existing object detectors by utilizing limited labeled data and extensive unlabeled data. Despite many advances, recent SSOD methods are still challenged by inadequate model refinement using the classical exponential moving average (EMA) strategy, the consensus of Teacher-Student models in the latter stages of training (i.e., losing their distinctiveness), and noisy/misleading pseudo-labels. This paper proposes a novel training-based model refinement (TMR) stage and a simple yet effective representation disagreement (RD) strategy to address the limitations of classical EMA and the consensus problem. The TMR stage of Teacher-Student models optimizes the lightweight scaling operation to refine the model's weights and prevent overfitting or forgetting learned patterns from unlabeled data. Meanwhile, the RD strategy helps keep these models diverged to encourage the student model to explore additional patterns in unlabeled data. Our approach can be integrated into established SSOD methods and is empirically validated using two baseline methods, with and without cascade regression, to generate more reliable pseudo-labels. Extensive experiments demonstrate the superior performance of our approach over state-of-the-art SSOD methods. Specifically, the proposed approach outperforms the baseline Unbiased-Teacher-v2 (& Unbiased-Teacher-v1) method by an average mAP margin of 2.23, 2.1, and 3.36 (& 2.07, 1.9, and 3.27) on COCO-standard, COCO-additional, and Pascal VOC datasets, respectively.


Crown-CAM: Interpretable Visual Explanations for Tree Crown Detection in Aerial Images

arXiv.org Artificial Intelligence

Visual explanation of ``black-box'' models allows researchers in explainable artificial intelligence (XAI) to interpret the model's decisions in a human-understandable manner. In this paper, we propose interpretable class activation mapping for tree crown detection (Crown-CAM) that overcomes inaccurate localization & computational complexity of previous methods while generating reliable visual explanations for the challenging and dynamic problem of tree crown detection in aerial images. It consists of an unsupervised selection of activation maps, computation of local score maps, and non-contextual background suppression to efficiently provide fine-grain localization of tree crowns in scenarios with dense forest trees or scenes without tree crowns. Additionally, two Intersection over Union (IoU)-based metrics are introduced to effectively quantify both the accuracy and inaccuracy of generated explanations with respect to regions with or even without tree crowns in the image. Empirical evaluations demonstrate that the proposed Crown-CAM outperforms the Score-CAM, Augmented Score-CAM, and Eigen-CAM methods by an average IoU margin of 8.7, 5.3, and 21.7 (and 3.3, 9.8, and 16.5) respectively in improving the accuracy (and decreasing inaccuracy) of visual explanations on the challenging NEON tree crown dataset.


CHASE: Robust Visual Tracking via Cell-Level Differentiable Neural Architecture Search

arXiv.org Artificial Intelligence

A strong visual object tracker nowadays relies on its well-crafted modules, which typically consist of manually-designed network architectures to deliver high-quality tracking results. Not surprisingly, the manual design process becomes a particularly challenging barrier, as it demands sufficient prior experience, enormous effort, intuition and perhaps some good luck. Meanwhile, neural architecture search has gaining grounds in practical applications such as image segmentation, as a promising method in tackling the issue of automated search of feasible network structures. In this work, we propose a novel cell-level differentiable architecture search mechanism to automate the network design of the tracking module, aiming to adapt backbone features to the objective of a tracking network during offline training. The proposed approach is simple, efficient, and with no need to stack a series of modules to construct a network. Our approach is easy to be incorporated into existing trackers, which is empirically validated using different differentiable architecture search-based methods and tracking objectives. Extensive experimental evaluations demonstrate the superior performance of our approach over five commonly-used benchmarks. Meanwhile, our automated searching process takes 41 (18) hours for the second (first) order DARTS method on the TrackingNet dataset.


Adaptive Exploitation of Pre-trained Deep Convolutional Neural Networks for Robust Visual Tracking

arXiv.org Artificial Intelligence

Due to the automatic feature extraction procedure via multi-layer nonlinear transformations, the deep learning-based visual trackers have recently achieved great success in challenging scenarios for visual tracking purposes. Although many of those trackers utilize the feature maps from pre-trained convolutional neural networks (CNNs), the effects of selecting different models and exploiting various combinations of their feature maps are still not compared completely. To the best of our knowledge, all those methods use a fixed number of convolutional feature maps without considering the scene attributes (e.g., occlusion, deformation, and fast motion) that might occur during tracking. As a pre-requisition, this paper proposes adaptive discriminative correlation filters (DCF) based on the methods that can exploit CNN models with different topologies. First, the paper provides a comprehensive analysis of four commonly used CNN models to determine the best feature maps of each model. Second, with the aid of analysis results as attribute dictionaries, adaptive exploitation of deep features is proposed to improve the accuracy and robustness of visual trackers regarding video characteristics. Third, the generalization of the proposed method is validated on various tracking datasets as well as CNN models with similar architectures. Finally, extensive experimental results demonstrate the effectiveness of the proposed adaptive method compared with state-of-the-art visual tracking methods.