Pizarro, Oscar
Detecting Endangered Marine Species in Autonomous Underwater Vehicle Imagery Using Point Annotations and Few-Shot Learning
Doig, Heather, Pizarro, Oscar, Monk, Jacquomo, Williams, Stefan
One use of Autonomous Underwater Vehicles (AUVs) is the monitoring of habitats associated with threatened, endangered and protected marine species, such as the handfish of Tasmania, Australia. Seafloor imagery collected by AUVs can be used to identify individuals within their broader habitat context, but the sheer volume of imagery collected can overwhelm efforts to locate rare or cryptic individuals. Machine learning models can be used to identify the presence of a particular species in images using a trained object detector, but the lack of training examples reduces detection performance, particularly for rare species that may only have a small number of examples in the wild. In this paper, inspired by recent work in few-shot learning, images and annotations of common marine species are exploited to enhance the ability of the detector to identify rare and cryptic species. Annotated images of six common marine species are used in two ways. Firstly, the common species are used in a pre-training step to allow the backbone to create rich features for marine species. Secondly, a copy-paste operation is used with the common species images to augment the training data. While annotations for more common marine species are available in public datasets, they are often in point format, which is unsuitable for training an object detector. A popular semantic segmentation model efficiently generates bounding box annotations for training from the available point annotations. Our proposed framework is applied to AUV images of handfish, increasing average precision by up to 48\% compared to baseline object detection training. This approach can be applied to other objects with low numbers of annotations and promises to increase the ability to actively monitor threatened, endangered and protected species.
Feature Alignment: Rethinking Efficient Active Learning via Proxy in the Context of Pre-trained Models
Wen, Ziting, Pizarro, Oscar, Williams, Stefan
Fine-tuning the pre-trained model with active learning holds promise for reducing annotation costs. However, this combination introduces significant computational costs, particularly with the growing scale of pre-trained models. Recent research has proposed proxy-based active learning, which pre-computes features to reduce computational costs. Yet, this approach often incurs a significant loss in active learning performance, which may even outweigh the computational cost savings. In this paper, we argue the performance drop stems not only from pre-computed features' inability to distinguish between categories of labeled samples, resulting in the selection of redundant samples but also from the tendency to compromise valuable pre-trained information when fine-tuning with samples selected through the proxy model. To address this issue, we propose a novel method called aligned selection via proxy to update pre-computed features while selecting a proper training method to inherit valuable pre-training information. Extensive experiments validate that our method significantly improves the total cost of efficient active learning while maintaining computational efficiency.
Feature Space Exploration For Planning Initial Benthic AUV Surveys
Shields, Jackson, Pizarro, Oscar, Williams, Stefan B.
Special-purpose Autonomous Underwater Vehicles (AUVs) are utilised for benthic (seafloor) surveys, where the vehicle collects optical imagery of the seafloor. Due to the small-sensor footprint of the cameras and the vast areas to be surveyed, these AUVs can not feasibly collect full coverage imagery of areas larger than a few tens of thousands of square meters. Therefore it is necessary for AUV paths to sample the surveys areas sparsely, yet effectively. Broad-scale acoustic bathymetric data is readily available over large areas, and is often a useful prior of seafloor cover. As such, prior bathymetry can be used to guide AUV data collection. This research proposes methods for planning initial AUV surveys that efficiently explore a feature space representation of the bathymetry, in order to sample from a diverse set of bathymetric terrain. This will enable the AUV to visit areas that likely contain unique habitats and are representative of the entire survey site. We propose several information gathering planners that utilise a feature space exploration reward, to plan freeform paths or to optimise the placement of a survey template. The suitability of these methods to plan AUV surveys is evaluated based on the coverage of the feature space and also the ability to visit all classes of benthic habitat on the initial dive. Informative planners based on Rapidly-expanding Random Trees (RRT) and Monte-Carlo Tree Search (MCTS) were found to be the most effective. This is a valuable tool for AUV surveys as it increases the utility of initial dives. It also delivers a comprehensive training set to learn a relationship between acoustic bathymetry and visually-derived seafloor classifications.
A Semi-supervised Object Detection Algorithm for Underwater Imagery
Bijjahalli, Suraj, Pizarro, Oscar, Williams, Stefan B.
Detection of artificial objects from underwater imagery gathered by Autonomous Underwater Vehicles (AUVs) is a key requirement for many subsea applications. Real-world AUV image datasets tend to be very large and unlabelled. Furthermore, such datasets are typically imbalanced, containing few instances of objects of interest, particularly when searching for unusual objects in a scene. It is therefore, difficult to fit models capable of reliably detecting these objects. Given these factors, we propose to treat artificial objects as anomalies and detect them through a semi-supervised framework based on Variational Autoencoders (VAEs). We develop a method which clusters image data in a learned low-dimensional latent space and extracts images that are likely to contain anomalous features. We also devise an anomaly score based on extracting poorly reconstructed regions of an image. We demonstrate that by applying both methods on large image datasets, human operators can be shown candidate anomalous samples with a low false positive rate to identify objects of interest. We apply our approach to real seafloor imagery gathered by an AUV and evaluate its sensitivity to the dimensionality of the latent representation used by the VAE. We evaluate the precision-recall tradeoff and demonstrate that by choosing an appropriate latent dimensionality and threshold, we are able to achieve an average precision of 0.64 on unlabelled datasets.
NTKCPL: Active Learning on Top of Self-Supervised Model by Estimating True Coverage
Wen, Ziting, Pizarro, Oscar, Williams, Stefan
High annotation cost for training machine learning classifiers has driven extensive research in active learning and self-supervised learning. Recent research has shown that in the context of supervised learning different active learning strategies need to be applied at various stages of the training process to ensure improved performance over the random baseline. We refer to the point where the number of available annotations changes the suitable active learning strategy as the phase transition point. In this paper, we establish that when combining active learning with self-supervised models to achieve improved performance, the phase transition point occurs earlier. It becomes challenging to determine which strategy should be used for previously unseen datasets. We argue that existing active learning algorithms are heavily influenced by the phase transition because the empirical risk over the entire active learning pool estimated by these algorithms is inaccurate and influenced by the number of labeled samples. To address this issue, we propose a novel active learning strategy, neural tangent kernel clustering-pseudo-labels (NTKCPL). It estimates empirical risk based on pseudo-labels and the model prediction with NTK approximation. We analyze the factors affecting this approximation error and design a pseudo-label clustering generation method to reduce the approximation error. We validate our method on five datasets, empirically demonstrating that it outperforms the baseline methods in most cases and is valid over a wider range of training budgets.
Improved Benthic Classification using Resolution Scaling and SymmNet Unsupervised Domain Adaptation
Doig, Heather, Pizarro, Oscar, Williams, Stefan B.
Autonomous Underwater Vehicles (AUVs) conduct regular visual surveys of marine environments to characterise and monitor the composition and diversity of the benthos. The use of machine learning classifiers for this task is limited by the low numbers of annotations available and the many fine-grained classes involved. In addition to these challenges, there are domain shifts between image sets acquired during different AUV surveys due to changes in camera systems, imaging altitude, illumination and water column properties leading to a drop in classification performance for images from a different survey where some or all these elements may have changed. This paper proposes a framework to improve the performance of a benthic morphospecies classifier when used to classify images from a different survey compared to the training data. We adapt the SymmNet state-of-the-art Unsupervised Domain Adaptation method with an efficient bilinear pooling layer and image scaling to normalise spatial resolution, and show improved classification accuracy. We test our approach on two datasets with images from AUV surveys with different imaging payloads and locations. The results show that generic domain adaptation can be enhanced to produce a significant increase in accuracy for images from an AUV survey that differs from the training images.