Antarctica
Autonomous Forest Inventory with Legged Robots: System Design and Field Deployment
Mattamala, Matías, Chebrolu, Nived, Casseau, Benoit, Freißmuth, Leonard, Frey, Jonas, Tuna, Turcan, Hutter, Marco, Fallon, Maurice
We present a solution for autonomous forest inventory with a legged robotic platform. Compared to their wheeled and aerial counterparts, legged platforms offer an attractive balance of endurance and low soil impact for forest applications. In this paper, we present the complete system architecture of our forest inventory solution which includes state estimation, navigation, mission planning, and real-time tree segmentation and trait estimation. We present preliminary results for three campaigns in forests in Finland and the UK and summarize the main outcomes, lessons, and challenges. Our UK experiment at the Forest of Dean with the ANYmal D legged platform, achieved an autonomous survey of a 0.96 hectare plot in 20 min, identifying over 100 trees with typical DBH accuracy of 2 cm.
Unlock the Future of Autonomous Drones with Innovative Secure Runtime Assurance (SRTA)
By submitting this content request, I have legitimate interest in the content and agree that Technology Innovation Institute, their partners, and the creators of any other content I have selected may contact me regarding news, products, and services that may be of interest to me. By submitting this content request, I have legitimate interest in the content and agree that Technology Innovation Institute, their partners, and the creators of any other content I have selected may contact me regarding news, products, and services that may be of interest to me. I agree to the IEEE Privacy Policy Are you an IEEE member?
KnowTuning: Knowledge-aware Fine-tuning for Large Language Models
Lyu, Yougang, Yan, Lingyong, Wang, Shuaiqiang, Shi, Haibo, Yin, Dawei, Ren, Pengjie, Chen, Zhumin, de Rijke, Maarten, Ren, Zhaochun
Despite their success at many natural language processing (NLP) tasks, large language models still struggle to effectively leverage knowledge for knowledge-intensive tasks, manifesting limitations such as generating incomplete, non-factual, or illogical answers. These limitations stem from inadequate knowledge awareness of LLMs during vanilla fine-tuning. To address these problems, we propose a knowledge-aware fine-tuning (KnowTuning) method to improve fine-grained and coarse-grained knowledge awareness of LLMs. We devise a fine-grained knowledge augmentation stage to train LLMs to identify difficult fine-grained knowledge in answers. We also propose a coarse-grained knowledge comparison stage to train LLMs to distinguish between reliable and unreliable knowledge, in three aspects: completeness, factuality, and logicality. Extensive experiments on both generic and medical question answering (QA) datasets confirm the effectiveness of KnowTuning, through automatic and human evaluations, across various sizes of LLMs. We further verify that KnowTuning generates more facts with less factual error rate under fine-grained facts evaluation.
BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models
Wiland, Jacek, Ploner, Max, Akbik, Alan
Knowledge probing assesses to which degree a language model (LM) has successfully learned relational knowledge during pre-training. Probing is an inexpensive way to compare LMs of different sizes and training configurations. However, previous approaches rely on the objective function used in pre-training LMs and are thus applicable only to masked or causal LMs. As a result, comparing different types of LMs becomes impossible. To address this, we propose an approach that uses an LM's inherent ability to estimate the log-likelihood of any given textual statement. We carefully design an evaluation dataset of 7,731 instances (40,916 in a larger variant) from which we produce alternative statements for each relational fact, one of which is correct. We then evaluate whether an LM correctly assigns the highest log-likelihood to the correct statement. Our experimental evaluation of 22 common LMs shows that our proposed framework, BEAR, can effectively probe for knowledge across different LM types. We release the BEAR datasets and an open-source framework that implements the probing approach to the research community to facilitate the evaluation and development of LMs.
Privacy Backdoors: Stealing Data with Corrupted Pretrained Models
Feng, Shanglun, Tramèr, Florian
Practitioners commonly download pretrained machine learning models from open repositories and finetune them to fit specific applications. We show that this practice introduces a new risk of privacy backdoors. By tampering with a pretrained model's weights, an attacker can fully compromise the privacy of the finetuning data. We show how to build privacy backdoors for a variety of models, including transformers, which enable an attacker to reconstruct individual finetuning samples, with a guaranteed success! We further show that backdoored models allow for tight privacy attacks on models trained with differential privacy (DP). The common optimistic practice of training DP models with loose privacy guarantees is thus insecure if the model is not trusted. Overall, our work highlights a crucial and overlooked supply chain attack on machine learning privacy.
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Henschel, Roberto, Khachatryan, Levon, Hayrapetyan, Daniil, Poghosyan, Hayk, Tadevosyan, Vahram, Wang, Zhangyang, Navasardyan, Shant, Shi, Humphrey
Text-to-video diffusion models enable the generation of high-quality videos that follow text instructions, making it easy to create diverse and individual content. However, existing approaches mostly focus on high-quality short video generation (typically 16 or 24 frames), ending up with hard-cuts when naively extended to the case of long video synthesis. To overcome these limitations, we introduce StreamingT2V, an autoregressive approach for long video generation of 80, 240, 600, 1200 or more frames with smooth transitions. The key components are:(i) a short-term memory block called conditional attention module (CAM), which conditions the current generation on the features extracted from the previous chunk via an attentional mechanism, leading to consistent chunk transitions, (ii) a long-term memory block called appearance preservation module, which extracts high-level scene and object features from the first video chunk to prevent the model from forgetting the initial scene, and (iii) a randomized blending approach that enables to apply a video enhancer autoregressively for infinitely long videos without inconsistencies between chunks. Experiments show that StreamingT2V generates high motion amount. In contrast, all competing image-to-video methods are prone to video stagnation when applied naively in an autoregressive manner. Thus, we propose with StreamingT2V a high-quality seamless text-to-long video generator that outperforms competitors with consistency and motion. Our code will be available at: https://github.com/Picsart-AI-Research/StreamingT2V
A Parallel Workflow for Polar Sea-Ice Classification using Auto-labeling of Sentinel-2 Imagery
Iqrah, Jurdana Masuma, Wang, Wei, Xie, Hongjie, Prasad, Sushil
The observation of the advancing and retreating pattern of polar sea ice cover stands as a vital indicator of global warming. This research aims to develop a robust, effective, and scalable system for classifying polar sea ice as thick/snow-covered, young/thin, or open water using Sentinel-2 (S2) images. Since the S2 satellite is actively capturing high-resolution imagery over the earth's surface, there are lots of images that need to be classified. One major obstacle is the absence of labeled S2 training data (images) to act as the ground truth. We demonstrate a scalable and accurate method for segmenting and automatically labeling S2 images using carefully determined color thresholds. We employ a parallel workflow using PySpark to scale and achieve 9-fold data loading and 16-fold map-reduce speedup on auto-labeling S2 images based on thin cloud and shadow-filtered color-based segmentation to generate label data. The auto-labeled data generated from this process are then employed to train a U-Net machine learning model, resulting in good classification accuracy. As training the U-Net classification model is computationally heavy and time-consuming, we distribute the U-Net model training to scale it over 8 GPUs using the Horovod framework over a DGX cluster with a 7.21x speedup without affecting the accuracy of the model. Using the Antarctic's Ross Sea region as an example, the U-Net model trained on auto-labeled data achieves a classification accuracy of 98.97% for auto-labeled training datasets when the thin clouds and shadows from the S2 images are filtered out.
Smooth-projected Neighborhood Pursuit for High-dimensional Nonparanormal Graph Estimation
We introduce a new learning algorithm, named smooth-projected neighborhood pursuit, for estimating high dimensional undirected graphs. In particularly, we focus on the nonparanormal graphical model and provide theoretical guarantees for graph estimation consistency. In addition to new computational and theoretical analysis, we also provide an alternative view to analyze the tradeoff between computational efficiency and statistical error under a smoothing optimization framework. Numerical results on both synthetic and real datasets are provided to support our theory.
Laser-to-Vehicle Extrinsic Calibration in Low-Observability Scenarios for Subsea Mapping
Hitchcox, Thomas, Forbes, James Richard
Laser line scanners are increasingly being used in the subsea industry for high-resolution mapping and infrastructure inspection. However, calibrating the 3D pose of the scanner relative to the vehicle is a perennial source of confusion and frustration for industrial surveyors. This work describes three novel algorithms for laser-to-vehicle extrinsic calibration using naturally occurring features. Each algorithm makes a different assumption on the quality of the vehicle trajectory estimate, enabling good calibration results in a wide range of situations. A regularization technique is used to address low-observability scenarios frequently encountered in practice with large, rotationally stable subsea vehicles. Experimental results are provided for two field datasets, including the recently discovered wreck of the Endurance.
Fast Ergodic Search with Kernel Functions
Sun, Muchen, Gaggar, Ayush, Trautman, Peter, Murphey, Todd
Ergodic search enables optimal exploration of an information distribution while guaranteeing the asymptotic coverage of the search space. However, current methods typically have exponential computation complexity in the search space dimension and are restricted to Euclidean space. We introduce a computationally efficient ergodic search method. Our contributions are two-fold. First, we develop a kernel-based ergodic metric and generalize it from Euclidean space to Lie groups. We formally prove the proposed metric is consistent with the standard ergodic metric while guaranteeing linear complexity in the search space dimension. Secondly, we derive the first-order optimality condition of the kernel ergodic metric for nonlinear systems, which enables efficient trajectory optimization. Comprehensive numerical benchmarks show that the proposed method is at least two orders of magnitude faster than the state-of-the-art algorithm. Finally, we demonstrate the proposed algorithm with a peg-in-hole insertion task. We formulate the problem as a coverage task in the space of SE(3) and use a 30-second-long human demonstration as the prior distribution for ergodic coverage. Ergodicity guarantees the asymptotic solution of the peg-in-hole problem so long as the solution resides within the prior information distribution, which is seen in the 100\% success rate.