Freeman, Harry
Transformer-Based Spatio-Temporal Association of Apple Fruitlets
Freeman, Harry, Kantor, George
In this paper, we present a transformer-based method to spatio-temporally associate apple fruitlets in stereo images collected on different days and from different camera poses. State-of-the-art association methods in agriculture are designed to match larger crops using either high-resolution point clouds or temporally stable features, both of which are difficult to obtain for smaller fruit in the field. To address these challenges, we propose a transformer-based architecture that encodes the shape and position of each fruitlet, and propagates and refines these features through a series of transformer encoder layers with alternating self- and cross-attention. We demonstrate that our method achieves an F1-score of 92.4% on data collected in a commercial apple orchard and outperforms all baselines and ablations.

The global food supply is under increasing pressure as a result of climate change, population growth, and worsening labor shortages. To keep up with demand, agriculturalists are turning to computer vision-based systems that can automate a variety of laborious and time-intensive tasks such as harvesting [1], pruning [2], counting [3], and crop modeling [4]. These automated solutions not only improve efficiency, but also help mitigate the challenges posed by labor shortages and growing food demand, ensuring that critical agricultural tasks can be performed reliably at scale. One particularly important but challenging task to automate is monitoring the growth and development of individual plants and fruits. Such monitoring enables agricultural specialists to make more informed real-time crop management decisions and helps with downstream tasks such as phenotyping [5], disease management [6], and yield prediction [7].
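A minimal sketch of the alternating self- and cross-attention scheme described in the abstract above, assuming PyTorch; the module and tensor names (AlternatingAttentionEncoder, feats_a, feats_b) and layer counts are illustrative assumptions, not the paper's released implementation.

```python
# Hypothetical sketch: alternating self- and cross-attention over per-fruitlet
# feature tokens from two capture days, producing a pairwise similarity matrix.
import torch
import torch.nn as nn


class AlternatingAttentionEncoder(nn.Module):
    """Refines fruitlet descriptors within and across two images."""

    def __init__(self, dim=128, heads=4, layers=4):
        super().__init__()
        self.self_layers = nn.ModuleList(
            [nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(layers)])
        self.cross_layers = nn.ModuleList(
            [nn.MultiheadAttention(dim, heads, batch_first=True) for _ in range(layers)])
        self.norms = nn.ModuleList([nn.LayerNorm(dim) for _ in range(2 * layers)])

    def forward(self, feats_a, feats_b):
        # feats_a: (1, Na, dim) fruitlet tokens from day A; feats_b: (1, Nb, dim) from day B.
        for i, (self_attn, cross_attn) in enumerate(zip(self.self_layers, self.cross_layers)):
            # Self-attention: refine features within each day independently.
            feats_a = self.norms[2 * i](feats_a + self_attn(feats_a, feats_a, feats_a)[0])
            feats_b = self.norms[2 * i](feats_b + self_attn(feats_b, feats_b, feats_b)[0])
            # Cross-attention: exchange information between the two days.
            new_a = self.norms[2 * i + 1](feats_a + cross_attn(feats_a, feats_b, feats_b)[0])
            new_b = self.norms[2 * i + 1](feats_b + cross_attn(feats_b, feats_a, feats_a)[0])
            feats_a, feats_b = new_a, new_b
        # Similarity matrix (1, Na, Nb) used to score candidate associations.
        return feats_a @ feats_b.transpose(1, 2)
```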
Toward Semantic Scene Understanding for Fine-Grained 3D Modeling of Plants
Qadri, Mohamad, Freeman, Harry, Schneider, Eric, Kantor, George
Agricultural robotics is an active research area due to global population growth and expectations of food and labor shortages. Robots can potentially help with tasks such as pruning, harvesting, phenotyping, and plant modeling. However, agricultural automation is hampered by the difficulty in creating high resolution 3D semantic maps in the field that would allow for safe manipulation and navigation. In this paper, we build toward solutions for this issue and showcase how the use of semantics and environmental priors can help in constructing accurate 3D maps for the target application of sorghum. Specifically, we 1) use sorghum seeds as semantic landmarks to build a visual Simultaneous Localization and Mapping (SLAM) system that enables us to map 78% of a sorghum range on average, compared to 38% with ORB-SLAM2; and 2) use seeds as semantic features to improve 3D reconstruction of a full sorghum panicle from images taken by a robotic in-hand camera.
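To illustrate the idea of seeds as semantic landmarks, here is a simplified sketch of one possible localization step using OpenCV: estimating the camera pose from matched seed-center landmarks with PnP. The function name, thresholds, and inputs are assumptions for illustration; the paper's SLAM system is not reduced to this single call.

```python
# Simplified illustration (assumed interfaces, not the paper's SLAM pipeline):
# treating detected seed centroids as semantic landmarks for camera localization.
import numpy as np
import cv2


def localize_from_seed_landmarks(landmarks_3d, detections_2d, K):
    """Estimate camera pose from matched seed landmarks.

    landmarks_3d: (N, 3) seed centers in the map frame (from a prior reconstruction).
    detections_2d: (N, 2) matching seed centroids detected in the current image.
    K: (3, 3) camera intrinsic matrix.
    """
    if len(landmarks_3d) < 4:        # PnP needs at least 4 correspondences
        return None
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        landmarks_3d.astype(np.float32),
        detections_2d.astype(np.float32),
        K, None, reprojectionError=3.0)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)       # rotation from map frame to camera frame
    return R, tvec                   # camera pose (map -> camera)
```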
Autonomous Apple Fruitlet Sizing and Growth Rate Tracking using Computer Vision
Freeman, Harry, Qadri, Mohamad, Silwal, Abhisesh, O'Connor, Paul, Rubinstein, Zachary, Cooley, Daniel, Kantor, George
In this paper, we present a computer vision-based approach to measure the sizes and growth rates of apple fruitlets. Measuring the growth rates of apple fruitlets is important because it allows apple growers to determine when to apply chemical thinners to their crops in order to optimize yield. The current practice of obtaining growth rates involves using calipers to record sizes of fruitlets across multiple days. Due to the number of fruitlets that need to be sized, this method is laborious, time-consuming, and prone to human error. With images collected by a hand-held stereo camera, our system segments, clusters, and fits ellipses to fruitlets to measure their diameters. The growth rates are then calculated by temporally associating clustered fruitlets across days. We provide quantitative results on data collected in an apple orchard, and demonstrate that our system is able to predict abscise rates within 3.5% of the current method with a six-fold improvement in speed, while requiring significantly less manual effort. Moreover, we provide results on images captured by a robotic system in the field, and discuss the next steps required to make the process fully autonomous.
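A minimal sketch of the ellipse-fitting step, assuming OpenCV and a pinhole camera model; the function name, inputs (a binary fruitlet mask, a median stereo depth, and the focal length), and unit conversions are illustrative assumptions rather than the released pipeline.

```python
# Hypothetical sketch: fit an ellipse to a segmented fruitlet mask and convert
# its minor axis from pixels to a metric diameter using stereo depth.
import numpy as np
import cv2


def fruitlet_diameter_mm(mask, depth_m, fx):
    """mask: binary fruitlet segmentation; depth_m: median stereo depth (meters);
    fx: focal length in pixels. Returns an estimated diameter in millimeters."""
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None
    contour = max(contours, key=cv2.contourArea)
    if len(contour) < 5:                       # fitEllipse needs at least 5 points
        return None
    (cx, cy), (ax1_px, ax2_px), angle = cv2.fitEllipse(contour)
    # Pinhole model: pixel extent * depth / focal length gives metric size.
    return 1000.0 * min(ax1_px, ax2_px) * depth_m / fx
```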
Autonomous Apple Fruitlet Sizing with Next Best View Planning
Freeman, Harry, Kantor, George
In this paper, we present a next-best-view planning approach to autonomously size apple fruitlets. State-of-the-art viewpoint planners in agriculture are designed to size larger and more sparsely populated fruit. They rely on lower-resolution maps and sizing methods that do not generalize to smaller fruit sizes. To overcome these limitations, our method combines viewpoint sampling around semantically labeled regions of interest with an attention-guided information gain mechanism that more strategically selects viewpoints targeting the small fruits' volume. Additionally, we integrate a dual-map representation of the environment that both speeds up expensive ray casting operations and maintains the high occupancy resolution required to informatively plan around the fruit. When sizing, a robust estimation and graph clustering approach is introduced to associate fruit detections across images. Through simulated experiments, we demonstrate that our viewpoint planner improves sizing accuracy compared to the state of the art and to ablations. We also provide quantitative results on data collected by a real robotic system in the field.
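An illustrative sketch of the graph-clustering idea for associating fruit detections across images, assuming networkx and a simple Euclidean link threshold; the function name, threshold value, and inputs are assumptions, and the paper's robust estimation step is not shown.

```python
# Hypothetical sketch: link per-image fruit detections whose 3D centers are
# mutually close and take connected components as individual fruit.
import itertools
import numpy as np
import networkx as nx


def cluster_detections(centers_3d, image_ids, max_dist=0.01):
    """centers_3d: list of (3,) fruit-center estimates in a common frame;
    image_ids: image index of each detection; max_dist: link threshold (meters)."""
    g = nx.Graph()
    g.add_nodes_from(range(len(centers_3d)))
    for i, j in itertools.combinations(range(len(centers_3d)), 2):
        if image_ids[i] == image_ids[j]:
            continue  # a fruit appears at most once per image
        if np.linalg.norm(np.asarray(centers_3d[i]) - np.asarray(centers_3d[j])) < max_dist:
            g.add_edge(i, j)
    # Each connected component groups detections of one fruit across viewpoints.
    return [sorted(c) for c in nx.connected_components(g)]
```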
3D Reconstruction-Based Seed Counting of Sorghum Panicles for Agricultural Inspection
Freeman, Harry, Schneider, Eric, Kim, Chung Hee, Lee, Moonyoung, Kantor, George
In this paper, we present a method for creating high-quality 3D models of sorghum panicles for phenotyping in breeding experiments. This is achieved with a novel reconstruction approach that uses seeds as semantic landmarks in both 2D and 3D. To evaluate the performance, we develop a new metric for assessing the quality of reconstructed point clouds without a ground-truth point cloud. Finally, a counting method is presented in which the density of seed centers in the 3D model allows 2D counts from multiple views to be effectively combined into a whole-panicle count. We demonstrate that using this method to estimate seed count and weight for sorghum outperforms count extrapolation from 2D images, an approach used in most state-of-the-art methods for seeds and grains of comparable size.
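A simplified sketch of one ingredient of such a pipeline, assuming scikit-learn: clustering 3D seed-center detections back-projected from many views so that repeated detections of the same seed collapse into one count. The function name and clustering parameters are assumptions; the paper's actual method combines per-view 2D counts through the 3D seed-center density rather than this direct clustering.

```python
# Hypothetical sketch: DBSCAN over 3D seed-center detections to estimate a
# whole-panicle seed count from multiple views.
import numpy as np
from sklearn.cluster import DBSCAN


def count_seeds_from_centers(seed_centers, eps_m=0.002, min_detections=2):
    """seed_centers: (N, 3) seed-center detections aggregated across views.
    Nearby detections of the same physical seed fall into one cluster."""
    labels = DBSCAN(eps=eps_m, min_samples=min_detections).fit_predict(
        np.asarray(seed_centers, dtype=float))
    return len(set(labels) - {-1})   # number of clusters, ignoring the noise label -1
```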