Chowdhary, Girish
Towards Efficient Large Scale Spatial-Temporal Time Series Forecasting via Improved Inverted Transformers
Sun, Jiarui, Yeh, Chin-Chia Michael, Fan, Yujie, Dai, Xin, Fan, Xiran, Jiang, Zhimeng, Saini, Uday Singh, Lai, Vivian, Wang, Junpeng, Chen, Huiyuan, Zhuang, Zhongfang, Zheng, Yan, Chowdhary, Girish
Time series forecasting at scale presents significant challenges for modern prediction systems, particularly when dealing with large sets of synchronized series, such as in a global payment network. In such systems, three key challenges must be overcome for accurate and scalable predictions: 1) emergence of new entities, 2) disappearance of existing entities, and 3) the large number of entities present in the data. The recently proposed Inverted Transformer (iTransformer) architecture has shown promising results by effectively handling variable entities. However, its practical application in large-scale settings is limited by quadratic time and space complexity ($O(N^2)$) with respect to the number of entities $N$. In this paper, we introduce EiFormer, an improved inverted transformer architecture that maintains the adaptive capabilities of iTransformer while reducing computational complexity to linear scale ($O(N)$). Our key innovation lies in restructuring the attention mechanism to eliminate redundant computations without sacrificing model expressiveness. Additionally, we incorporate a random projection mechanism that not only enhances efficiency but also improves prediction accuracy through better feature representation. Extensive experiments on the public LargeST benchmark dataset and a proprietary large-scale time series dataset demonstrate that EiFormer significantly outperforms existing methods in both computational efficiency and forecasting accuracy. Our approach enables practical deployment of transformer-based forecasting in industrial applications where handling time series at scale is essential.
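To make the complexity claim concrete, the sketch below shows one generic way attention cost can grow linearly in the number of entities N: each entity token attends to a fixed set of M latent key/value vectors instead of to all other tokens, so the number of score evaluations is N*M rather than N*N. This is an illustration under stated assumptions, not EiFormer's actual attention restructuring or its random projection mechanism, neither of which is specified in the abstract. The function name `latent_attention` and the latent vectors are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def latent_attention(tokens, latent_keys, latent_values):
    """Attend from N entity tokens to M fixed latents (hypothetical setup).

    Cost is N * M dot products, so it is linear in N when M is constant.
    """
    scale = math.sqrt(len(latent_keys[0]))
    width = len(latent_values[0])
    outputs = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in latent_keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * value[j] for w, value in zip(weights, latent_values))
            for j in range(width)
        ])
    return outputs


# Three entity tokens, two latents: 3 * 2 score evaluations in total.
tokens = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
latent_keys = [[1.0, 0.0], [0.0, 1.0]]
latent_values = [[0.0, 2.0], [4.0, 6.0]]
result = latent_attention(tokens, latent_keys, latent_values)
```

Adding a new entity appends one row to `tokens` and costs M more score evaluations, regardless of how many entities already exist.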
BYON: Bring Your Own Networks for Digital Agriculture Applications
Sie, Emerson, Tao, Bill, Mihigo, Aganze, Karmehan, Parithimaal, Zhang, Max, Sivakumar, Arun N., Chowdhary, Girish, Vasisht, Deepak
Digital agriculture technologies rely on sensors, drones, robots, and autonomous farm equipment to improve farm yields and incorporate sustainability practices. However, the adoption of such technologies is severely limited by the lack of broadband connectivity in rural areas. We argue that farming applications do not require permanent always-on connectivity. Instead, farming activity and digital agriculture applications follow the seasonal rhythms of agriculture. Therefore, the need for connectivity is highly localized in time and space. We introduce BYON, a new connectivity model for high-bandwidth agricultural applications that relies on emerging connectivity solutions like Citizens Broadband Radio Service (CBRS) and satellite networks. BYON creates an agile connectivity solution that can be moved along a farm to create spatio-temporal connectivity bubbles. BYON incorporates a new gateway design that reacts to the presence of crops and optimizes coverage in agricultural settings. We evaluate BYON in a production farm and demonstrate its benefits.
Physics-Informed Neural Network based Damage Identification for Truss Railroad Bridges
Shajihan, Althaf, Mechitov, Kirill, Chowdhary, Girish, Spencer, Billie F. Jr
Railroad bridges are a crucial component of the U.S. freight rail system, which moves over 40 percent of the nation's freight and plays a critical role in the economy. However, aging bridge infrastructure and increasing train traffic pose significant safety hazards and risk service disruptions. The U.S. rail network includes over 100,000 railroad bridges, averaging one every 1.4 miles of track, with steel bridges comprising over 50 percent of the network's total bridge length. Early identification and assessment of damage in these bridges remain challenging tasks. This study proposes a physics-informed neural network (PINN) based approach for damage identification in steel truss railroad bridges. The approach is unsupervised, eliminating the need for the large datasets typically required by supervised methods. It uses train wheel load data and the bridge response during train crossing events as inputs for damage identification. The PINN model explicitly incorporates the governing differential equations of the linear time-varying (LTV) bridge-train system. The model employs a recurrent neural network (RNN) based architecture incorporating a custom Runge-Kutta (RK) integrator cell designed for gradient-based learning. The proposed approach updates the bridge finite element model while also quantifying damage severity and localizing the affected structural members. A case study on the Calumet Bridge in Chicago, Illinois, with simulated damage scenarios, is used to demonstrate the model's effectiveness in identifying damage while maintaining low false-positive rates. Furthermore, the damage identification pipeline is designed to seamlessly integrate prior knowledge from inspections and drone surveys, enabling context-aware updating and assessment of the bridge's condition.
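As a rough illustration of the integrator idea, the sketch below applies a classical fourth-order Runge-Kutta step to a single-degree-of-freedom oscillator, m x'' + c x' + k x = f(t), and models damage as a reduction in stiffness. This is a stand-in under stated assumptions: the paper's model is a linear time-varying bridge-train finite element system inside an RNN cell, and the names `rk4_step` and `simulate`, along with all parameter values, are hypothetical.

```python
import math


def rk4_step(state, t, dt, deriv):
    """Advance `state` by one classical RK4 step of size `dt`."""
    k1 = deriv(t, state)
    k2 = deriv(t + dt / 2, [s + dt / 2 * k for s, k in zip(state, k1)])
    k3 = deriv(t + dt / 2, [s + dt / 2 * k for s, k in zip(state, k2)])
    k4 = deriv(t + dt, [s + dt * k for s, k in zip(state, k3)])
    return [
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    ]


def simulate(mass, damping, stiffness, force, x0, v0, dt, steps):
    """Integrate m x'' + c x' + k x = f(t); returns final [x, v]."""
    def deriv(t, state):
        x, v = state
        return [v, (force(t) - damping * v - stiffness * x) / mass]

    state = [x0, v0]
    for i in range(steps):
        state = rk4_step(state, i * dt, dt, deriv)
    return state


def no_load(t):
    """Hypothetical load history: free vibration, no applied force."""
    return 0.0


# Healthy member versus one with a 20 percent stiffness loss (assumed values).
healthy = simulate(1.0, 0.0, 1.0, no_load, 1.0, 0.0, 0.01, 100)
damaged = simulate(1.0, 0.0, 0.8, no_load, 1.0, 0.0, 0.01, 100)
```

Because every operation in the step is differentiable, a stiffness parameter could in principle be fitted by gradient descent on the mismatch between simulated and measured response, which is the general idea behind embedding an integrator in a learnable cell.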
Precision Harvesting in Cluttered Environments: Integrating End Effector Design with Dual Camera Perception
Koe, Kendall, Shah, Poojan Kalpeshbhai, Walt, Benjamin, Westphal, Jordan, Marri, Samhita, Kamtikar, Shivani, Nam, James Seungbum, Uppalapati, Naveen Kumar, Krishnan, Girish, Chowdhary, Girish
Due to labor shortages in specialty crop industries, a need for robotic automation to increase agricultural efficiency and productivity has arisen. Previous manipulation systems perform well when harvesting in uncluttered and structured environments. High tunnel environments are more compact and cluttered in nature, requiring a rethinking of large form factor systems and grippers. We propose a novel codesigned framework incorporating a global detection camera and a local eye-in-hand camera that demonstrates precise localization of small fruits via closed-loop visual feedback and reliable error handling. Field experiments in high tunnels show that our system reaches an average of 85.0% of cherry tomato fruit in an average of 10.98 s. I. INTRODUCTION Decreasing food miles and increasing sustainable agricultural practices have prompted interest in urban agriculture in recent years. [Figure 1: Robot picking cherry tomatoes with our Detect2Grasp]
Active Semantic Mapping with Mobile Manipulator in Horticultural Environments
Cuaran, Jose, Ahluwalia, Kulbir Singh, Koe, Kendall, Uppalapati, Naveen Kumar, Chowdhary, Girish
Semantic maps are fundamental for robotics tasks such as navigation and manipulation. They also enable yield prediction and phenotyping in agricultural settings. In this paper, we introduce an efficient and scalable approach for active semantic mapping in horticultural environments, employing a mobile robot manipulator equipped with an RGB-D camera. Our method leverages probabilistic semantic maps to detect semantic targets, generate candidate viewpoints, and compute corresponding information gain. We present an efficient ray-casting strategy and a novel information utility function that accounts for both semantics and occlusions. The proposed approach reduces total runtime by 8% compared to previous baselines. Furthermore, our information metric surpasses other metrics in reducing multi-class entropy and improving surface coverage, particularly in the presence of segmentation noise. Real-world experiments validate our method's effectiveness but also reveal challenges such as depth sensor noise and varying environmental conditions, requiring further research.
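The viewpoint-selection loop described above can be sketched with the simplest possible utility: the summed Shannon entropy of the class distributions of the voxels a candidate view would observe. This is a minimal sketch, not the paper's utility function, which additionally accounts for semantics and occlusions in a way the abstract does not specify. The visible-voxel lists stand in for the output of ray casting, and all names and values are hypothetical.

```python
import math


def class_entropy(probs):
    """Shannon entropy in nats of one voxel's class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def view_utility(voxel_probs, visible_ids):
    """Total entropy of the voxels a candidate viewpoint would observe."""
    return sum(class_entropy(voxel_probs[i]) for i in visible_ids)


def best_view(voxel_probs, candidates):
    """Pick the candidate viewpoint with the highest utility."""
    return max(
        candidates,
        key=lambda name: view_utility(voxel_probs, candidates[name]),
    )


# Per-voxel probabilities over three classes (assumed values).
voxel_probs = {
    0: [1 / 3, 1 / 3, 1 / 3],  # completely uncertain
    1: [1.0, 0.0, 0.0],        # already certain
    2: [0.5, 0.5, 0.0],        # partly uncertain
}
# Voxels each hypothetical viewpoint would see, e.g. from ray casting.
candidates = {"left": [0, 1], "right": [1, 2]}
chosen = best_view(voxel_probs, candidates)
```

Here the "left" view wins because it observes the fully uncertain voxel, so observing it is expected to reduce map entropy the most under this simplified measure.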
MetaCropFollow: Few-Shot Adaptation with Meta-Learning for Under-Canopy Navigation
Woehrle, Thomas, Sivakumar, Arun N., Uppalapati, Naveen, Chowdhary, Girish
Autonomous under-canopy navigation faces additional challenges compared to over-canopy settings, such as tight spacing between crop rows, degraded GPS accuracy, and excessive clutter. Keypoint-based visual navigation has been shown to perform well in these conditions; however, differences between agricultural environments in lighting, season, soil, and crop type mean that a domain shift will likely be encountered at some point during robot deployment. In this paper, we explore the use of meta-learning to overcome this domain shift using a minimal amount of data. We train a base-learner that can quickly adapt to new conditions, enabling more robust navigation in low-data regimes.
CropNav: a Framework for Autonomous Navigation in Real Farms
Gasparino, Mateus Valverde, Higuti, Vitor Akihiro Hisano, Sivakumar, Arun Narenthiran, Velasquez, Andres Eduardo Baquero, Becker, Marcelo, Chowdhary, Girish
Small robots that can operate under the plant canopy can enable new possibilities in agriculture. However, unlike for larger autonomous tractors, autonomous navigation for such under-canopy robots remains an open challenge because the Global Navigation Satellite System (GNSS) is unreliable under the plant canopy. We present a hybrid navigation system that autonomously switches between different sets of sensing modalities to enable full-field navigation, both inside and outside of the crop. By choosing the appropriate path reference source, the robot can accommodate loss of GNSS signal quality and leverage row-crop structure to navigate autonomously. However, such switching can be tricky and difficult to execute at scale. Our system provides a solution by automatically switching between exteroceptive sensing-based navigation, such as Light Detection And Ranging (LiDAR) row following, and waypoint path tracking. In addition, we show how our system can detect when navigation fails and recover automatically, extending autonomous operation time and reducing the need for human intervention. Our system shows an improvement of about 750 m per intervention over GNSS-based navigation and 500 m over row-following navigation.
Fed-EC: Bandwidth-Efficient Clustering-Based Federated Learning For Autonomous Visual Robot Navigation
Gummadi, Shreya, Gasparino, Mateus V., Vasisht, Deepak, Chowdhary, Girish
Centralized learning requires data to be aggregated at a central server, which poses significant challenges in terms of data privacy and bandwidth consumption. Federated learning presents a compelling alternative; however, vanilla federated learning methods deployed in robotics aim to learn a single global model that works well across all robots. In practice, one model may not be well suited for robots deployed in various environments. This paper proposes Federated-EmbedCluster (Fed-EC), a clustering-based federated learning framework deployed with vision-based autonomous robot navigation in diverse outdoor environments. The framework addresses a key federated learning challenge: the deteriorating performance of a single global model due to non-IID data across real-world robots. Extensive real-world experiments validate that Fed-EC reduces the communication size by 23x for each robot while matching the performance of centralized learning for goal-oriented navigation and outperforming local learning. Fed-EC can transfer previously learnt models to new robots that join the cluster.
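The general shape of clustering-based federated averaging can be sketched as follows: each robot is summarized by an embedding vector, robots are assigned to the nearest cluster centroid, and model weights are averaged within each cluster rather than globally. This is a sketch under stated assumptions, not Fed-EC itself; the abstract does not say how embeddings are computed or how clusters are formed, so the fixed centroids, the function names, and the toy weight vectors are all hypothetical.

```python
def nearest_centroid(centroids, point):
    """Index of the centroid closest to `point` in squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((c - p) ** 2 for c, p in zip(centroids[i], point)),
    )


def cluster_average(embeddings, weights, centroids):
    """Average each robot's weight vector with others in the same cluster."""
    groups = {}
    for embedding, weight in zip(embeddings, weights):
        cluster_id = nearest_centroid(centroids, embedding)
        groups.setdefault(cluster_id, []).append(weight)
    return {
        cluster_id: [sum(column) / len(members) for column in zip(*members)]
        for cluster_id, members in groups.items()
    }


# Three robots: two in similar environments, one in a different one (assumed).
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
weights = [[1.0, 1.0], [3.0, 3.0], [10.0, 20.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]
cluster_models = cluster_average(embeddings, weights, centroids)
```

A newly joining robot would compute its embedding, call `nearest_centroid`, and start from that cluster's averaged model, which mirrors the transfer behavior the abstract describes at a high level.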
AdaCropFollow: Self-Supervised Online Adaptation for Visual Under-Canopy Navigation
Sivakumar, Arun N., Magistri, Federico, Gasparino, Mateus V., Behley, Jens, Stachniss, Cyrill, Chowdhary, Girish
Under-canopy agricultural robots can enable various applications like precise monitoring, spraying, weeding, and plant manipulation tasks throughout the growing season. Autonomous navigation under the canopy is challenging due to the degradation in accuracy of RTK-GPS and the large variability in the visual appearance of the scene over time. In prior work, we developed a supervised learning-based perception system with semantic keypoint representation and deployed this in various field conditions. A large number of failures of this system can be attributed to the inability of the perception model to adapt to the domain shift encountered during deployment. In this paper, we propose a self-supervised online adaptation method for adapting the semantic keypoint representation using a visual foundational model, geometric prior, and pseudo labeling. Our preliminary experiments show that with minimal data and fine-tuning of parameters, the keypoint prediction model trained with labels on the source domain can be adapted in a self-supervised manner to various challenging target domains onboard the robot computer using our method. This can enable fully autonomous row-following capability in under-canopy robots across fields and crops without requiring human intervention.
W-RIZZ: A Weakly-Supervised Framework for Relative Traversability Estimation in Mobile Robotics
Schreiber, Andre, Sivakumar, Arun N., Du, Peter, Gasparino, Mateus V., Chowdhary, Girish, Driggs-Campbell, Katherine
Successful deployment of mobile robots in unstructured domains requires an understanding of the environment and terrain to avoid hazardous areas, getting stuck, and colliding with obstacles. Traversability estimation--which predicts where in the environment a robot can travel--is one prominent approach that tackles this problem. Existing geometric methods may ignore important semantic considerations, while semantic segmentation approaches involve a tedious labeling process. Recent self-supervised methods reduce labeling tedium, but require additional data or models and tend to struggle to explicitly label untraversable areas. To address these limitations, we introduce a weakly-supervised method for relative traversability estimation. Our method involves manually annotating the relative traversability of a small number of point pairs, which significantly reduces labeling effort compared to traditional segmentation-based methods and avoids the limitations of self-supervised methods. We further improve the performance of our method through a novel cross-image labeling strategy and loss function. We demonstrate the viability and performance of our method through deployment on a mobile robot in outdoor environments.
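Supervision from annotated point pairs can be illustrated with a generic pairwise ranking loss: given predicted traversability scores at two points and a label saying which is more traversable, the loss penalizes scores ordered the wrong way. This is a standard logistic ranking loss used as an example, not the paper's novel loss function or its cross-image labeling strategy. The label encoding and the function name are assumptions.

```python
import math


def pairwise_ranking_loss(score_a, score_b, label):
    """Loss for one annotated point pair.

    label is +1 if point a is more traversable than b, -1 if less,
    and 0 if the two are judged about equally traversable (assumed encoding).
    """
    diff = score_a - score_b
    if label == 0:
        # Similar pair: pull the two predicted scores together.
        return diff * diff
    # Ordered pair: logistic loss on the signed score difference.
    return math.log1p(math.exp(-label * diff))


# Hypothetical scores: point a (grass) predicted 2.0, point b (bush) 0.0.
agrees = pairwise_ranking_loss(2.0, 0.0, +1)
disagrees = pairwise_ranking_loss(0.0, 2.0, +1)
```

A prediction that agrees with the annotation yields a small loss and a reversed one yields a large loss, so only relative judgments between points are needed rather than dense per-pixel labels.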