Using 3D reconstruction from image motion to predict total leaf area in dwarf tomato plants

Usenko, Dmitrii, Helman, David, Giladi, Chen

arXiv.org Artificial Intelligence

Accurate estimation of total leaf area (TLA) is essential for assessing plant growth, photosynthetic activity, and transpiration, but remains a challenge for bushy plants like dwarf tomatoes. Traditional destructive methods and imaging-based techniques often fall short due to labor intensity, plant damage, or the inability to capture complex canopies. This study evaluated a non-destructive method combining sequential 3D reconstructions from RGB images and machine learning to estimate TLA for three dwarf tomato cultivars--Mohamed, Hahms Gelbe Topftomate, and Red Robin--grown under controlled greenhouse conditions. Two experiments, conducted in spring-summer and autumn-winter, included 73 plants, yielding 418 TLA measurements using an "onion" approach, in which layers of leaves were sequentially removed and scanned. High-resolution videos were recorded from multiple angles for each plant, and 500 frames were extracted per plant for 3D reconstruction. Point clouds were created and processed, four reconstruction algorithms (Alpha Shape, Marching Cubes, Poisson, and Ball Pivoting) were tested, and the resulting meshes were evaluated using seven regression models: Multivariable Linear Regression (MLR), Lasso Regression (Lasso), Ridge Regression (Ridge-Reg), Elastic Net Regression (ENR), Random Forest (RF), extreme gradient boosting (XGBoost), and Multilayer Perceptron (MLP). The Alpha Shape reconstruction (α = 3) combined with XGBoost yielded the best performance, achieving an R² of 0.80 and an MAE of 489 cm². These findings demonstrate the robustness of our approach across variable environmental conditions and canopy structures. This scalable, automated TLA estimation method is particularly suited for urban farming and precision agriculture, offering practical implications for automated pruning, improved resource efficiency, and sustainable food production. Keywords: Total leaf area, dwarf tomato, point cloud, mesh reconstruction, machine learning, precision agriculture

1. Introduction

Total leaf area (TLA) is a comprehensive metric describing a plant's growth and functioning. It is a primary indicator of the plant's photosynthetic activity and transpiration capacity. Normalized by the plant's surface area, TLA may provide information on canopy structure, which is crucial for understanding the plant's energy and resource efficiency. For example, reduced TLA is a sign of stress (Dong et al., 2019), while excessive biomass, indicated by a higher TLA, signifies lower water use efficiency (Glenn et al., 2006). Farmers often prune commercial crops to reduce TLA and increase productivity (Budiarto et al., 2023). However, measuring and finding the optimum TLA of a crop are challenging tasks.
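As a rough illustration of the regression stage described in the abstract, the sketch below fits a ridge regression (one of the seven models evaluated) that maps mesh-derived geometric features to measured leaf area. This is a minimal stand-in, not the study's pipeline: the feature names and toy values are hypothetical, and the best-performing model in the paper was XGBoost, not ridge.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam*I)^(-1) X^T y.
    A constant column is appended so the intercept is fitted too."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # add intercept column
    n = Xb.shape[1]
    return np.linalg.solve(Xb.T @ Xb + lam * np.eye(n), Xb.T @ y)

def predict(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

# Hypothetical mesh-derived features per plant:
# [mesh surface area (cm^2), bounding-box volume (cm^3), vertex count / 1000]
X = np.array([[1500.0,  9000.0, 12.0],
              [2100.0, 14000.0, 18.0],
              [ 900.0,  5000.0,  7.0],
              [2800.0, 20000.0, 25.0]])
y = np.array([1400.0, 2000.0, 850.0, 2700.0])  # measured TLA (cm^2)

w = fit_ridge(X, y, lam=0.1)
preds = predict(w, X)
```

The closed-form solve is enough for a handful of features; a tree ensemble such as XGBoost, as used in the paper, would replace `fit_ridge`/`predict` while keeping the same feature-to-TLA framing.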


TLA: Tactile-Language-Action Model for Contact-Rich Manipulation

Hao, Peng, Zhang, Chaofan, Li, Dingzhe, Cao, Xiaoge, Hao, Xiaoshuai, Cui, Shaowei, Wang, Shuo

arXiv.org Artificial Intelligence

Significant progress has been made in vision-language models. However, language-conditioned robotic manipulation for contact-rich tasks remains underexplored, particularly in terms of tactile sensing. To address this gap, we introduce the Tactile-Language-Action (TLA) model, which effectively processes sequential tactile feedback via cross-modal language grounding to enable robust policy generation in contact-intensive scenarios. In addition, we construct a comprehensive dataset that contains 24k pairs of tactile-action instruction data, customized for fingertip peg-in-hole assembly, providing essential resources for TLA training and evaluation. Our results show that TLA significantly outperforms traditional imitation learning methods (e.g., diffusion policy) in terms of effective action generation and action accuracy, while demonstrating strong generalization capabilities by achieving over 85% success rate on previously unseen assembly clearances and peg shapes. We publicly release all data and code in the hope of advancing research in language-conditioned tactile manipulation skill learning. Project website: https://sites.google.com/view/tactile-language-action/


Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models

Adiga, Rishabh, Nushi, Besmira, Chandrasekaran, Varun

arXiv.org Artificial Intelligence

We explore the internal mechanisms of how bias emerges in large language models (LLMs) when provided with ambiguous comparative prompts: inputs that compare or enforce choosing between two or more entities without providing clear context for preference. Most approaches for bias mitigation focus on either post-hoc analysis or data augmentation. However, these are transient solutions that do not address the root cause: the model itself. Numerous prior works show the influence of the attention module in steering generations. We believe that analyzing attention is also crucial for understanding bias, as it provides insight into how the LLM distributes its focus across different entities and how this contributes to biased decisions. To this end, we first introduce a metric to quantify the LLM's preference for one entity over another. We then propose ATLAS (Attention-based Targeted Layer Analysis and Scaling), a technique to localize bias to specific layers of the LLM by analyzing attention scores and then reduce bias by scaling attention in these biased layers. To evaluate our method, we conduct experiments across 3 datasets (BBQ, Crows-Pairs, and WinoGender) using GPT-2 XL (1.5B), GPT-J (6B), LLaMA-2 (7B), and LLaMA-3 (8B). Our experiments demonstrate that bias is concentrated in the later layers, typically around the last third. We also show that ATLAS effectively mitigates bias through targeted interventions without compromising downstream performance, with an average increase of only 0.82% in perplexity when the intervention is applied. We see an average improvement of 0.28 points in the bias score across all the datasets.
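The scaling intervention described above can be sketched in isolation: within a selected layer, dampen the attention mass directed at one entity's tokens by a factor in (0, 1) and renormalize so each row remains a probability distribution. The function below is a minimal numpy sketch of that single step, not the authors' implementation; the toy matrix and entity indices are assumptions for illustration.

```python
import numpy as np

def scale_entity_attention(attn, entity_idx, lam):
    """Downscale attention on the tokens of a preferred entity and
    renormalize rows so they stay valid probability distributions.

    attn:       (seq_len, seq_len) row-stochastic attention matrix
    entity_idx: column indices of the entity's tokens
    lam:        scaling factor in (0, 1]
    """
    scaled = attn.copy()
    scaled[:, entity_idx] *= lam                  # dampen focus on the entity
    scaled /= scaled.sum(axis=1, keepdims=True)   # renormalize each row
    return scaled

# Toy 4-token attention matrix; suppose tokens 2-3 belong to the
# entity the model over-prefers.
attn = np.array([[0.10, 0.20, 0.40, 0.30],
                 [0.25, 0.25, 0.25, 0.25],
                 [0.40, 0.10, 0.30, 0.20],
                 [0.20, 0.30, 0.30, 0.20]])
out = scale_entity_attention(attn, [2, 3], lam=0.5)
```

In the actual method this scaling would be applied only inside the layers that the attention-score analysis localizes as biased, leaving the rest of the network untouched.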


RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models

Huang, Jie, Ping, Wei, Xu, Peng, Shoeybi, Mohammad, Chang, Kevin Chen-Chuan, Catanzaro, Bryan

arXiv.org Artificial Intelligence

In this paper, we investigate the in-context learning ability of retrieval-augmented encoder-decoder language models. We first conduct a comprehensive analysis of the state-of-the-art ATLAS model and identify its limitations in in-context learning, primarily due to a mismatch between pretraining and testing, as well as a restricted context length. To address these issues, we propose RAVEN, a model that combines retrieval-augmented masked language modeling and prefix language modeling. We further introduce Fusion-in-Context Learning to enhance the few-shot performance by enabling the model to leverage more in-context examples without requiring additional training or model modifications. Through extensive experiments, we demonstrate that RAVEN significantly outperforms ATLAS and achieves results comparable to the most advanced language models in certain scenarios, despite having substantially fewer parameters. Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning and encourages further research in this direction.
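Fusion-in-Context Learning, as described, lets the model see more few-shot examples than a single encoder context could hold by spreading them across the retrieved passages before the decoder fuses the encodings. The sketch below shows one hypothetical way such encoder inputs might be assembled; the prompt format, function name, and sharding scheme are assumptions for illustration, not RAVEN's actual formatting.

```python
def build_encoder_inputs(passages, examples, query, per_passage=2):
    """Pair each retrieved passage with a different slice of the
    in-context examples, so all examples are covered collectively
    even though no single encoder input contains them all."""
    inputs = []
    for i, passage in enumerate(passages):
        start = (i * per_passage) % max(len(examples), 1)
        shard = examples[start:start + per_passage]
        demo = "\n".join(f"Q: {q}\nA: {a}" for q, a in shard)
        inputs.append(f"{demo}\nContext: {passage}\nQ: {query}\nA:")
    return inputs

passages = ["Paris is the capital of France.",
            "Berlin is the capital of Germany."]
examples = [("Capital of Spain?", "Madrid"),
            ("Capital of Italy?", "Rome"),
            ("Capital of Japan?", "Tokyo"),
            ("Capital of Egypt?", "Cairo")]
inputs = build_encoder_inputs(passages, examples, "Capital of France?")
```

Each string would be encoded separately and the encodings concatenated for the decoder, in the fusion-in-decoder style that retrieval-augmented encoder-decoder models such as ATLAS build on.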