 Agricultural Chemicals


Supplementary Materials: AGMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark
Aruna Gauba, Irene Pi, Yunze Man, Ziqi Pang, Vikram S. Adve, Yu-Xiong Wang

Neural Information Processing Systems

Our evaluation and analysis are conducted mainly on the group of models listed in Table 2 in the main paper. We have chosen models such that they cover most of the popular and best-performing methods used by recent multimodal understanding work. In this part, we discuss all the models we have used in our experiments and explain their evaluation details, the public checkpoints we have chosen, and display the prompts we used to adapt the model to our datasets.

During evaluation, we chose to follow the standard prompt provided by the authors whenever possible for multiple-choice and short-answer questions. When no prompt is provided for a model, we use a custom prompt, chosen through several iterations of prompt engineering as the one that produces the most effective results. The images are always included as the prefix.

We used three proprietary models in our evaluation: GPT-o4-mini [1], Gemini 1.5 Pro [9], and Claude 3 Haiku [10]. Below we note the model API version used for evaluation. GPT-o4-mini: May 13-15, 2025.

Cambrian-1 is a recent state-of-the-art model that excels at visual-centric tasks. This model explores combinations of vision encoders, text and image integration techniques, and instruction tuning strategies. We use the official implementation and checkpoint with a LLaMA3-8B-Instruct LLM backbone model in our evaluation.

InternVL scales up the vision foundation model while aligning it with the backbone LLM, and is trained on web-scale image-text data to achieve strong performance across a variety of vision-centric tasks. We use the official implementation and checkpoint with the InternViT-300M-448px vision backbone and Internlm2.5-7B-chat.

LLaMA-3.2 is the first collection of multimodal large language models from the LLaMA family, which was previously text-only. The integration of vision involves utilizing cross-attention layers and a pre-trained vision encoder that feeds directly into the text-processor. The model follows a commonly used training recipe that includes pretraining on noisy image-text pairs and then on high-quality knowledge-enhanced pairs. Notably, the language-model parameters were frozen during the image-text alignment training to retain strong text-only capabilities. We use the official implementation and checkpoint that uses a LLaMA-3.1 text-only language backbone in our evaluation. When evaluating the model, we choose to use a custom prompt since no standard prompt is provided.
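As a rough illustration of the multiple-choice evaluation described above, the following sketch formats a question, extracts an option letter from a free-form response, and computes accuracy. It is not the authors' prompt or scoring code: the prompt wording, the function names, and the letter-extraction rule are all assumptions made for this example, and the image is assumed to be passed to the model separately, ahead of the text.

```python
import re

LETTERS = "ABCDEFGH"

def build_mcq_prompt(question, options):
    """Format a multiple-choice question as text (hypothetical wording).

    The image is assumed to be supplied to the model before this text.
    """
    lines = [question]
    for letter, option in zip(LETTERS, options):
        lines.append(f"{letter}. {option}")
    lines.append("Answer with the letter of the correct option.")
    return "\n".join(lines)

def extract_choice(response, num_options=4):
    """Return the first standalone uppercase option letter, or None.

    This is a simple heuristic: a response that begins with the article
    "A" would be read as option A, so real pipelines need stricter rules.
    """
    valid = LETTERS[:num_options]
    match = re.search(rf"\b([{valid}])\b", response)
    return match.group(1) if match else None

def mcq_accuracy(responses, answers, num_options=4):
    """Fraction of responses whose extracted letter equals the answer key."""
    correct = sum(
        extract_choice(r, num_options) == a for r, a in zip(responses, answers)
    )
    return correct / len(answers)
```

A response with no recognizable letter counts as incorrect here, which is one possible convention among several.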


AGMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark

Neural Information Processing Systems

Unlike prior datasets that rely on crowdsourced prompts, AGMMU is distilled from 116,231 authentic dialogues between everyday growers and USDA-authorized Cooperative Extension experts. Through a three-stage pipeline (automated knowledge extraction, QA generation, and human verification), we construct (i) AGMMU, an evaluation set of 746 multiple-choice questions (MCQs) and 746 open-ended questions (OEQs), and (ii) AGBASE, a development corpus of 57,079 multimodal facts covering five high-stakes agricultural topics: insect identification, species identification, disease categorization, symptom description, and management instruction. AGMMU has three key advantages. Authentic & Expert-Verified: All facts, images, and answers originate from real farmer and gardener inquiries answered by credentialed specialists, ensuring high-fidelity agricultural knowledge. Complete Development Suite: AGMMU uniquely couples a dual-format evaluation benchmark (MCQ and OEQ) with AGBASE, a large-scale training set, enabling both rigorous assessment and targeted improvement of VLMs. Knowledge-intensive Challenge: Our tasks demand the synergy of nuanced visual perception and domain expertise, exposing fundamental limitations of current general-purpose models and charting a path toward robust, application-ready agricultural AI. Benchmarking 12 leading VLMs reveals pronounced gaps in fine-grained perception and factual grounding. Open-sourced models trail proprietary ones by a wide margin. Simple fine-tuning on AGBASE boosts open-sourced model performance on challenging OEQs by up to 11.6% on average, narrowing this gap and motivating future research on better strategies for knowledge extraction and distillation from AGBASE. We hope AGMMU stimulates research on domain-specific knowledge integration and trustworthy decision support in agricultural AI development.



Locust swarms may meet their match in protein-enriched crops

Popular Science

The specialized crops could save farmers millions. Swarms of locusts devouring a farmer's livelihood might sound apocalyptic, but major locust infestations are a regular problem in agricultural communities around the world. These locust swarms (dense, droning packs of certain grasshopper species) can cover hundreds of square miles, and the insects consume vast amounts of vegetation and threaten global agriculture.


Swiss startup turns urine into plant fertilizer

Popular Science

The space-inspired wastewater treatment uses the nutrients and loses the odor. When most people need to go number one, they find the nearest bathroom and don't give half a thought to what happens to their pee once it disappears down the toilet or urinal. It turns out that the nitrogen in human urine can be used in fertilizer. However, humanity's use of nitrogen is anything but efficient, according to a pair of siblings who founded the Swiss start-up company VunaNexus.


Modular, On-Site Solutions with Lightweight Anomaly Detection for Sustainable Nutrient Management in Agriculture

arXiv.org Artificial Intelligence

Efficient nutrient management is critical for crop growth and sustainable resource consumption (e.g., nitrogen, energy). Current approaches require lengthy analyses, preventing real-time optimization; similarly, imaging facilitates rapid phenotyping but can be computationally intensive, preventing deployment under resource constraints. This study proposes a flexible, tiered pipeline for anomaly detection and status estimation (fresh weight, dry mass, and tissue nutrients), including a comprehensive energy analysis of approaches that span the efficiency-accuracy spectrum. Using a nutrient depletion experiment with three treatments (T1-100%, T2-50%, and T3-25% fertilizer strength) and multispectral imaging (MSI), we developed a hierarchical pipeline using an autoencoder (AE) for early warning. Further, we compared two status estimation modules of different complexity for more detailed analysis: vegetation index (VI) features with machine learning (Random Forest, RF) and raw whole-image deep learning (Vision Transformer, ViT). Results demonstrated high-efficiency anomaly detection (73% net detection of T3 samples 9 days after transplanting) at substantially lower energy than the embodied energy in wasted nitrogen. The status estimation modules show trade-offs, with ViT outperforming RF on phosphorus and calcium estimation (R² 0.61 vs. 0.58 and 0.48 vs. 0.35, respectively) at higher energy cost. With our modular pipeline, this work opens opportunities for edge diagnostics and for practical gains in agricultural sustainability.
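To make the "vegetation index features" idea concrete, here is a minimal sketch of one widely used index, NDVI, computed per pixel from near-infrared and red reflectance and then averaged into a single feature. The abstract does not say which indices the study uses, so the choice of NDVI, the function names, and the small `eps` guard against division by zero are all assumptions for illustration.

```python
def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index for one pixel.

    NDVI = (NIR - Red) / (NIR + Red); eps (an assumption of this sketch)
    avoids division by zero on dark pixels.
    """
    return (nir - red) / (nir + red + eps)

def mean_ndvi(nir_band, red_band):
    """Average NDVI over paired reflectance values from two bands."""
    values = [ndvi(n, r) for n, r in zip(nir_band, red_band)]
    return sum(values) / len(values)

# Hypothetical reflectance values for two pixels of a plant canopy.
feature = mean_ndvi([0.8, 0.6], [0.2, 0.2])
```

A feature of this kind could be fed to a tabular model such as a Random Forest, which is far cheaper than running a vision transformer on the raw image.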


Rapid Machine Learning-Driven Detection of Pesticides and Dyes Using Raman Spectroscopy

arXiv.org Artificial Intelligence

The extensive use of pesticides and synthetic dyes poses critical threats to food safety, human health, and environmental sustainability, necessitating rapid and reliable detection methods. Raman spectroscopy offers molecularly specific fingerprints but suffers from spectral noise, fluorescence background, and band overlap, limiting its real-world applicability. Here, we propose MLRaman, a deep learning framework that detects pesticides and dyes from Raman spectra by combining ResNet-18 feature extraction with advanced classifiers, including XGBoost, SVM, and their hybrid integration. MLRaman with the CNN-XGBoost model achieved a predictive accuracy of 97.4% and a perfect AUC of 1.0, while the CNN-SVM variant provided competitive results with robust class-wise discrimination. Dimensionality reduction analyses (PCA, t-SNE, UMAP) confirmed the separability of Raman embeddings across 10 analytes, including 7 pesticides and 3 dyes. Finally, we developed a user-friendly Streamlit application for real-time prediction, which successfully identified unseen Raman spectra from our independent experiments and from literature sources, underscoring strong generalization capacity. This study establishes a scalable, practical MLRaman model for multi-residue contaminant monitoring, with significant potential for deployment in food safety and environmental surveillance.


Hardware-Aware YOLO Compression for Low-Power Edge AI on STM32U5 for Weeds Detection in Digital Agriculture

arXiv.org Artificial Intelligence

Weeds significantly reduce crop yields worldwide and pose major challenges to sustainable agriculture. Traditional weed management methods, primarily relying on chemical herbicides, risk environmental contamination and lead to the emergence of herbicide-resistant species. Precision weeding, leveraging computer vision and machine learning methods, offers a promising eco-friendly alternative but is often limited by reliance on high-power computational platforms. This work presents an optimized, low-power edge AI system for weeds detection based on the YOLOv8n object detector deployed on the STM32U575ZI microcontroller. Several compression techniques are applied to the detection model, including structured pruning, integer quantization, and input image resolution scaling, in order to meet strict hardware constraints. The model is trained and evaluated on the CropAndWeed dataset with 74 plant species, achieving a balanced trade-off between detection accuracy and efficiency. Our system supports real-time, in-situ weeds detection with a minimal energy consumption of 51.8 mJ per inference, enabling scalable deployment in power-constrained agricultural environments. Weeds are widespread and persistent plants, known for their rapid reproduction and effective seed dispersal strategies. They are among the primary contributors to crop yield loss globally, posing a significant challenge for farmers and agricultural stakeholders [1].
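For readers unfamiliar with integer quantization, the following is a minimal sketch of generic per-tensor affine int8 quantization, where a real value x is approximated as scale * (q - zero_point). The abstract does not specify the quantization scheme or toolchain used on the STM32U575ZI, so this is an assumed textbook formulation rather than the paper's method, and all function names are hypothetical.

```python
def quantize_params(values, qmin=-128, qmax=127):
    """Choose scale and zero point so [min, max] maps onto [qmin, qmax].

    The range is widened to include 0.0 so that zero is exactly
    representable after quantization.
    """
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 if all zeros
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers clipped to the int8 range."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]

def dequantize(quantized, scale, zero_point):
    """Recover approximate floats from quantized integers."""
    return [(q - zero_point) * scale for q in quantized]

# Hypothetical weights from one layer.
weights = [-0.7, 0.0, 0.3, 1.2]
scale, zero_point = quantize_params(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
```

Storing `q_weights` takes one byte per value instead of four, at the cost of a rounding error of at most about one quantization step per weight.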


Cultivating Pluralism In Algorithmic Monoculture: The Community Alignment Dataset

arXiv.org Artificial Intelligence

How can large language models (LLMs) serve users with varying preferences that may conflict across cultural, political, or other dimensions? To address this challenge, this paper establishes four key results. First, we demonstrate, through a large-scale multilingual human study with representative samples from five countries (N=15,000), that humans exhibit significantly more variation in preferences than the responses of 21 state-of-the-art LLMs. Second, we show that existing methods for preference dataset collection are insufficient for learning the diversity of human preferences even along two of the most salient dimensions of variability in global values, due to the underlying homogeneity of candidate responses. Third, we argue that this motivates the need for negatively-correlated sampling when generating candidate sets, and we show that simple prompt-based techniques for doing so significantly enhance the performance of alignment methods in learning heterogeneous preferences. Fourth, based on this novel candidate sampling approach, we collect and open-source Community Alignment, the largest and most representative multilingual and multi-turn preference dataset to date, featuring almost 200,000 comparisons from annotators spanning five countries. We hope that the Community Alignment dataset will be a valuable resource for improving the effectiveness of LLMs for a diverse global population.


Towards Rational Pesticide Design with Graph Machine Learning Models for Ecotoxicology

arXiv.org Artificial Intelligence

This research focuses on rational pesticide design, using graph machine learning to accelerate the development of safer, eco-friendly agrochemicals, inspired by in silico methods in drug discovery. With an emphasis on ecotoxicology, the initial contributions include the creation of ApisTox, the largest curated dataset on pesticide toxicity to honey bees. We conducted a broad evaluation of machine learning (ML) models for molecular graph classification, including molecular fingerprints, graph kernels, GNNs, and pretrained transformers. The results show that methods successful in medicinal chemistry often fail to generalize to agrochemicals, underscoring the need for domain-specific models and benchmarks. Future work will focus on developing a comprehensive benchmarking suite and designing ML models tailored to the unique challenges of pesticide discovery.