potted plant
Supplementary for Mixed Supervised Object Detection by Transferring Mask Prior and Semantic Similarity
Our ablation studies (Table 3 in the main paper) have already demonstrated the advantage of the mask prior. From Figure 2, we can see that the coarse masks indicate the rough locations of objects, which can help the object detection network predict the bounding boxes. To validate the transferability of our similarity transfer, we evaluate our similarity network trained on the COCO-60 trainval set. We treat the similarity prediction task as a binary classification task, in which the binary label 1 (resp., 0) means that two bounding boxes belong to the same category (resp., different categories). The precision, recall and F1 scores are summarized in Table 1. We observe that the gap between the performance of the similarity network on base categories and novel categories is negligible (e.g., F1 Scores 84.9% v.s.
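The precision, recall and F1 scores mentioned above follow directly from the binary same-category / different-category labels. The snippet below is a minimal sketch of that computation, not the authors' evaluation code; the function name `precision_recall_f1` and the toy label lists are assumptions made for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = same category)."""
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical labels for five bounding-box pairs (illustrative only).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Computing the three scores separately for pairs drawn from base categories and from novel categories would give the kind of comparison the passage describes.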
Botany-Bot: Digital Twin Monitoring of Occluded and Underleaf Plant Structures with Gaussian Splats
Adebola, Simeon, Kim, Chung Min, Kerr, Justin, Xie, Shuangyu, Akella, Prithvi, Rincon, Jose Luis Susa, Solowjow, Eugen, Goldberg, Ken
Commercial plant phenotyping systems using fixed cameras cannot perceive many plant details due to leaf occlusion. In this paper, we present Botany-Bot, a system for building detailed "annotated digital twins" of living plants using two stereo cameras, a digital turntable inside a lightbox, an industrial robot arm, and 3D segmented Gaussian Splat models. We also present robot algorithms for manipulating leaves to take high-resolution indexable images of occluded details such as stem buds and the underside/topside of leaves. Results from experiments suggest that Botany-Bot can segment leaves with 90.8% accuracy, detect leaves with 86.2% accuracy, lift/push leaves with 77.9% accuracy, and take detailed overside/underside images with 77.3% accuracy. Code, videos, and datasets are available at https://berkeleyautomation.github.io/Botany-Bot/.
Plant in Cupboard, Orange on Table, Book on Shelf. Benchmarking Practical Reasoning and Situation Modelling in a Text-Simulated Situated Environment
Jordan, Jonathan, Hakimov, Sherzod, Schlangen, David
Large language models (LLMs) have risen to prominence as 'chatbots' for users to interact via natural language. However, their abilities to capture common-sense knowledge make them seem promising as language-based planners of situated or embodied action as well. We have implemented a simple text-based environment -- similar to others that have before been used for reinforcement-learning of agents -- that simulates, very abstractly, a household setting. We use this environment and the detailed error-tracking capabilities we implemented for targeted benchmarking of LLMs on the problem of practical reasoning: Going from goals and observations to actions. Our findings show that environmental complexity and game restrictions hamper performance, and concise action planning is demanding for current LLMs.
Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL
Sun, Hao, Hüyük, Alihan, van der Schaar, Mihaela
In this study, we aim to enhance the arithmetic reasoning ability of Large Language Models (LLMs) through zero-shot prompt optimization. We identify a previously overlooked objective of query dependency in such optimization and elucidate two ensuing challenges that impede the successful and economical design of prompt optimization techniques. One primary issue is the absence of an effective method to evaluate prompts during inference when the golden answer is unavailable. Concurrently, learning via interactions with the LLMs to navigate the expansive natural language prompting space proves to be resource-intensive. To address this, we introduce Prompt-OIRL, which harnesses offline inverse reinforcement learning to draw insights from offline prompting demonstration data. Such data exists as by-products when diverse prompts are benchmarked on open-accessible datasets. With Prompt-OIRL, the query-dependent prompt optimization objective is achieved by first learning an offline reward model. This model can evaluate any query-prompt pairs without accessing LLMs. Subsequently, a best-of-N strategy is deployed to recommend the optimal prompt. Our experimental evaluations across various LLM scales and arithmetic reasoning datasets underscore both the efficacy and economic viability of the proposed approach.
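The best-of-N step described above can be sketched as follows: score each candidate prompt for a given query with a reward model, then return the highest-scoring one. This is only an illustration under stated assumptions, not the Prompt-OIRL implementation. The names `best_of_n` and `toy_reward` are hypothetical, and the word-overlap reward is a stand-in for the learned offline reward model.

```python
from typing import Callable, Sequence


def best_of_n(query: str,
              prompts: Sequence[str],
              reward: Callable[[str, str], float]) -> str:
    """Return the candidate prompt with the highest reward for this query."""
    if not prompts:
        raise ValueError("at least one candidate prompt is required")
    # The reward model scores (query, prompt) pairs without calling an LLM.
    return max(prompts, key=lambda prompt: reward(query, prompt))


def toy_reward(query: str, prompt: str) -> float:
    """Hypothetical stand-in reward: number of words shared by query and prompt."""
    query_words = set(query.lower().split())
    prompt_words = set(prompt.lower().split())
    return float(len(query_words & prompt_words))


candidates = ["Let's think step by step.", "Solve the sum carefully."]
chosen = best_of_n("What is the sum of 2 and 3?", candidates, toy_reward)
```

Because the selection depends on the query, different queries may be paired with different prompts from the same candidate pool.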
RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought
Xue, Tianci, Wang, Ziqi, Wang, Zhenhailong, Han, Chi, Yu, Pengfei, Ji, Heng
Large Language Models (LLMs) have achieved promising performance on arithmetic reasoning tasks by incorporating step-by-step chain-of-thought (CoT) prompting. However, LLMs face challenges in maintaining factual consistency during reasoning, exhibiting tendencies to condition overlooking, question misinterpretation, and condition hallucination over given problems. Existing methods use coarse-grained feedback (e.g., whether the answer is correct) to improve factual consistency. In this work, we propose RCoT (Reversing Chain-of-Thought), a novel method to improve LLMs' reasoning abilities by automatically detecting and rectifying factual inconsistency in LLM-generated solutions. To detect factual inconsistency, RCoT first asks LLMs to reconstruct the problem based on generated solutions. Then fine-grained comparisons between the original problem and the reconstructed problem expose the factual inconsistency in the original solutions. To rectify the solution, RCoT formulates detected factual inconsistency into fine-grained feedback to guide LLMs in revising solutions. Experimental results demonstrate improvements of RCoT over standard CoT, Self-Consistency and Self-Refine across seven arithmetic datasets. Moreover, we find that manually written fine-grained feedback can dramatically improve LLMs' reasoning abilities (e.g., ChatGPT reaches 94.6% accuracy on GSM8K), encouraging the community to further explore fine-grained feedback generation methods.
The Strangeness of Our Animal Bonds
Last spring, I started boiling two eggs for breakfast every morning--one for me, and one for the crows. A mated pair patrolled the rooftops around my Berlin apartment building; I'd begun luring them to my balcony with peanuts and other snacks. They loved not only eggs but also mealworms, cat food, cashews, chicken hearts, stale bread, cheese, and chunks of lamb fat; they barely touched liver, walnuts, vegetables, and dried fruit. In Germany, we were under a COVID-19 lockdown. But the birds were free.
Object detection with deep learning and OpenCV - PyImageSearch
A couple weeks ago we learned how to classify images using deep learning and OpenCV 3.3's deep neural network (dnn) module. While this original blog post demonstrated how we can categorize an image into one of ImageNet's 1,000 separate class labels, it could not tell us where an object resides in the image. In order to obtain the bounding box (x, y)-coordinates for an object in an image we need to instead apply object detection. Object detection can tell us not only what is in an image but also where the object is. In the remainder of today's blog post we'll discuss how to apply object detection using deep learning and OpenCV.
How AI Is Feeding China's Internet Dragon - Artificial Intelligence Online
Shortly after walking through the front doors of Baidu in Beijing last November, I was surprised to notice that my face had transformed into that of a cheerful-looking little dog. As I chatted with one of Baidu's AI researchers, the version of me shown on his smartphone had sprouted a very realistic-looking wet snout, fluffy ears, and a big pink tongue. The trick was performed on an app called Face You, released by Baidu last Halloween, which lets you add all sorts of spooky effects or animal characteristics to a digital image of your face. Face You makes use of an AI technique called deep learning to automatically identify key points on a person's face, so that software can then position and stretch a virtual mask with amazing accuracy. Deep learning is driving a lot more than just goofy apps at Baidu, though.