A Data Analysis of the LoRA Dataset. Project page: https://lora-vqa.github.io/

Neural Information Processing Systems

Each question-and-answer group has a unique list of corresponding visuals used for image creation; this list of visible objects combines the correct-answer objects with an arbitrary 'noise' object.
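The construction described above, a per-question list of visible objects formed from the answer objects plus one distractor, can be sketched as follows. This is a hypothetical illustration; the function name and sampling scheme are assumptions, not the dataset's actual tooling:

```python
import random

def build_visible_objects(answer_objects, distractor_pool, seed=0):
    """Combine the correct-answer objects with one arbitrary 'noise' object
    drawn from a pool of distractors (hypothetical helper)."""
    rng = random.Random(seed)
    # Only objects that are not part of the answer qualify as noise.
    candidates = [o for o in distractor_pool if o not in answer_objects]
    return list(answer_objects) + [rng.choice(candidates)]

visible = build_visible_objects(["apple", "knife"], ["plate", "cup", "apple"])
```

Seeding the generator makes the noise object reproducible per question group, which matters if the same visuals list must be regenerated for image creation.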


Understanding LLM Reasoning for Abstractive Summarization

Yuan, Haohan, Zhang, Haopeng

arXiv.org Artificial Intelligence

While the reasoning capabilities of Large Language Models (LLMs) excel in analytical tasks such as mathematics and code generation, their utility for abstractive summarization remains widely assumed but largely unverified. To bridge this gap, we first tailor general reasoning strategies to the summarization domain. We then conduct a systematic, large-scale comparative study of 8 reasoning strategies and 3 Large Reasoning Models (LRMs) across 8 diverse datasets, assessing both summary quality and faithfulness. Our findings show that reasoning is not a universal solution and its effectiveness is highly dependent on the specific strategy and context. Specifically, we observe a trade-off between summary quality and factual faithfulness: explicit reasoning strategies tend to improve fluency at the expense of factual grounding, while implicit reasoning in LRMs exhibits the inverse pattern. Furthermore, increasing an LRM's internal reasoning budget does not improve, and can even hurt, factual consistency, suggesting that effective summarization demands faithful compression rather than creative over-thinking.
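The contrast between direct prompting and an explicit reasoning strategy can be illustrated with a toy prompt builder. The templates below are hypothetical and do not reproduce the paper's actual prompts:

```python
def build_prompt(document, strategy="direct"):
    """Assemble a summarization prompt; 'cot' adds an explicit reasoning step
    before the summary (toy templates, not the paper's)."""
    if strategy == "direct":
        return f"Summarize the following article:\n\n{document}\n\nSummary:"
    if strategy == "cot":
        return (
            "First list the key facts in the article, then write a summary "
            "grounded only in those facts.\n\n"
            f"Article:\n{document}\n\nKey facts, then summary:"
        )
    raise ValueError(f"unknown strategy: {strategy!r}")

prompt = build_prompt("Water boils at 100 C at sea level.", strategy="cot")
```

The paper's quality-versus-faithfulness trade-off would then be measured by scoring the outputs of each template variant separately on fluency and factual-consistency metrics.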


The Best Chef's Knives of 2025. We Tested Nearly Two Dozen to Find Our Favorites

WIRED

The chef's knife is the workhorse of the kitchen. We sliced, diced, and minced to find the best for every home chef. A Close Second: Zwilling Four Star 8-Inch Chef's Knife (made from high-carbon stainless steel). Not all knives are created equal, and a chef's knife is given that name for a reason. Like the proverbial dog to man, a chef needs their knife. Arguably the most important multipurpose tool in a kitchen, it is the chef's main weapon: it can slice, dice, and chop ingredients with speed and precision. A chef's knife generally has a super-sharp point and a curved, sloping edge. That curve is what makes the chef's knife stand out: it is designed for a natural rocking motion that enables quick chopping as well as finer cuts. Even in an era of ovens with cameras inside and AI-enabled refrigerators, the chef's knife remains the simple tool every kitchen needs.


Toward Accurate Long-Horizon Robotic Manipulation: Language-to-Action with Foundation Models via Scene Graphs

Dinesh, Sushil Samuel, Park, Shinkyu

arXiv.org Artificial Intelligence

This paper presents a framework that leverages pre-trained foundation models for robotic manipulation without domain-specific training. The framework integrates off-the-shelf models, combining multimodal perception from foundation models with a general-purpose reasoning model capable of robust task sequencing. Scene graphs, dynamically maintained within the framework, provide spatial awareness and enable consistent reasoning about the environment. The framework is evaluated through a series of tabletop robotic manipulation experiments, and the results highlight its potential for building robotic manipulation systems directly on top of off-the-shelf foundation models.
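A dynamically maintained scene graph of the kind described, with objects as nodes and spatial relations as labeled edges, might look like this minimal sketch (class and method names are assumptions, not the paper's implementation):

```python
class SceneGraph:
    """Minimal scene graph: objects as nodes, spatial relations as
    labeled edges between object pairs."""

    def __init__(self):
        self.objects = set()
        self.relations = {}  # (subject, object) -> relation label

    def update(self, subject, relation, obj):
        """Insert or overwrite a relation, e.g. ('cup', 'on', 'table').
        Called whenever perception reports a new spatial fact."""
        self.objects.update([subject, obj])
        self.relations[(subject, obj)] = relation

    def query(self, relation):
        """Return all (subject, object) pairs linked by `relation`,
        supporting consistent reasoning about the environment."""
        return [pair for pair, r in self.relations.items() if r == relation]

graph = SceneGraph()
graph.update("cup", "on", "table")
graph.update("block", "left_of", "cup")
```

A task sequencer could then query such a graph between steps, e.g. to confirm a precondition like "the cup is on the table" before planning a grasp.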



What are ultra-processed foods and are they bad for me?

Popular Science

From frozen pizza to lactose-free milk, food processing takes many forms. Here's how to tell when it's helping or hurting us.


IA-VLA: Input Augmentation for Vision-Language-Action models in settings with semantically complex tasks

Hannus, Eric, Malin, Miika, Le, Tran Nguyen, Kyrki, Ville

arXiv.org Artificial Intelligence

Figure 1: Semantically complex language instructions, such as those involving the relative positions of objects, pose a difficult challenge for vision-language-action models (VLAs). To address this problem, we propose IA-VLA, a framework that augments the input to VLAs by offloading semantic understanding to a larger vision-language model (VLM). We use semantic segmentation to label image regions, which the VLM then uses to identify the masks of the task-relevant objects. The task-relevant objects are highlighted in the VLA input, together with the language instruction, which can optionally be simplified.

Abstract: Vision-language-action models (VLAs) have become an increasingly popular approach to robot manipulation in recent years. However, such models need to output actions at a rate suitable for robot control, which limits the size of the language model they can be based on and, consequently, their language understanding capabilities. Manipulation tasks may require complex language instructions, such as identifying target objects by their relative positions, to specify human intention. Therefore, we introduce IA-VLA, a framework that uses the extensive language understanding of a large vision-language model as a preprocessing stage to generate improved context that augments the input of a VLA. We evaluate the framework on a set of semantically complex tasks that have been underexplored in the VLA literature, namely tasks involving visual duplicates, i.e., visually indistinguishable objects.
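The three-stage augmentation the entry describes (segment the image, let a VLM pick out the task-relevant masks, highlight them in the VLA input) can be sketched as a data-flow function. All names here are placeholders, and the toy stand-ins below merely show the plumbing; real segmentation and VLM calls would replace them:

```python
def augment_vla_input(image, instruction, segment, identify, highlight):
    """Sketch of an input-augmentation pipeline (hypothetical interfaces):
    segment:   image -> list of (mask, region_label)   # semantic segmentation
    identify:  (regions, instruction) -> relevant masks  # large-VLM call
    highlight: (image, masks) -> image with relevant objects marked
    """
    regions = segment(image)
    relevant_masks = identify(regions, instruction)
    return highlight(image, relevant_masks), instruction

# Toy stand-ins to demonstrate the data flow; real models replace these.
segment = lambda img: [("mask_a", "cup"), ("mask_b", "plate")]
identify = lambda regions, instr: [m for m, label in regions if label in instr]
highlight = lambda img, masks: {"image": img, "marked": masks}

out_image, out_instr = augment_vla_input(
    "img0", "pick up the cup", segment, identify, highlight)
```

The point of the design is that the slow, semantically capable VLM runs once per instruction as preprocessing, while the fast VLA consumes the already-disambiguated input at control rate.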


New broccoli hybrid can thrive in colder climates

Popular Science

Love it or loathe it, broccoli is one of the most popular vegetables in the United States. However, this staple vegetable can be as finicky as a picky eater when it comes to its growth. It is a temperate crop that likes cooler nights and predictable weather in order to thrive. Both of these conditions are getting much harder to come by due to climate change.


Compose by Focus: Scene Graph-based Atomic Skills

Qi, Han, Chen, Changhe, Yang, Heng

arXiv.org Artificial Intelligence

A key requirement for generalist robots is compositional generalization - the ability to combine atomic skills to solve complex, long-horizon tasks. While prior work has primarily focused on synthesizing a planner that sequences pre-learned skills, robust execution of the individual skills themselves remains challenging, as visuomotor policies often fail under distribution shifts induced by scene composition. To address this, we introduce a scene graph-based representation that focuses on task-relevant objects and relations, thereby mitigating sensitivity to irrelevant variation. Building on this idea, we develop a scene-graph skill learning framework that integrates graph neural networks with diffusion-based imitation learning, and further combine "focused" scene-graph skills with a vision-language model (VLM) based task planner. Experiments in both simulation and real-world manipulation tasks demonstrate substantially higher success rates than state-of-the-art baselines, highlighting improved robustness and compositional generalization in long-horizon tasks.
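The "focused" idea, restricting the scene graph to task-relevant objects and relations so that irrelevant scene variation is discarded, can be sketched as a simple filter (a hypothetical illustration, not the paper's code):

```python
def focus(scene_graph, task_objects):
    """Project a full scene graph onto the task-relevant objects,
    dropping nodes and relations that only involve scene clutter."""
    nodes = {n for n in scene_graph["nodes"] if n in task_objects}
    edges = [(s, r, o) for (s, r, o) in scene_graph["edges"]
             if s in task_objects and o in task_objects]
    return {"nodes": nodes, "edges": edges}

# A cluttered scene: the lamp is irrelevant to a cup-on-table skill.
full = {"nodes": {"cup", "table", "lamp"},
        "edges": [("cup", "on", "table"), ("lamp", "near", "table")]}
focused = focus(full, {"cup", "table"})
```

Because each atomic skill conditions only on its focused subgraph, composing skills in a new scene leaves each policy's input distribution unchanged, which is the robustness property the abstract highlights.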


Misalignment from Treating Means as Ends

Marklund, Henrik, Infanger, Alex, Van Roy, Benjamin

arXiv.org Artificial Intelligence

Reward functions, learned or manually specified, are rarely perfect. Instead of accurately expressing human goals, these reward functions are often distorted by human beliefs about how best to achieve those goals. Specifically, these reward functions often express a combination of the human's terminal goals -- those which are ends in themselves -- and the human's instrumental goals -- those which are means to an end. We formulate a simple example in which even slight conflation of instrumental and terminal goals results in severe misalignment: optimizing the misspecified reward function results in poor performance when measured by the true reward function. This example distills the essential properties of environments that make reinforcement learning highly sensitive to conflation of instrumental and terminal goals. We discuss how this issue can arise with a common approach to reward learning and how it can manifest in real environments.
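The qualitative claim, that even slight conflation of instrumental and terminal goals can produce severe misalignment, can be reproduced in a toy numeric example (my own illustration, not the paper's formal construction):

```python
def best_action(actions, reward):
    """Pick the action an optimizer would choose under a given reward."""
    return max(actions, key=reward)

# Terminal reward: the end the human actually cares about.
terminal = {"finish_task": 1.0, "hoard_resources": 0.0}
# Instrumental reward: a means (resources help finish tasks) treated as an end.
instrumental = {"finish_task": 0.0, "hoard_resources": 100.0}

eps = 0.05  # only slight conflation of means and ends
misspecified = lambda a: (1 - eps) * terminal[a] + eps * instrumental[a]

chosen = best_action(terminal, misspecified)  # optimizer's pick
true_value = terminal[chosen]                 # performance under the true reward
```

Here the misspecified reward scores "hoard_resources" at 5.0 versus 0.95 for "finish_task", so the optimizer pursues the instrumental goal exclusively and earns zero true reward, even though the conflation weight is only 5 percent: the key environment property is that the instrumental quantity is unbounded relative to the terminal one.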