 acorn


Noise-Robust Abstractive Compression in Retrieval-Augmented Language Models

arXiv.org Artificial Intelligence

However, retrieved documents often include information that is either irrelevant to answering the query or misleading due to factually incorrect content, despite having high relevance scores. This behavior indicates that abstractive compressors are more likely to omit important information essential for the correct answer, especially in long contexts where attention dispersion occurs. To address this issue, we categorize retrieved documents in a more fine-grained manner and propose Abstractive Compression Robust against Noise (ACoRN), which introduces two novel training steps. First, we use offline data augmentation on the training dataset to enhance compressor robustness against two distinct types of retrieval noise. Second, since the language model-based compressor cannot fully utilize information from multiple retrieved documents and exhibits positional bias, we perform fine-tuning to generate summaries centered around key information that directly supports the correct answer. Our experiments demonstrate that T5-large, trained with ACoRN as a compressor, improves EM and F1 scores while preserving the answer string, which could serve as direct evidence.
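For readers unfamiliar with the metrics named above, here is a minimal sketch of how exact match (EM) and token-level F1 are commonly computed in open-domain QA evaluation, plus a check for whether a summary preserves the answer string. The normalization rules and all function names are assumptions based on common SQuAD-style practice, not taken from the ACoRN paper.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def answer_preserved(summary: str, gold: str) -> bool:
    """Hypothetical helper: does the compressed summary still contain the answer?"""
    return normalize(gold) in normalize(summary)
```

A summary that drops the answer string fails `answer_preserved` even when it reads fluently, which is one way to see why an abstractive compressor can hurt a downstream reader model.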


Indigenous women engineered energy-efficient baby carriers

Popular Science

The technology helped them while harvesting the vast majority of their community's food. Apache, Navajo, and Shoshoni (pictured above) are only a few of the Indigenous tribes that utilized cradleboards. Indigenous women were technological trailblazers. But while lived experiences and communal histories have long supported this, they routinely fail to receive the credit they deserve.


ACoRN: Noise-Robust Abstractive Compression in Retrieval-Augmented Language Models

arXiv.org Artificial Intelligence

However, retrieved documents often include information that is either irrelevant to answering the query or misleading due to factually incorrect content, despite having high relevance scores. This behavior indicates that abstractive compressors are more likely to omit important information essential for the correct answer, especially in long contexts where attention dispersion occurs. To address this issue, we categorize retrieved documents in a more fine-grained manner and propose Abstractive Compression Robust against Noise (ACoRN), which introduces two novel training steps. First, we use offline data augmentation on the training dataset to enhance compressor robustness against two distinct types of retrieval noise. Second, since the language model-based compressor cannot fully utilize information from multiple retrieved documents and exhibits positional bias, we perform fine-tuning to generate summaries centered around key information that directly supports the correct answer. Our experiments demonstrate that T5-large, trained with ACoRN as a compressor, improves EM and F1 scores while preserving the answer string, which could serve as direct evidence.


How squirrels actually find all their buried nuts

Popular Science

Every fall, squirrels hide hundreds of acorns--and use smell, memory, and even theft to get them back. Every fall, squirrels stash hundreds of nuts to survive the colder winter months. As someone who routinely "hides" things from myself--car keys, receipts, even my phone while I'm actively talking on it--I felt instantly validated by Sarah Silverman's joke that squirrels forget where they bury 80% of their nuts. "And that's how trees are planted!"


Local Coherence or Global Validity? Investigating RLVR Traces in Math Domains

arXiv.org Artificial Intelligence

Reinforcement Learning with Verifiable Rewards (RLVR)-based post-training of Large Language Models (LLMs) has been shown to improve accuracy on reasoning tasks and continues to attract significant attention. Existing RLVR methods, however, typically treat all tokens uniformly without accounting for token-level advantages. These methods primarily evaluate performance based on final answer correctness or Pass@K accuracy, and yet make claims about RL post-training leading to improved reasoning traces. This motivates our investigation into the effect of RL post-training on intermediate tokens which are not directly incentivized. To study this, we design an experimental setup using the GRPO algorithm with the Qwen-2.5-0.5B model on the GSM8K dataset. We introduce trace coherence, a First-Order Logic (FOL)-based measure to capture the consistency of reasoning steps by identifying errors in the traces. We distinguish between trace validity and trace coherence, noting that the former implies logical soundness while the latter measures local coherence via lack of errors. Our results show that RL post-training overall improves trace coherence, with the most significant gains on problems where the base model fails but the RL model succeeds. Surprisingly, RL enhances local coherence without necessarily producing valid or correct solutions. This highlights a crucial distinction: improved local coherence in reasoning steps does not guarantee final answer correctness. We argue that claims of improved reasoning via RL must be examined with care, as these may be based on improved trace coherence, which may not translate into fully valid mathematical proofs.
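To illustrate what "treating all tokens uniformly" can look like in practice, here is a minimal sketch of a group-relative advantage as it is commonly described for GRPO: each sampled completion gets one scalar advantage, and every token in that completion shares it. The function names, the binary rewards, and the use of the population standard deviation are assumptions for illustration; actual implementations differ in these details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Standardize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; some code uses sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


def broadcast_to_tokens(advantages, token_counts):
    """Give every token in a completion the same sequence-level advantage."""
    return [[a] * n for a, n in zip(advantages, token_counts)]


# Four completions for one GSM8K-style prompt; reward 1.0 means the final
# answer was verified as correct, 0.0 means it was not (illustrative values).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
per_token = broadcast_to_tokens(advantages, token_counts=[5, 3, 4, 2])
```

Because the reward depends only on the final answer, an intermediate reasoning step receives the same credit whether or not it is logically sound, which is the gap the trace coherence measure is meant to probe.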


ACORN: Adaptive Contrastive Optimization for Safe and Robust Fine-Grained Robotic Manipulation

arXiv.org Artificial Intelligence

Embodied AI research has traditionally emphasized performance metrics such as success rate and cumulative reward, overlooking critical robustness and safety considerations that emerge during real-world deployment. In actual environments, agents continuously encounter unpredicted situations and distribution shifts, causing seemingly reliable policies to experience catastrophic failures, particularly in manipulation tasks. To address this gap, we introduce four novel safety-centric metrics that quantify an agent's resilience to environmental perturbations. Building on these metrics, we present Adaptive Contrastive Optimization for Robust Manipulation (ACORN), a plug-and-play algorithm that enhances policy robustness without sacrificing performance. ACORN leverages contrastive learning to simultaneously align trajectories with expert demonstrations while diverging from potentially unsafe behaviors. Our approach efficiently generates informative negative samples through structured Gaussian noise injection, employing a double perturbation technique that maintains sample diversity while minimizing computational overhead. Comprehensive experiments across diverse manipulation environments validate ACORN's effectiveness, yielding improvements of up to 23% in safety metrics under disturbance compared to baseline methods. These findings underscore ACORN's significant potential for enabling reliable deployment of embodied agents in safety-critical real-world applications.


Neural Lab's AirTouch brings gesture control to Windows and Android devices with just a webcam

Engadget

Some of the best tech we see at CES feels pulled straight from sci-fi. Yesterday at CES 2025, I tested out Neural Lab's AirTouch technology, which lets you interact with a display using hand gestures alone, exactly what movies like Minority Report and Iron Man promised. Of course, plenty of companies have delivered on varying forms of gesture control. Microsoft's Kinect is an early example while the Apple Watch's double tap feature and Vision Pro's pinch gestures are just two of many current iterations. But I was impressed with how well AirTouch delivered and, unlike most gesture technology out there, it requires no special equipment -- just a standard webcam -- and works with a wide range of devices.


Accurate Open-set Recognition for Memory Workload

arXiv.org Artificial Intelligence

The global DRAM (Dynamic Random Access Memory) market is worth tens of billions of USD and keeps growing due to rising demand for DRAM in mobile devices, modern computers, self-driving cars, etc. It is crucial to test DRAM using various workloads to verify and guarantee DRAM quality. DRAM manufacturers utilize their known workloads for verification; however, this does not guarantee that DRAM works well for new workloads not known in advance. Therefore, it is necessary to detect new workloads to improve the quality of DRAM verification. The problem of detecting new workloads is formulated as an open-set recognition [19] task, which classifies a test sample into the known classes or the unknown class, and identifies its class if it belongs to the known classes. A workload sequence contains a series of tuples with the command and the address information of memory accesses. To detect new workloads based on open-set recognition, we exploit a subsequence, a part of the entire sequence of a workload. Given a subsequence, we classify it into one of the known workload classes or identify it as the unknown class corresponding to new workloads.
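The task formulation above can be sketched with a generic open-set baseline: cut the workload into subsequences, score each one with a classifier, and reject low-confidence predictions as unknown. This is a confidence-thresholding baseline swapped in for illustration, not the method of the paper; the threshold value, the `UNKNOWN` label, and all function names are assumptions, and the classifier producing the logits is left out.

```python
import math

UNKNOWN = -1  # hypothetical label for the unknown class


def subsequences(sequence, length, stride):
    """Split a workload sequence of (command, address) tuples into windows."""
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, stride)]


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def open_set_predict(logits, threshold=0.8):
    """Return the most likely known class, or UNKNOWN if confidence is low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else UNKNOWN


# Illustrative workload: read/write commands with made-up addresses.
workload = [("RD", 0x10), ("WR", 0x14), ("RD", 0x18), ("RD", 0x1C), ("WR", 0x20)]
windows = subsequences(workload, length=3, stride=1)
```

A confident score vector such as `[5.0, 0.0, 0.0]` is assigned to known class 0, while a flat one such as `[1.0, 1.0, 1.0]` is rejected as unknown.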


Digital world-beater Arm needs a helping hand from Boris Johnson | John Naughton

The Guardian

Last September, Nvidia, the American manufacturer of graphics processing chips, and the Japanese company SoftBank announced an agreement under which Nvidia would acquire the British chip designer Arm from SoftBank for $40bn. Since SoftBank had acquired Arm in 2016 for $32bn, you could say that a 25% profit on a five-year investment isn't to be sneezed at, especially if industry mutterings about SoftBank's crackpot investment strategy and Arm's internal difficulties with its China-based operation are to be believed. But even if one were foolish enough to sympathise with SoftBank's desire to climb out of the hole it had dug for itself, the idea that Arm should be sold to a US chip manufacturer is so daft that even Boris Johnson's administration had begun to smell a rat. And so on Monday it announced that the secretary of state for digital, culture, media and sport was "intervening in the sale on national security grounds", based on advice received "from officials across the investment security community". To which decision the only possible response is: what took him so long?


AI-Powered Savings Apps: A New Competitive Necessity For Banks

#artificialintelligence

While traditional banks advertise a 0.05% interest rate for deposits in their savings accounts, a growing number of consumers have turned to a new crop of mobile apps--automated or "self-driving" savings apps--to help them save. In a recent consumer study, Cornerstone Advisors found that savings apps like Acorns, Digit, and Qapital help consumers save an average of $600 a year above and beyond their regular level of savings--and one in five users saves more than $1,000. At a 0.05% interest rate, you'd need $1.2 million in a savings account in order to earn $600 in a year. These new apps help consumers figure out how much they could save (above and beyond what they're already saving)--and then take the money out of the user's checking account and put it in a savings account. The popularity of these tools shouldn't come as a surprise.
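The $1.2 million figure above follows from dividing the target interest by the rate. A minimal sketch of that arithmetic, assuming simple interest paid once a year with no compounding or fees (the function name is hypothetical):

```python
from fractions import Fraction


def balance_needed(target_interest, annual_rate_percent):
    """Balance required to earn target_interest in one year of simple interest."""
    rate = Fraction(str(annual_rate_percent)) / 100  # "0.05" percent -> 1/2000
    return target_interest / rate


required = balance_needed(600, "0.05")
print(f"${int(required):,}")
```

Exact fractions are used instead of floats so the result is not affected by binary rounding of 0.0005.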