Dogs can learn and remember how toys work

Popular Science

The process is similar to how human babies learn that bowls and spoons are used to eat. Breakthroughs, discoveries, and DIY tips sent every weekday. We've long known that dogs are pretty smart. They may "picture" objects in their heads just like us, communicate with button boards, and can even understand fairly complicated words. They also appear to categorize objects by function, understanding how similar types of toys work, even if they don't look alike.


Combating Biomedical Misinformation through Multi-modal Claim Detection and Evidence-based Verification

arXiv.org Artificial Intelligence

Misinformation in healthcare, from vaccine hesitancy to unproven treatments, poses risks to public health and trust in medical systems. While machine learning and natural language processing have advanced automated fact-checking, validating biomedical claims remains uniquely challenging due to complex terminology, the need for domain expertise, and the critical importance of grounding in scientific evidence. We introduce CER (Combining Evidence and Reasoning), a novel framework for biomedical fact-checking that integrates scientific evidence retrieval, reasoning via large language models, and supervised veracity prediction. By integrating the text-generation capabilities of large language models with advanced retrieval techniques for high-quality biomedical scientific evidence, CER effectively mitigates the risk of hallucinations, ensuring that generated outputs are grounded in verifiable, evidence-based sources. Evaluations on expert-annotated datasets (HealthFC, BioASQ-7b, SciFact) demonstrate state-of-the-art performance and promising cross-dataset generalization. Code and data are released for transparency and reproducibility: https://github.com/PRAISELab-PicusLab/CER


Large Language Models Discriminate Against Speakers of German Dialects

arXiv.org Artificial Intelligence

Dialects represent a significant component of human culture and are found across all regions of the world. In Germany, more than 40% of the population speaks a regional dialect (Adler and Hansen, 2022). However, despite cultural importance, individuals speaking dialects often face negative societal stereotypes. We examine whether such stereotypes are mirrored by large language models (LLMs). We draw on the sociolinguistic literature on dialect perception to analyze traits commonly associated with dialect speakers. Based on these traits, we assess the dialect naming bias and dialect usage bias expressed by LLMs in two tasks: an association task and a decision task. To assess a model's dialect usage bias, we construct a novel evaluation corpus that pairs sentences from seven regional German dialects (e.g., Alemannic and Bavarian) with their standard German counterparts. We find that: (1) in the association task, all evaluated LLMs exhibit significant dialect naming and dialect usage bias against German dialect speakers, reflected in negative adjective associations; (2) all models reproduce these dialect naming and dialect usage biases in their decision making; and (3) contrary to prior work showing minimal bias with explicit demographic mentions, we find that explicitly labeling linguistic demographics--German dialect speakers--amplifies bias more than implicit cues like dialect usage.


DSCC-HS: A Dynamic Self-Reinforcing Framework for Hallucination Suppression in Large Language Models

arXiv.org Artificial Intelligence

Large Language Model (LLM) hallucination is a significant barrier to their reliable deployment. Current methods like Retrieval-Augmented Generation (RAG) are often reactive. We introduce **Dynamic Self-reinforcing Calibration for Hallucination Suppression (DSCC-HS)**, a novel, proactive framework that intervenes during autoregressive decoding. Inspired by dual-process cognitive theory, DSCC-HS uses a compact proxy model, trained in adversarial roles as a Factual Alignment Proxy (FAP) and a Hallucination Detection Proxy (HDP). During inference, these proxies dynamically steer a large target model by injecting a real-time steering vector, which is the difference between FAP and HDP logits, at each decoding step. This plug-and-play approach requires no modification to the target model. Our experiments on TruthfulQA and BioGEN show DSCC-HS achieves state-of-the-art performance. On TruthfulQA, it reached a 99.2% Factual Consistency Rate (FCR). On the long-form BioGEN benchmark, it attained the highest FActScore of 46.50. These results validate DSCC-HS as a principled and efficient solution for enhancing LLM factuality.
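The steering step described in this abstract can be illustrated with a toy example. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract says the steering vector is the difference between FAP and HDP logits, but it does not say how that vector is injected. The sketch assumes it is simply added to the target model's logits, scaled by a hypothetical factor `alpha`. The function names and the three-token vocabulary are also invented for illustration.

```python
import math


def steer_logits(target_logits, fap_logits, hdp_logits, alpha=1.0):
    """Return target logits shifted by alpha * (FAP - HDP).

    Assumption: additive injection in logit space with a hypothetical
    strength `alpha`; the abstract does not specify the injection rule.
    """
    if not (len(target_logits) == len(fap_logits) == len(hdp_logits)):
        raise ValueError("all logit vectors must share one vocabulary size")
    return [
        t + alpha * (f - h)
        for t, f, h in zip(target_logits, fap_logits, hdp_logits)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy vocabulary of three tokens at a single decoding step.
target = [2.0, 1.5, 0.1]  # target model slightly prefers token 0
fap = [0.2, 1.0, 0.0]     # factual proxy favours token 1
hdp = [1.2, 0.1, 0.0]     # hallucination proxy favours token 0

steered = steer_logits(target, fap, hdp)
print("before steering:", greedy_pick(target))
print("after steering: ", greedy_pick(steered))
print("steered probabilities:", softmax(steered))
```

In a real decoder this adjustment would be repeated at every generation step, with both proxies scoring the same prefix as the target model. The target model's weights are untouched in this sketch, which matches the plug-and-play property the abstract claims.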


Improving Context Fidelity via Native Retrieval-Augmented Reasoning

arXiv.org Artificial Intelligence

Large language models (LLMs) often struggle with context fidelity, producing inconsistent answers when responding to questions based on provided information. Existing approaches either rely on expensive supervised fine-tuning to generate evidence post-answer or train models to perform web searches without necessarily improving utilization of the given context. We propose CARE, a novel native retrieval-augmented reasoning framework that teaches LLMs to explicitly integrate in-context evidence within their reasoning process with the model's own retrieval capabilities. Our method requires limited labeled evidence data while significantly enhancing both retrieval accuracy and answer generation performance through strategically retrieved in-context tokens in the reasoning chain. Extensive experiments on multiple real-world and counterfactual QA benchmarks demonstrate that our approach substantially outperforms supervised fine-tuning, traditional retrieval-augmented generation methods, and external retrieval solutions. This work represents a fundamental advancement in making LLMs more accurate, reliable, and efficient for knowledge-intensive tasks.


AgentCTG: Harnessing Multi-Agent Collaboration for Fine-Grained Precise Control in Text Generation

arXiv.org Artificial Intelligence

Although significant progress has been made in many tasks within the field of Natural Language Processing (NLP), Controlled Text Generation (CTG) continues to face numerous challenges, particularly in achieving fine-grained conditional control over generation. Additionally, real-world scenarios and online applications impose requirements on cost, scalability, domain knowledge learning, and more precise control, presenting further challenges for CTG. This paper introduces a novel and scalable framework, AgentCTG, which aims to enhance precise and complex control over text generation by simulating the control and regulation mechanisms in multi-agent workflows. We explore various collaboration methods among different agents and introduce an auto-prompt module to further enhance the generation effectiveness. AgentCTG achieves state-of-the-art results on multiple public datasets. To validate its effectiveness in practical applications, we propose a new challenging Character-Driven Rewriting task, which aims to convert the original text into new text that conforms to specific character profiles while preserving the domain knowledge. When applied to online navigation with role-playing, our approach significantly enhances the driving experience through improved content delivery. By optimizing the generation of contextually relevant text, we enable a more immersive interaction within online communities, fostering greater personalization and user engagement.


New species of coral named after Chewbacca

Popular Science

While it's not helping fly the Millennium Falcon, marine biologists have discovered a new type of deep-sea coral in the western Pacific Ocean that bears a striking similarity to a certain beloved character. A new scientific analysis of the species initially documented almost two decades ago indicates that it is its own species, with long, hairlike branches that live up to its namesake Wookiee. Ten years later, another example was documented close to the Mariana Trench. But it would take a few more years before University of Hawai'i ecologist Les Watling noticed the strange, ethereal sight while reviewing research from some of his colleagues.


Would you buy the world's first personal robocar?

FOX News

Silicon Valley startup Tensor plans to sell the world's first personal robocar, allowing consumers to own self-driving cars by 2026.


The real reason our weather is going to the dogs

New Scientist

Feedback was amazed to hear that dog ownership could cause a hurricane across the other side of the world. Or are we barking up the wrong tree? Kristian Steensen Nielsen seems like a sensible type. A researcher at the Copenhagen Business School in Denmark, he studies "the role of behavior change in mitigating climate change and conserving biodiversity". In other words, how can we make our lives more environmentally friendly, and how and when do those changes scale up to become truly effective?


262 million birds forecast to take to the skies tonight

Popular Science

BirdCast can help you follow their fall migration. Open up the weather radar and you might see what looks like precipitation when it's not raining at all. Those bright green spots are often birds during their annual fall migration, and you can follow along. According to BirdCast, 262 million birds are predicted to hit the skies tonight alone.