Collaborating Authors: grandfather



Swedish Death Cleaning, but for Your Digital Life

WIRED

The art of ordering and culling your possessions before you die should extend to your documents, photos, and digital accounts. After Adam Liljenberg's grandmother died, his grandfather was ready to downsize and move into an assisted living facility. As Swedes, they were familiar with Swedish death cleaning, the idea that as you near the end of life you declutter and organize your belongings so as not to burden those who survive you. When Liljenberg arrived to help his grandfather sort through his possessions, he didn't expect to be rescuing digital photos off a phone full of malware.


Silent Tokens, Loud Effects: Padding in LLMs

Himelstein, Rom, LeVi, Amit, Belinkov, Yonatan, Mendelson, Avi

arXiv.org Artificial Intelligence

Padding tokens are widely used in large language models (LLMs) to equalize sequence lengths during batched inference. While they should be fully masked, implementation errors can cause them to influence computation, and the extent of this influence is not well understood. We systematically study this effect across three open-source model families (Llama, Gemma, Qwen), inserting controlled amounts of padding and evaluating outcomes along four axes: activations, generation quality, bias, and safety. Even small amounts of padding shift hidden representations, degrade quality in smaller models, alter bias in unpredictable ways, and weaken safety guardrails. These findings demonstrate that padding is not a harmless detail but a robustness risk that must be carefully handled in deployment.
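The failure mode the authors describe can be illustrated with a toy, single-query attention computation (a NumPy sketch of my own, not code from the paper): when the attention mask is applied correctly, a padding token is inert; when the mask is dropped, the same padding token shifts the output.

```python
import numpy as np

def attention(q, K, V, mask):
    # q: (d,); K, V: (n, d); mask: (n,) with 1 = real token, 0 = padding.
    scores = K @ q / np.sqrt(q.shape[0])
    scores = np.where(mask == 1, scores, -1e9)  # mask padding before softmax
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

rng = np.random.default_rng(0)
d, n = 8, 4
q = rng.normal(size=d)
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))

real = np.array([1, 1, 1, 0])            # last position is padding
masked = attention(q, K, V, real)        # padding correctly ignored
leaky = attention(q, K, V, np.ones(n))   # buggy: padding treated as a real token

no_pad = attention(q, K[:3], V[:3], np.ones(3))  # reference: no padding at all
print(np.allclose(masked, no_pad))  # True: masked padding leaves the output unchanged
print(np.allclose(leaky, no_pad))   # False: unmasked padding shifts the output
```

In a real transformer the same leak can occur in any layer where the padding mask is mishandled, which is why the paper measures effects on hidden activations rather than just final outputs.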


Disney's grandchildren divided over new animatronic of Walt as one calls it 'dehumanizing'

FOX News

Disney's Imagineers are working on a new animatronic of iconic American visionary Walt Disney, but some members of his family have opposing views about whether it celebrates his legacy or dehumanizes him. Disneyland's Main Street Opera House plans to unveil a new theme park attraction called Walt Disney – A Magical Life, featuring an audio-animatronic of the company's founder. But Joanna Miller, one of Disney's grandchildren, slammed the idea of an animatronic as "dehumanizing" in a viral Facebook post. Among her claims, she suggested that her grandfather had told early Imagineer Sam McKim he never wanted to be commemorated with an animatronic.


First Chinese typewriter rediscovered in grandfather's basement

Popular Science

A unique experimental typewriter stored in a New York state basement for decades turned out to be a one-of-a-kind piece of communications history. According to an announcement from Stanford University, historians and one unsuspecting granddaughter have rediscovered the long-missing MingKwai machine. Earlier this year, Jennifer Felix and her husband were cleaning out her recently deceased grandfather's home when they came across a large, extremely heavy typewriting device. Instead of a more traditional setup, however, the contraption featured five rows of keys topped with Chinese characters. After reaching out for help online, Felix learned her grandfather had owned the MingKwai: one man's innovative, if ultimately doomed, attempt to fit the Chinese language onto a mechanical typewriter.


KnowLogic: A Benchmark for Commonsense Reasoning via Knowledge-Driven Data Synthesis

Zhan, Weidong, Wang, Yue, Hu, Nan, Xiao, Liming, Ma, Jingyuan, Qin, Yuhang, Li, Zheng, Yang, Yixin, Deng, Sirui, Ding, Jinkun, Ma, Wenhan, Li, Rui, Luo, Weilin, Liu, Qun, Sui, Zhifang

arXiv.org Artificial Intelligence

Current evaluations of commonsense reasoning in LLMs are hindered by the scarcity of natural language corpora with structured annotations for reasoning tasks. To address this, we introduce KnowLogic, a benchmark generated through a knowledge-driven synthetic data strategy. KnowLogic integrates diverse commonsense knowledge, plausible scenarios, and various types of logical reasoning. One of the key advantages of KnowLogic is its adjustable difficulty levels, allowing for flexible control over question complexity. It also includes fine-grained labels for in-depth evaluation of LLMs' reasoning abilities across multiple dimensions. Our benchmark consists of 3,000 bilingual (Chinese and English) questions across various domains, and presents significant challenges for current LLMs, with the highest-performing model achieving only 69.57% accuracy. Our analysis highlights common errors, such as misunderstandings of low-frequency commonsense, logical inconsistencies, and overthinking. This approach, along with our benchmark, provides a valuable tool for assessing and enhancing LLMs' commonsense reasoning capabilities and can be applied to a wide range of knowledge domains.


Dehallucinating Parallel Context Extension for Retrieval-Augmented Generation

Ma, Zexiong, An, Shengnan, Lin, Zeqi, Zou, Yanzhen, Lou, Jian-Guang, Xie, Bing

arXiv.org Artificial Intelligence

Large language models (LLMs) are susceptible to generating hallucinated information, despite the integration of retrieval-augmented generation (RAG). Parallel context extension (PCE) is a line of research attempting to effectively integrate parallel (unordered) contexts, but it still suffers from hallucinations when adapted to RAG scenarios. In this paper, we propose DePaC (Dehallucinating Parallel Context Extension), which alleviates the hallucination problem with context-aware negative training and information-calibrated aggregation. DePaC is designed to alleviate two types of in-context hallucination: fact fabrication (i.e., LLMs present claims that are not supported by the contexts) and fact omission (i.e., LLMs fail to present claims that can be supported by the contexts). Specifically, (1) for fact fabrication, we apply context-aware negative training, which fine-tunes the LLMs with negative supervision, explicitly guiding them to refuse to answer when contexts are unrelated to the questions; (2) for fact omission, we propose information-calibrated aggregation, which prioritizes context windows with higher information increment from their contexts. Experimental results on nine RAG tasks demonstrate that DePaC significantly alleviates both types of hallucination and consistently achieves better performance on these tasks.
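The intuition behind prioritizing high-information windows can be sketched in a few lines of NumPy. This is a toy model of my own, not the paper's method: the entropy drop of a per-window answer distribution relative to a context-free prior stands in for DePaC's "information increment", and windows are weighted by that gain.

```python
import numpy as np

def entropy(p):
    p = np.asarray(p, dtype=float)
    return -(p * np.log(p + 1e-12)).sum()

def aggregate(prior, window_dists):
    # Information increment of each window: how much it sharpens the
    # answer distribution relative to the context-free prior.
    gains = np.array([entropy(prior) - entropy(p) for p in window_dists])
    weights = np.clip(gains, 0, None)
    if weights.sum() == 0:
        return np.asarray(prior, dtype=float)  # no informative window: keep the prior
    weights /= weights.sum()
    return sum(w * np.asarray(p) for w, p in zip(weights, window_dists))

prior = [0.25, 0.25, 0.25, 0.25]      # uninformative prior over four answers
windows = [
    [0.26, 0.25, 0.25, 0.24],          # near-useless window, tiny entropy gain
    [0.85, 0.05, 0.05, 0.05],          # informative window, large entropy gain
]
combined = aggregate(prior, windows)
print(combined.argmax())  # 0: the informative window dominates the aggregate
```

The uninformative window gets almost no weight, so a retrieved-but-irrelevant passage cannot drown out the one that actually answers the question.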


Evaluating and Mitigating Social Bias for Large Language Models in Open-ended Settings

Liu, Zhao

arXiv.org Artificial Intelligence

Current social bias benchmarks for Large Language Models (LLMs) primarily rely on pre-defined question formats like multiple choice, limiting their ability to reflect the complexity and open-ended nature of real-world interactions. To address this gap, we extend the existing BBQ dataset by incorporating fill-in-the-blank and short-answer question types, designed to evaluate biases in an open-ended setting. Our findings reveal that LLMs tend to produce responses that are more biased against certain protected attributes, such as age and socio-economic status. On the other hand, these biased outputs can serve as valuable contexts and chains of thought for debiasing. Our debiasing approach, combining zero-shot, few-shot, and chain-of-thought prompting, reduces the level of bias to nearly zero. We open-source our evaluation and debiasing code in the hope of encouraging further measurement and mitigation of bias and stereotypes in LLMs.


I experienced a 'time slip' that doctors say isn't possible

Daily Mail - Science & tech

Sebastian Garrido was traveling to visit his dying grandfather in the hospital when he had what he calls a 'time slip' that changed his views on what happens after we die. Often dramatized in science fiction, a 'time slip' is defined as a moment when someone accidentally travels through time, but Garrido said his all-too-real 'time slip' hit him on the street when he noticed a mysterious figure standing nearby. 'Fancy meeting you here, everything will be okay. Tell your dad I'll be fine,' the man said before disappearing. During the eerie encounter, Garrido, 26, said he 'got goosebumps and then threw up.'


Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings

Neural Information Processing Systems

The blind application of machine learning runs the risk of amplifying biases present in data. Such a danger is facing us with word embedding, a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. We show that even word embeddings trained on Google News articles exhibit female/male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. Geometrically, gender bias is first shown to be captured by a direction in the word embedding. Second, gender neutral words are shown to be linearly separable from gender definition words in the word embedding. Using these properties, we provide a methodology for modifying an embedding to remove gender stereotypes, such as the association between the words receptionist and female, while maintaining desired associations such as between the words queen and female. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving its useful properties such as the ability to cluster related concepts and to solve analogy tasks. The resulting embeddings can be used in applications without amplifying gender bias.
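The geometric core of the method — projecting gender-neutral words off the bias direction — fits in a few lines. This NumPy sketch uses random vectors as stand-in embeddings and a single he/she pair to estimate the direction (the paper uses real embeddings and several definitional pairs):

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 50
# Hypothetical embeddings; in practice these come from a trained model.
emb = {w: rng.normal(size=dim) for w in ["he", "she", "receptionist", "queen"]}

# Estimate the gender direction from a definitional pair and normalize it.
g = emb["he"] - emb["she"]
g /= np.linalg.norm(g)

def neutralize(v, direction):
    # Remove the component of v along the bias direction.
    return v - (v @ direction) * direction

# Neutralize a gender-neutral word; gender-definitional words like
# "queen" are deliberately left untouched.
emb["receptionist"] = neutralize(emb["receptionist"], g)
print(abs(emb["receptionist"] @ g) < 1e-10)  # True: no residual gender component
```

After neutralization, "receptionist" has zero projection onto the gender direction, which is what removes the receptionist/female association while the queen/female one survives.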