Learning Transformer Programs

Neural Information Processing Systems

Recent research in mechanistic interpretability has attempted to reverse-engineer Transformer models by carefully inspecting network weights and activations. However, these approaches require considerable manual effort and still fall short of providing complete, faithful descriptions of the underlying algorithms. In this work, we introduce a procedure for training Transformers that are mechanistically interpretable by design. We build on RASP [Weiss et al., 2021], a programming language that can be compiled into Transformer weights. Instead of compiling human-written programs into Transformers, we design a modified Transformer that can be trained using gradient-based optimization and then automatically converted into a discrete, human-readable program. We refer to these models as Transformer Programs. To validate our approach, we learn Transformer Programs for a variety of problems, including an in-context learning task, a suite of algorithmic problems (e.g., sorting, recognizing Dyck languages), and NLP tasks including named entity recognition and text classification.
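
To make the flavor of a decompiled Transformer Program concrete, the sketch below shows RASP-style Python for a toy task (reversing a token sequence). It is a hypothetical illustration: the select/aggregate primitives mirror RASP's attention abstraction, but the helper names and the reversal task are assumptions, not code extracted by the paper's method.

    # Hypothetical sketch of a decompiled "Transformer Program" in RASP style.
    # select/aggregate mirror RASP's attention abstraction; all names and the
    # reversal task are illustrative assumptions, not code from the paper.

    def select(keys, queries, predicate):
        # Attention pattern: for each query position i, mark the key positions j
        # where predicate(keys[j], queries[i]) holds.
        return [[predicate(k, q) for k in keys] for q in queries]

    def aggregate(pattern, values):
        # For each query position, read out the selected value (hard, one-hot
        # attention here, so at most one value is picked per position).
        out = []
        for row in pattern:
            picked = [v for sel, v in zip(row, values) if sel]
            out.append(picked[0] if picked else None)
        return out

    def reverse_program(tokens):
        # One attention "head": position i attends to position n - 1 - i.
        n = len(tokens)
        positions = list(range(n))
        pattern = select(positions, positions, lambda k, q: k == n - 1 - q)
        return aggregate(pattern, tokens)

    assert reverse_program(list("abcd")) == list("dcba")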





Entropy testing and its application to testing Bayesian networks

Neural Information Processing Systems

This paper studies the problem of entropy identity testing: given sample access to a distribution p and a fully described distribution q (both discrete distributions over a domain of size k), and the promise that either p = q or |H(p) − H(q)| ≥ ε, where H(·) denotes the Shannon entropy, a tester needs to distinguish between the two cases with high probability.
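
As a toy illustration of the problem setup (not the paper's tester or its sample-complexity guarantees), the Python sketch below contrasts the two promised cases using a naive plug-in entropy estimate; the distributions, ε, and sample size are made up for illustration.

    import math, random

    def shannon_entropy(dist):
        # H(p) = -sum_i p_i log2 p_i, with the convention 0 log 0 = 0.
        return -sum(p * math.log2(p) for p in dist if p > 0)

    def plugin_entropy(samples, k):
        # Naive plug-in estimate: entropy of the empirical distribution.
        counts = [0] * k
        for s in samples:
            counts[s] += 1
        n = len(samples)
        return shannon_entropy([c / n for c in counts])

    k = 4
    q = [0.25, 0.25, 0.25, 0.25]      # fully described reference distribution
    p_far = [0.85, 0.05, 0.05, 0.05]  # H(p_far) ~ 0.85 bits, far from H(q)

    samples = random.choices(range(k), weights=p_far, k=5000)
    print(shannon_entropy(q))          # H(q) = 2.0 bits
    print(plugin_entropy(samples, k))  # ~0.85 bits: the eps-gap case is visible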


RealTime QA: What's the Answer Right Now?

Jungo Kasai

Neural Information Processing Systems

Q: How many home runs has Shohei Ohtani hit?

Why was the dataset created? REALTIME QA was created to benchmark question answering at the present time: it provides a dynamic platform that asks questions about the current world, challenging QA systems to provide answers (e.g., the number of Shohei Ohtani's home runs) that change in real time. REALTIME QA may identify areas of potential research, such as improving how QA systems deal with unanswerable questions.

What are the instances?




Right this way: Can VLMs Guide Us to See More to Answer Questions?

Neural Information Processing Systems

In question-answering scenarios, humans can assess whether the available information is sufficient and seek additional information if necessary, rather than providing a forced answer. In contrast, Vision Language Models (VLMs) typically generate direct, one-shot responses without evaluating the sufficiency of the information. To investigate this gap, we identify a critical and challenging task in the Visual Question Answering (VQA) scenario: can VLMs indicate how to adjust an image when the visual information is insufficient to answer a question? This capability is especially valuable for assisting visually impaired individuals, who often need guidance to capture images correctly. To evaluate this capability of current VLMs, we introduce a human-labeled dataset as a benchmark for this task. Additionally, we present an automated framework that generates synthetic training data by simulating "where to know" scenarios. Our empirical results show significant performance improvements in mainstream VLMs when fine-tuned with this synthetic data. This study demonstrates the potential to narrow the gap between information assessment and acquisition in VLMs, bringing their performance closer to that of humans.
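
A minimal sketch of how such an evaluation might be wired up: prompt a VLM with the image and question, then map its free-form reply onto a small set of adjustment labels. The query_vlm callable, the label set, the prompt wording, and the accuracy metric below are all illustrative assumptions, not the paper's benchmark protocol.

    # Hypothetical evaluation loop for the "how should the image be adjusted?"
    # task. query_vlm is a stand-in for any VLM API; the label set is assumed.

    LABELS = ["left", "right", "up", "down", "none"]  # "none": image suffices

    def parse_guidance(reply: str) -> str:
        # Map a free-form model reply onto one directional label.
        reply = reply.lower()
        for label in LABELS:
            if label in reply:
                return label
        return "none"

    def evaluate(examples, query_vlm):
        # examples: list of (image, question, gold_label) triples.
        correct = 0
        for image, question, gold in examples:
            prompt = (f"{question}\nIf the image lacks the needed information, "
                      f"answer with one of {LABELS} to say how to move the camera.")
            pred = parse_guidance(query_vlm(image, prompt))
            correct += (pred == gold)
        return correct / len(examples)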