vocab
Compact Proofs of Model Performance via Mechanistic Interpretability
We propose using mechanistic interpretability – techniques for reverse engineering model weights into human-interpretable algorithms – to derive and compactly prove formal guarantees on model performance. We prototype this approach by formally proving accuracy lower bounds for a small transformer trained on Max-of-K, validating proof transferability across 151 random seeds and four values of K. We create 102 different computer-assisted proof strategies and assess their length and tightness of bound on each of our models. Using quantitative metrics, we find that shorter proofs seem to require and provide more mechanistic understanding. Moreover, we find that more faithful mechanistic understanding leads to tighter performance bounds. We confirm these connections by qualitatively examining a subset of our proofs. Finally, we identify compounding structureless errors as a key challenge for using mechanistic interpretability to generate compact proofs on model performance.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
- (3 more...)
- Information Technology > Software (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Provable Benefit of Curriculum in Transformer Tree-Reasoning Post-Training
Bu, Dake, Huang, Wei, Han, Andi, Nitanda, Atsushi, Wong, Hau-San, Zhang, Qingfu, Suzuki, Taiji
Recent curriculum techniques in the post-training stage of LLMs have been widely observed to outperform non-curriculum approaches in enhancing reasoning performance, yet a principled understanding of why and to what extent they work remains elusive. To address this gap, we develop a theoretical framework grounded in the intuition that progressively learning through manageable steps is more efficient than directly tackling a hard reasoning task, provided each stage stays within the model's effective competence. Under mild complexity conditions linking consecutive curriculum stages, we show that curriculum post-training avoids the exponential complexity bottleneck. To substantiate this result, drawing insights from the Chain-of-Thoughts (CoTs) solving mathematical problems such as Countdown and parity, we model CoT generation as a states-conditioned autoregressive reasoning tree, define a uniform-branching base model to capture pretrained behavior, and formalize curriculum stages as either depth-increasing (longer reasoning chains) or hint-decreasing (shorter prefixes) subtasks. Our analysis shows that, under outcome-only reward signals, reinforcement learning finetuning achieves high accuracy with polynomial sample complexity, whereas direct learning suffers from an exponential bottleneck. We further establish analogous guarantees for test-time scaling, where curriculum-aware querying reduces both reward oracle calls and sampling cost from exponential to polynomial order.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Syntactic Learnability of Echo State Neural Language Models at Scale
Ueda, Ryo, Kuribayashi, Tatsuki, Kando, Shunsuke, Inui, Kentaro
What is a neural model with minimum architectural complexity that exhibits reasonable language learning capability? To explore such a simple but sufficient neural language model, we revisit a basic reservoir computing (RC) model, Echo State Network (ESN), a restricted class of simple Recurrent Neural Networks. Our experiments showed that ESN with a large hidden state is comparable or superior to Transformer in grammaticality judgment tasks when trained with about 100M words, suggesting that architectures as complex as that of Transformer may not always be necessary for syntactic learning.
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (6 more...)
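As a rough illustration of the Echo State Network idea mentioned in the abstract above, the sketch below runs a fixed, randomly initialized reservoir over a sequence of one-hot inputs. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the weight scaling, and the toy sizes are all hypothetical, and the trained linear readout that would turn states into next-token predictions is omitted.

```python
import math
import random


def make_reservoir(n_in, n_hidden, scale=0.5, seed=0):
    """Create fixed random input and recurrent weights (never trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_hidden)]
    # Assumption: dividing by n_hidden keeps each row's absolute sum <= scale < 1,
    # which bounds the spectral radius below 1 (a common echo-state heuristic).
    w_rec = [[rng.uniform(-scale, scale) / n_hidden for _ in range(n_hidden)]
             for _ in range(n_hidden)]
    return w_in, w_rec


def step(w_in, w_rec, h, x):
    """One reservoir update: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    return [
        math.tanh(
            sum(w * xi for w, xi in zip(w_in[i], x))
            + sum(w * hi for w, hi in zip(w_rec[i], h))
        )
        for i in range(len(w_rec))
    ]


def run_reservoir(inputs, n_hidden=8, seed=0):
    """Return the hidden state after each input vector."""
    w_in, w_rec = make_reservoir(len(inputs[0]), n_hidden, seed=seed)
    h = [0.0] * n_hidden
    states = []
    for x in inputs:
        h = step(w_in, w_rec, h, x)
        states.append(h)
    return states


# Toy one-hot "token" sequence over a 3-symbol vocabulary.
one_hot = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
states = run_reservoir(one_hot)
```

Only a readout layer on top of `states` would be trained in an ESN; the reservoir weights stay fixed, which is what makes the architecture so simple.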
Notes on the Mathematical Structure of GPT LLM Architectures
Introduction. When considered from a purely mathematical point of view, the building and training of a large (transformer) language model (LLM) is the construction of a function - which can be taken to be a map from some Euclidean space to another - that has certain interesting properties. And therefore, from the point of view of a mathematician, it may be frustrating to find that many key papers announcing significant new LLMs seem reluctant to simply spell out the details of the function that they have constructed in plain mathematical language or indeed even in complete pseudo-code (and the latter form of this complaint appears to be one of the motivations behind a recent article of Phuong and Hutter [1]). Here, we seek to give a relatively 'pure' mathematical description of the architecture of a GPT-3-style LLM. There is then a separate process - the training of the model - in which a particular value θ ∈ Θ is selected using a training algorithm. We will draw attention to such parameters as we introduce them, as opposed to attempting to give a definition of Θ up front.
AutoIRT: Calibrating Item Response Theory Models with Automated Machine Learning
Sharpnack, James, Mulcaire, Phoebe, Bicknell, Klinton, LaFlair, Geoff, Yancey, Kevin
Item response theory (IRT) is a class of interpretable factor models that are widely used in computerized adaptive tests (CATs), such as language proficiency tests. Traditionally, these are fit using parametric mixed effects models on the probability of a test taker getting the correct answer to a test item (i.e., question). Neural net extensions of these models, such as BertIRT, require specialized architectures and parameter tuning. We propose a multistage fitting procedure that is compatible with out-of-the-box Automated Machine Learning (AutoML) tools. It is based on a Monte Carlo EM (MCEM) outer loop with a two-stage inner loop, which trains a non-parametric AutoML grade model using item features followed by an item-specific parametric model. This greatly accelerates the modeling workflow for scoring tests. We demonstrate its effectiveness by applying it to the Duolingo English Test, a high-stakes, online English proficiency test. We show that the resulting model is typically better calibrated, achieves better predictive performance, and yields more accurate scores than existing methods (non-explanatory IRT models and explanatory IRT models like BERT-IRT). Along the way, we provide a brief survey of machine learning methods for calibration of item parameters for CATs.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- (5 more...)
- Research Report (0.82)
- Overview (0.68)
PyMarian: Fast Neural Machine Translation and Evaluation in Python
Gowda, Thamme, Grundkiewicz, Roman, Rippeth, Elijah, Post, Matt, Junczys-Dowmunt, Marcin
The deep learning language of choice these days is Python; measured by factors such as available libraries and technical support, it is hard to beat. At the same time, software written in lower-level programming languages like C++ retains advantages in speed. We describe a Python interface to Marian NMT, a C++-based training and inference toolkit for sequence-to-sequence models, focusing on machine translation. This interface enables models trained with Marian to be connected to the rich, wide range of tools available in Python. A highlight of the interface is the ability to compute state-of-the-art COMET metrics from Python using Marian's inference engine, with a speedup of up to 7.8× over the existing implementations. We also briefly spotlight a number of other integrations, including Jupyter notebooks, connection with prebuilt models, and a web app interface provided with the package. PyMarian is available on PyPI via pip install pymarian.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Portugal > Lisbon > Lisbon (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (5 more...)
On the Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation
Chen, Liang, Ma, Shuming, Zhang, Dongdong, Wei, Furu, Chang, Baobao
While multilingual neural machine translation has achieved great success, it suffers from the off-target issue, where the translation is in the wrong language. This problem is more pronounced on zero-shot translation tasks. In this work, we find that failing to encode a discriminative target-language signal leads to off-target translation, and that a closer lexical distance (i.e., KL-divergence) between two languages' vocabularies is associated with a higher off-target rate. We also find that simply isolating the vocabularies of different languages in the decoder can alleviate the problem. Motivated by these findings, we propose Language Aware Vocabulary Sharing (LAVS), a simple and effective algorithm for constructing the multilingual vocabulary that greatly alleviates the off-target problem of the translation model by increasing the KL-divergence between languages. We conduct experiments on a multilingual machine translation benchmark in 11 languages. Experiments show that the off-target rate for 90 translation tasks is reduced from 29% to 8%, while the overall BLEU score is improved by an average of 1.9 points without extra training cost or sacrificing the supervised directions' performance. We release the code at https://github.com/PKUnlp-icler/Off-Target-MNMT for reproduction.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Oceania > Samoa (0.05)
- North America > Dominican Republic (0.04)
- (9 more...)
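The abstract above uses KL-divergence between two languages' vocabularies as a measure of lexical distance. The sketch below shows one way such a quantity could be computed from token counts. It is a minimal illustration under assumptions, not the paper's procedure: the function names, the add-alpha smoothing, and the toy sentences are all hypothetical.

```python
import math
from collections import Counter


def unigram_distribution(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {tok: (counts[tok] + alpha) / total for tok in vocab}


def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q[t] > 0 wherever p[t] > 0."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)


# Toy corpora (assumption): A and C share most tokens, A and B share none.
lang_a = "the cat sat on the mat".split()
lang_b = "le chat est sur le tapis".split()
lang_c = "the dog sat on the log".split()

vocab = sorted(set(lang_a) | set(lang_b) | set(lang_c))
p_a = unigram_distribution(lang_a, vocab)
p_b = unigram_distribution(lang_b, vocab)
p_c = unigram_distribution(lang_c, vocab)

kl_ab = kl_divergence(p_a, p_b)  # disjoint vocabularies: larger divergence
kl_ac = kl_divergence(p_a, p_c)  # overlapping vocabularies: smaller divergence
```

In this toy setup the overlapping pair yields the smaller divergence, which corresponds to the "closer lexical distance" case the abstract links to a higher off-target rate.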
Nonparametric Decoding for Generative Retrieval
Lee, Hyunji, Kim, Jaeyoung, Chang, Hoyeon, Oh, Hanseok, Yang, Sohee, Karpukhin, Vlad, Lu, Yi, Seo, Minjoon
Because the generative retrieval model depends solely on the information encoded in its model parameters, without external memory, its information capacity is limited and fixed. To overcome this limitation, we propose Nonparametric Decoding (Np Decoding), which can be applied to existing generative retrieval models. Np Decoding uses nonparametric contextualized vocab embeddings (external memory) rather than vanilla vocab embeddings as decoder vocab embeddings. By leveraging the contextualized vocab embeddings, the generative retrieval model is able to utilize both the parametric and nonparametric space. Evaluation over 9 datasets (8 single-hop and 1 multi-hop) in the document retrieval task shows that applying Np Decoding to generative retrieval models significantly improves performance. We also show that Np Decoding is data- and parameter-efficient and achieves high performance in the zero-shot setting.
- Europe > United Kingdom > England > Lincolnshire (0.15)
- North America > United States (0.14)
- Africa > South Africa > Western Cape > Cape Town (0.05)
- (4 more...)
Understanding Text Classification Data and Models Using Aggregated Input Salience
Ebert, Sebastian, Jakobovits, Alice Shoshana, Filippova, Katja
Realizing when a model is right for a wrong reason is not trivial and requires a significant effort by model developers. In some cases an input salience method, which highlights the most important parts of the input, may reveal problematic reasoning. But scrutinizing highlights over many data instances is tedious and often infeasible. Furthermore, analyzing examples in isolation does not reveal general patterns in the data or in the model's behavior. In this paper we aim to address these issues and go from understanding single examples to understanding entire datasets and models. The methodology we propose is based on aggregated salience maps, to which we apply clustering, nearest neighbor search and visualizations. Using this methodology we address multiple distinct but common model developer needs by showing how problematic data and model behavior can be identified and explained -- a necessary first step for improving the model.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom (0.14)
- (15 more...)