Goto

Collaborating Authors

 Large Language Model


Grammar-AlignedDecoding

Neural Information Processing Systems

Specifically, ingrammar-constrained decoding(GCD), the LLM'soutput must follow agiven grammar. Our algorithm uses prior sample outputs to soundly overapproximate the future grammaticality of different output prefixes.








PRODIGY: Enabling In-context Learning Over Graphs

Neural Information Processing Systems

While large language models have demonstrated this ability, how in-context learning could be performed over graphs is unexplored. In this paper, we develop Pr etraining O ver D iverse I n-Context G raph S y stems (PRODIGY), the first pretraining framework that enables in-context learning over graphs.


Gaussian Match-and-Copy: A Minimalist Benchmark for Studying Transformer Induction

arXiv.org Machine Learning

Match-and-copy is a core retrieval primitive used at inference time by large language models to retrieve a matching token from the context then copy its successor. Yet, understanding how this behavior emerges on natural data is challenging because retrieval and memorization are entangled. To disentangle the two, we introduce Gaussian Match-and-Copy (GMC), a minimalist benchmark that isolates long-range retrieval through pure second-order correlation signals. Numerical investigations show that this task retains key qualitative aspects of how Transformers develop match-and-copy circuits in practice, and separates architectures by their retrieval capabilities. We also analyze the optimization dynamics in a simplified attention setting. Although many solutions are a priori possible under a regression objective, including ones that do not implement retrieval, we identify an implicit-bias regime in which gradient descent drives the parameters to diverge while their direction aligns with the max-margin separator, yielding hard match selection. We prove this max-margin alignment for GD trajectories that reach vanishing empirical loss under explicit technical conditions.


Beyond Arrow: From Impossibility to Possibilities in Multi-Criteria Benchmarking

arXiv.org Machine Learning

Modern benchmarks such as HELM MMLU account for multiple metrics like accuracy, robustness and efficiency. When trying to turn these metrics into a single ranking, natural aggregation procedures can become incoherent or unstable to changes in the model set. We formalize this aggregation as a social choice problem where each metric induces a preference ranking over models on each dataset, and a benchmark operator aggregates these votes across metrics. While prior work has focused on Arrow's impossibility result, we argue that the impossibility often originates from pathological examples and identify sufficient conditions under which these disappear, and meaningful multi-criteria benchmarking becomes possible. In particular, we deal with three restrictions on the combinations of rankings and prove that on single-peaked, group-separable and distance-restricted preferences, the benchmark operator allows for the construction of well-behaved rankings of the involved models. Empirically, we investigate several modern benchmark suites like HELM MMLU and verify which structural conditions are fulfilled on which benchmark problems.