AITopics | Large Language Model

Large Language Model

Grammar-Aligned Decoding

Neural Information Processing Systems | Feb-10-2026, 04:06:57 GMT

Specifically, in grammar-constrained decoding (GCD), the LLM's output must follow a given grammar. Our algorithm uses prior sample outputs to soundly overapproximate the future grammaticality of different output prefixes.

artificial intelligence, large language model, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Seattle (0.04)
Asia > Singapore (0.04)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.37)

Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion

Neural Information Processing Systems | Feb-10-2026, 03:21:04 GMT

Work done as a visiting student at MIT. 38th Conference on Neural Information Processing Systems (NeurIPS 2024).

large language model, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report > Experimental Study (0.92)

Industry:

Energy (0.67)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

2ae6b2bdf3a179e3e24129e2c54bd871-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 02:59:53 GMT

original performance 0, performance 0, performance aal, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.58)

Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Neural Information Processing Systems | Feb-10-2026, 02:57:56 GMT

goldfish loss, memorization, training data, (16 more...)

Neural Information Processing Systems

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(5 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Law (0.93)
Government > Regional Government > North America Government > United States Government (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Learning to Follow Instructions in Text-Based Games

Neural Information Processing Systems | Feb-10-2026, 02:19:33 GMT

In this paper, we study instruction following in text-based games and propose an approach that advances the previous state of the art.

large language model, logic & formal reasoning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.05)
Africa > Ethiopia (0.04)
(7 more...)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)

Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias

Neural Information Processing Systems | Feb-10-2026, 01:33:30 GMT

Intrinsic evaluations focus on the inherent properties of the model, while extrinsic evaluations measure biases in the context of specific tasks.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.67)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation

Neural Information Processing Systems | Feb-10-2026, 00:56:31 GMT

The task of Visual Object Navigation (VON) involves an agent's ability to locate

large language model, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

PRODIGY: Enabling In-context Learning Over Graphs

Neural Information Processing Systems | Feb-10-2026, 00:36:47 GMT

While large language models have demonstrated this ability, how in-context learning could be performed over graphs is unexplored. In this paper, we develop Pretraining Over Diverse In-Context Graph Systems (PRODIGY), the first pretraining framework that enables in-context learning over graphs.

data mining, large language model, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)

Gaussian Match-and-Copy: A Minimalist Benchmark for Studying Transformer Induction

Gonon, Antoine, Cordonnier, Alexandre, Boumal, Nicolas

arXiv.org Machine Learning | Feb-10-2026

Match-and-copy is a core retrieval primitive used at inference time by large language models to retrieve a matching token from the context then copy its successor. Yet, understanding how this behavior emerges on natural data is challenging because retrieval and memorization are entangled. To disentangle the two, we introduce Gaussian Match-and-Copy (GMC), a minimalist benchmark that isolates long-range retrieval through pure second-order correlation signals. Numerical investigations show that this task retains key qualitative aspects of how Transformers develop match-and-copy circuits in practice, and separates architectures by their retrieval capabilities. We also analyze the optimization dynamics in a simplified attention setting. Although many solutions are a priori possible under a regression objective, including ones that do not implement retrieval, we identify an implicit-bias regime in which gradient descent drives the parameters to diverge while their direction aligns with the max-margin separator, yielding hard match selection. We prove this max-margin alignment for GD trajectories that reach vanishing empirical loss under explicit technical conditions.
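
The match-and-copy primitive described in the abstract can be illustrated on discrete tokens. The sketch below is only a toy: it is not the Gaussian benchmark from the paper, and the function name `match_and_copy` and the example sequence are assumptions made for illustration. It finds the most recent earlier occurrence of a query token in the context and returns the token that follows it.

```python
def match_and_copy(context, query):
    """Return the successor of the most recent occurrence of `query`.

    Hypothetical helper for illustration only. Positions are scanned
    from right to left, skipping the final position because it has no
    successor to copy. Returns None when no match exists.
    """
    for i in range(len(context) - 2, -1, -1):
        if context[i] == query:      # match step: locate the query token
            return context[i + 1]    # copy step: emit its successor
    return None


context = ["A", "B", "C", "A", "D"]
prediction = match_and_copy(context, "A")
print(prediction)
```

In this toy, the match is an exact equality test; in the setting the abstract describes, the match signal comes from second-order correlations rather than token identity.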

large language model, machine learning, natural language, (20 more...)

arXiv.org Machine Learning

2602.07562

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(10 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Beyond Arrow: From Impossibility to Possibilities in Multi-Criteria Benchmarking

Gordienko, Polina, Jansen, Christoph, Rodemann, Julian, Schollmeyer, Georg

arXiv.org Machine Learning | Feb-10-2026

Modern benchmarks such as HELM MMLU account for multiple metrics like accuracy, robustness and efficiency. When trying to turn these metrics into a single ranking, natural aggregation procedures can become incoherent or unstable to changes in the model set. We formalize this aggregation as a social choice problem where each metric induces a preference ranking over models on each dataset, and a benchmark operator aggregates these votes across metrics. While prior work has focused on Arrow's impossibility result, we argue that the impossibility often originates from pathological examples and identify sufficient conditions under which these disappear, and meaningful multi-criteria benchmarking becomes possible. In particular, we deal with three restrictions on the combinations of rankings and prove that on single-peaked, group-separable and distance-restricted preferences, the benchmark operator allows for the construction of well-behaved rankings of the involved models. Empirically, we investigate several modern benchmark suites like HELM MMLU and verify which structural conditions are fulfilled on which benchmark problems.
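
The instability to changes in the model set mentioned in the abstract can be seen in a small worked example. The sketch below assumes Borda counting as the aggregation rule, which is a common choice swapped in for illustration and not necessarily the benchmark operator studied in the paper; the model names, the five metric rankings, and the helpers `borda` and `drop` are likewise hypothetical.

```python
from collections import defaultdict


def borda(rankings):
    """Aggregate rankings (best first) by Borda count.

    A model in position p of a ranking over n models earns n - 1 - p
    points. Ties are broken alphabetically so the output is deterministic.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, model in enumerate(ranking):
            scores[model] += n - 1 - position
    return sorted(scores, key=lambda m: (-scores[m], m))


def drop(rankings, model):
    """Remove one model from every ranking, keeping the relative order."""
    return [[m for m in ranking if m != model] for ranking in rankings]


# Hypothetical data: three metrics rank A > B > C, two rank B > C > A.
rankings = [["A", "B", "C"]] * 3 + [["B", "C", "A"]] * 2

full = borda(rankings)                 # scores: A=6, B=7, C=2
reduced = borda(drop(rankings, "C"))   # scores: A=3, B=2

print(full)
print(reduced)
```

With all three models, B is ranked above A; after removing C, which no metric ranks first on its own merits relative to B, the order of A and B flips. No individual metric changed its opinion about A versus B, yet the aggregate did.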

large language model, machine learning, ranking, (19 more...)

arXiv.org Machine Learning

2602.07593

Country: