AITopics | Grammars & Parsing

Collaborating Authors

Grammars & Parsing

News Overviews Instructional Materials AI-Alerts Classics

A Discussion of the Benchmark Selection for Evaluation

Neural Information Processing SystemsFeb-7-2026, 13:05:34 GMT

Given that NeSS achieves impressive results on synthetic natural language benchmarks in our evaluation, one question is whether it could also improve the performance on commonly used natural language datasets, e.g., large-scale machine translation benchmarks. However, note that most existing natural language benchmarks are not designed for evaluating the compositional generalization performance of models. Instead, the main challenge of those datasets is to handle the inherently ambiguous and potentially noisy natural language inputs. Specifically, their training and test sets are usually from the same distribution, and thus do not evaluate compositional generalization. As a result, we did not run experiments on these datasets.

machine learning, natural language, predictor, (18 more...)

Neural Information Processing Systems

Genre: Workflow (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.43)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.34)

Add feedback

12b1e42dc0746f22cf361267de07073f-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-7-2026, 13:05:16 GMT

We thank all reviewers for constructive comments. We added an ablation study on the SCAN length split to demonstrate its importance. For example, in the test set, there is a new pattern "jump around right thrice" that does not appear in the training set. Recursion and sequence manipulation supported by NeSS are critical to learn such parsing rules to generalize. NeSS is 100% in 2 runs, and 62.5% in 3 runs. When the model predicts the alternative translation, the exact match accuracy becomes lower.

accuracy, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.57)

Add feedback

0a4dc6dae338c9cb08947c07581f77a2-Paper.pdf

Neural Information Processing SystemsFeb-7-2026, 10:43:24 GMT

arxiv, edge transformer, transformer, (16 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Co-PLNet: A Collaborative Point-Line Network for Prompt-Guided Wireframe Parsing

Wang, Chao, Li, Xuanying, Dai, Cheng, Feng, Jinglei, Luo, Yuxiang, Ouyang, Yuqi, Qin, Hao

arXiv.org Machine LearningJan-27-2026

Wireframe parsing aims to recover line segments and their junctions to form a structured geometric representation useful for downstream tasks such as Simultaneous Localization and Mapping (SLAM). Existing methods predict lines and junctions separately and reconcile them post-hoc, causing mismatches and reduced robustness. We present Co-PLNet, a point-line collaborative framework that exchanges spatial cues between the two tasks, where early detections are converted into spatial prompts via a Point-Line Prompt Encoder (PLP-Encoder), which encodes geometric attributes into compact and spatially aligned maps. A Cross-Guidance Line Decoder (CGL-Decoder) then refines predictions with sparse attention conditioned on complementary prompts, enforcing point-line consistency and efficiency. Experiments on Wireframe and YorkUrban show consistent improvements in accuracy and robustness, together with favorable real-time efficiency, demonstrating our effectiveness for structured geometry perception.

machine learning, natural language, prediction, (16 more...)

arXiv.org Machine Learning

2601.18252

Country: Asia > China (0.15)

Genre: Research Report (0.40)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.97)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.66)

Add feedback

Grammar Prompting for Domain-Specific Language Generation with Large Language Models

Neural Information Processing SystemsDec-26-2025, 19:26:29 GMT

Large language models (LLMs) can learn to perform a wide range of natural language tasks from just a handful of in-context examples. However, for generating strings from highly structured languages (e.g., semantic parsing to complex domain-specific languages), it is challenging for the LLM to generalize from just a few exemplars. We propose \emph{grammar prompting}, a simple approach to enable LLMs to use external knowledge and domain-specific constraints, expressed through a grammar in Backus--Naur Form (BNF), during in-context learning. Grammar prompting augments each demonstration example with a specialized grammar that is minimally sufficient for generating the particular output example, where the specialized grammar is a subset of the full DSL grammar. For inference, the LLM first predicts a BNF grammar given a test input, and then generates the output according to the rules of the grammar. Experiments demonstrate that grammar prompting can enable LLMs to perform competitively on a diverse set of DSL generation tasks, including semantic parsing (SMCalFlow, Overnight, GeoQuery), PDDL planning, and SMILES-based molecule generation.

domain-specific language generation, grammar, grammar prompting, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing

Neural Information Processing SystemsDec-26-2025, 10:30:21 GMT

Recent work has shown that generation from a prompted or fine-tuned language model can perform well at semantic parsing when the output is constrained to be a valid semantic representation. We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing, that includes context-free grammars for seven semantic parsing datasets and two syntactic parsing datasets with varied output meaning representations, as well as a constrained decoding interface to generate only valid outputs covered by these grammars. We provide low, medium, and high resource splits for each dataset, allowing accurate comparison of various language models under different data regimes. Our benchmark supports evaluation of language models using prompt-based learning as well as fine-tuning.

benchclamp, language model, syntactic and semantic parsing, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs

Neural Information Processing SystemsDec-26-2025, 06:37:10 GMT

Text-to-SQL parsing, which aims at converting natural language instructions into executable SQLs, has gained increasing attention in recent years. In particular, GPT-4 and Claude-2 have shown impressive results in this task. However, most of the prevalent benchmarks, i.e., Spider, and WikiSQL, focus on database schema with few rows of database contents leaving the gap between academic study and real-world applications. To mitigate this gap, we present BIRD, a BIg benchmark for laRge-scale Database grounded in text-to-SQL tasks, containing 12,751 pairs of text-to-SQL data and 95 databases with a total size of 33.4 GB, spanning 37 professional domains. Our emphasis on database values highlights the new challenges of dirty database contents, external knowledge between NL questions and database contents, and SQL efficiency, particularly in the context of massive databases. To solve these problems, text-to-SQL models must feature database value comprehension in addition to semantic parsing. The experimental results demonstrate the significance of database values in generating accurate text-to-SQLs for big databases.

database interface, large-scale database grounded text-to-sql, name change, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.81)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.77)

Add feedback

Program Synthesis and Semantic Parsing with Learned Code Idioms

Neural Information Processing SystemsDec-25-2025, 23:57:47 GMT

Program synthesis of general-purpose source code from natural language specifications is challenging due to the need to reason about high-level patterns in the target program and low-level implementation details at the same time. In this work, we present Patois, a system that allows a neural program synthesizer to explicitly interleave high-level and low-level reasoning at every generation step. It accomplishes this by automatically mining common code idioms from a given corpus, incorporating them into the underlying language for neural synthesis, and training a tree-based neural synthesizer to use these idioms during code generation. We evaluate Patois on two complex semantic parsing datasets and show that using learned code idioms improves the synthesizer's accuracy.

learned code idiom, name change, program synthesis and semantic parsing, (3 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.66)

Add feedback

QATCH: Benchmarking SQL-centric tasks with Table Representation Learning Models on Your Data

Neural Information Processing SystemsDec-25-2025, 16:22:46 GMT

Table Representation Learning (TRL) models are commonly pre-trained on large open-domain datasets comprising millions of tables and then used to address downstream tasks. Choosing the right TRL model to use on proprietary data can be challenging, as the best results depend on the content domain, schema, and data quality. Our purpose is to support end-users in testing TRL models on proprietary data in two established SQL-centric tasks, i.e., Question Answering (QA) and Semantic Parsing (SP). We present QATCH (Query-Aided TRL Checklist), a toolbox to highlight TRL models' strengths and weaknesses on relational tables unseen at training time. For an input table, QATCH automatically generates a testing checklist tailored to QA and SP. Checklist generation is driven by a SQL query engine that crafts tests of different complexity. This design facilitates inherent portability, allowing the checks to be used by alternative models. We also introduce a set of cross-task performance metrics evaluating the TRL model's performance over its output. Finally, we show how QATCH automatically generates tests for proprietary datasets to evaluate various state-of-the-art models including TAPAS, TAPEX, and CHATGPT.

benchmarking sql-centric task, name change, table representation learning model, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.60)

Add feedback

Filters

Collaborating Authors

Grammars & Parsing

16c0d78ef6a76b5c247113a4c9514059-Supplemental.pdf

A Discussion of the Benchmark Selection for Evaluation

12b1e42dc0746f22cf361267de07073f-AuthorFeedback.pdf

0a4dc6dae338c9cb08947c07581f77a2-Paper.pdf

Co-PLNet: A Collaborative Point-Line Network for Prompt-Guided Wireframe Parsing

Grammar Prompting for Domain-Specific Language Generation with Large Language Models

BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing

Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs

Program Synthesis and Semantic Parsing with Learned Code Idioms

QATCH: Benchmarking SQL-centric tasks with Table Representation Learning Models on Your Data