AITopics | nonterminal

Collaborating Authors

nonterminal

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Deep networks learn to parse uniform-depth context-free languages from local statistics

Parley, Jack T., Cagnetta, Francesco, Wyart, Matthieu

arXiv.org Machine LearningFeb-10-2026

Understanding how the structure of language can be learned from sentences alone is a central question in both cognitive science and machine learning. Studies of the internal representations of Large Language Models (LLMs) support their ability to parse text when predicting the next word, while representing semantic notions independently of surface form. Yet, which data statistics make these feats possible, and how much data is required, remain largely unknown. Probabilistic context-free grammars (PCFGs) provide a tractable testbed for studying these questions. However, prior work has focused either on the post-hoc characterization of the parsing-like algorithms used by trained networks; or on the learnability of PCFGs with fixed syntax, where parsing is unnecessary. Here, we (i) introduce a tunable class of PCFGs in which both the degree of ambiguity and the correlation structure across scales can be controlled; (ii) provide a learning mechanism -- an inference algorithm inspired by the structure of deep convolutional networks -- that links learnability and sample complexity to specific language statistics; and (iii) validate our predictions empirically across deep convolutional and transformer-based architectures. Overall, we propose a unifying framework where correlations at different scales lift local ambiguities, enabling the emergence of hierarchical representations of the data.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2602.06065

Country:

Europe > Switzerland > Vaud > Lausanne (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(10 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

5c5a93a042235058b1ef7b0ac1e11b67-Paper.pdf

Neural Information Processing SystemsFeb-8-2026, 21:16:25 GMT

architecture, program derivation graph, synthesis, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
Europe > France (0.04)
(8 more...)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.71)
(2 more...)

Add feedback

Probability Distribution Collapse: A Critical Bottleneck to Compact Unsupervised Neural Grammar Induction

Park, Jinwook, Kim, Kangil

arXiv.org Artificial IntelligenceSep-26-2025

Unsupervised neural grammar induction aims to learn interpretable hierarchical structures from language data. However, existing models face an expressiveness bottleneck, often resulting in unnecessarily large yet underperforming grammars. We identify a core issue, $\textit{probability distribution collapse}$, as the underlying cause of this limitation. We analyze when and how the collapse emerges across key components of neural parameterization and introduce a targeted solution, $\textit{collapse-relaxing neural parameterization}$, to mitigate it. Our approach substantially improves parsing performance while enabling the use of significantly more compact grammars across a wide range of languages, as demonstrated through extensive empirical analysis.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2509.20734

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

5c5a93a042235058b1ef7b0ac1e11b67-Paper.pdf

Neural Information Processing SystemsAug-14-2025, 17:14:23 GMT

architecture, program derivation graph, synthesis, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France (0.04)
(9 more...)

Genre: Research Report (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.71)
(2 more...)

Add feedback

Towards Semantic Markup of Mathematical Documents via User Interaction

Vrečar, Luka, Wells, Joe, Kamareddine, Fairouz

arXiv.org Artificial IntelligenceAug-5-2024

Mathematical documents written in LaTeX often contain ambiguities. We can resolve some of them via semantic markup using, e.g., sTeX, which also has other potential benefits, such as interoperability with computer algebra systems, proof systems, and increased accessibility. However, semantic markup is more involved than "regular" typesetting and presents a challenge for authors of mathematical documents. We aim to smooth out the transition from plain LaTeX to semantic markup by developing semi-automatic tools for authors. In this paper we present an approach to semantic markup of formulas by (semi-)automatically generating grammars from existing sTeX macro definitions and parsing mathematical formulas with them. We also present a GUI-based tool for the disambiguation of parse results and showcase its functionality and potential using a grammar for parsing untyped $\lambda$-terms.

formula, grammar, macro, (15 more...)

arXiv.org Artificial Intelligence

2408.04656

Country:

Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Structural Optimization Ambiguity and Simplicity Bias in Unsupervised Neural Grammar Induction

Park, Jinwook, Kim, Kangil

arXiv.org Artificial IntelligenceJul-23-2024

Neural parameterization has significantly advanced unsupervised grammar induction. However, training these models with a traditional likelihood loss for all possible parses exacerbates two issues: 1) $\textit{structural optimization ambiguity}$ that arbitrarily selects one among structurally ambiguous optimal grammars despite the specific preference of gold parses, and 2) $\textit{structural simplicity bias}$ that leads a model to underutilize rules to compose parse trees. These challenges subject unsupervised neural grammar induction (UNGI) to inevitable prediction errors, high variance, and the necessity for extensive grammars to achieve accurate predictions. This paper tackles these issues, offering a comprehensive analysis of their origins. As a solution, we introduce $\textit{sentence-wise parse-focusing}$ to reduce the parse pool per sentence for loss evaluation, using the structural bias from pre-trained parsers on the same dataset. In unsupervised parsing benchmark tests, our method significantly improves performance while effectively reducing variance and bias toward overly simplistic parses. Our research promotes learning more compact, accurate, and consistent explicit grammars, facilitating better interpretability.

grammar, parse tree, probability, (15 more...)

arXiv.org Artificial Intelligence

2407.16181

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New Jersey (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Directed Regular and Context-Free Languages

Ganardi, Moses, Saglam, Irmak, Zetzsche, Georg

arXiv.org Artificial IntelligenceJan-18-2024

We study the problem of deciding whether a given language is directed. A language $L$ is \emph{directed} if every pair of words in $L$ have a common (scattered) superword in $L$. Deciding directedness is a fundamental problem in connection with ideal decompositions of downward closed sets. Another motivation is that deciding whether two \emph{directed} context-free languages have the same downward closures can be decided in polynomial time, whereas for general context-free languages, this problem is known to be coNEXP-complete. We show that the directedness problem for regular languages, given as NFAs, belongs to $AC^1$, and thus polynomial time. Moreover, it is NL-complete for fixed alphabet sizes. Furthermore, we show that for context-free languages, the directedness problem is PSPACE-complete.

atom, ideal representation, representation, (16 more...)

arXiv.org Artificial Intelligence

2401.07106

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Lazio > Rome (0.04)
Europe > Germany > Saarland > Saarbrücken (0.04)
(16 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.67)

Add feedback

Greedy Grammar Induction with Indirect Negative Evidence

Potashnik, Joseph

arXiv.org Artificial IntelligenceDec-23-2023

This paper offers a fresh look at the pumping lemma constant as an upper bound for the finite structural information of a Context Free Grammar. An objective function based on indirect negative evidence considers the occurrences, and non-occurrences, of a finite number of trees, encountered after a sufficiently long non-adversial input presentation. This objective function has optimal substructure in the hypotheses space, giving rise to a greedy search learner. With this learner, a range of classes of Context Free Languages is shown to be learnable (identifiable in the limit) on an otherwise intractable hypotheses space.

fitness function, grammar, learner, (14 more...)

arXiv.org Artificial Intelligence

2312.15321

Country:

Europe > Spain > Valencian Community > Alicante Province > Alicante (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

An Exploration of Left-Corner Transformations

Opedal, Andreas, Tsipidi, Eleftheria, Pimentel, Tiago, Cotterell, Ryan, Vieira, Tim

arXiv.org Artificial IntelligenceNov-27-2023

The left-corner transformation (Rosenkrantz and Lewis, 1970) is used to remove left recursion from context-free grammars, which is an important step towards making the grammar parsable top-down with simple techniques. This paper generalizes prior left-corner transformations to support semiring-weighted production rules and to provide finer-grained control over which left corners may be moved. Our generalized left-corner transformation (GLCT) arose from unifying the left-corner transformation and speculation transformation (Eisner and Blatz, 2007), originally for logic programming. Our new transformation and speculation define equivalent weighted languages. Yet, their derivation trees are structurally different in an important way: GLCT replaces left recursion with right recursion, and speculation does not. We also provide several technical results regarding the formal relationships between the outputs of GLCT, speculation, and the original grammar. Lastly, we empirically investigate the efficiency of GLCT for left-recursion elimination from grammars of nine languages.

grammar, nonterminal, transformation, (16 more...)

arXiv.org Artificial Intelligence

2311.16258

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New Jersey (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Efficient Algorithms for Recognizing Weighted Tree-Adjoining Languages

Butoi, Alexandra, Vieira, Tim, Cotterell, Ryan, Chiang, David

arXiv.org Artificial IntelligenceOct-23-2023

The class of tree-adjoining languages can be characterized by various two-level formalisms, consisting of a context-free grammar (CFG) or pushdown automaton (PDA) controlling another CFG or PDA. These four formalisms are equivalent to tree-adjoining grammars (TAG), linear indexed grammars (LIG), pushdown-adjoining automata (PAA), and embedded pushdown automata (EPDA). We define semiring-weighted versions of the above two-level formalisms, and we design new algorithms for computing their stringsums (the weight of all derivations of a string) and allsums (the weight of all derivations). From these, we also immediately obtain stringsum and allsum algorithms for TAG, LIG, PAA, and EPDA. For LIG, our algorithm is more time-efficient by a factor of $\mathcal{O}(n|\mathcal{N}|)$ (where $n$ is the string length and $|\mathcal{N}|$ is the size of the nonterminal set) and more space-efficient by a factor of $\mathcal{O}(|\Gamma|)$ (where $|\Gamma|$ is the size of the stack alphabet) than the algorithm of Vijay-Shanker and Weir (1989). For EPDA, our algorithm is both more space-efficient and time-efficient than the algorithm of Alonso et al. (2001) by factors of $\mathcal{O}(|\Gamma|^2)$ and $\mathcal{O}(|\Gamma|^3)$, respectively. Finally, we give the first PAA stringsum and allsum algorithms.

algorithm, normal form, pop computation, (16 more...)

arXiv.org Artificial Intelligence

2310.15276

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > Pennsylvania (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback