AITopics | Grammars & Parsing

Reviews: Predicting Scene Parsing and Motion Dynamics in the Future

Neural Information Processing Systems, Oct-8-2024, 00:52:03 GMT

The paper proposes a deep-learning-based approach to joint prediction of future optical flow and semantic segmentation in videos. The authors evaluate the approach in a driving scenario and show that the two components, flow prediction and semantic segmentation prediction, benefit from each other. The paper is related to the works of Jin et al. and Neverova et al. However, as far as I understand, neither of these had been officially published at the time of submission (and the work of Neverova et al. Detailed comment: Pros: 1) The idea seems sound: predicting segmentation and optical flow are both important tasks, and they should be mutually beneficial.

prediction, scene parsing and motion dynamic, segmentation, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.57)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.40)

Parameter Choice and Neuro-Symbolic Approaches for Deep Domain-Invariant Learning

Dinu, Marius-Constantin

arXiv.org Artificial Intelligence, Oct-8-2024

As artificial intelligence (AI) systems advance, we move towards broad AI: systems capable of performing well on diverse tasks, understanding context, and adapting rapidly to new scenarios. A central challenge for broad AI systems is to generalize over tasks in related domains and to be robust to distribution shifts. Neuro-symbolic (NeSy) AI bridges the gap between symbolic and sub-symbolic paradigms to address these challenges, enabling adaptable, generalizable, and more interpretable systems. The development of broad AI requires advancements in domain adaptation (DA), enabling models trained on source domains to effectively generalize to unseen target domains. Traditional approaches often rely on parameter optimization and fine-tuning, which can be impractical due to high costs and risks of catastrophic forgetting. NeSy AI systems use multiple models and methods to generalize to unseen domains and maintain performance across varying conditions. We analyze common DA and NeSy approaches with a focus on deep domain-invariant learning, extending to real-world challenges such as adapting to continuously changing domains and handling large domain gaps. We showcase state-of-the-art model-selection methods for scenarios with limited samples and introduce domain-specific adaptations without gradient-based updates for cases where model tuning is infeasible. This work establishes a framework for scalable and generalizable broad AI systems applicable across various problem settings, demonstrating how symbolic reasoning and large language models can build universal computational graphs that generalize across domains and problems, contributing to more adaptable AI approaches for real-world applications.

activation dropout fully-connected layer 128, generalist foundation model outcompete special-purpose, neural network meet neural-symbolic computing, (15 more...)

arXiv.org Artificial Intelligence

2410.06235

Country:

Europe > Austria > Vienna (0.13)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(9 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Automobiles & Trucks (0.92)
Health & Medicine > Therapeutic Area > Neurology (0.92)
(3 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
(12 more...)

Reviews: Learning Pipelines with Limited Data and Domain Knowledge: A Study in Parsing Physics Problems

Neural Information Processing Systems, Oct-7-2024, 23:45:38 GMT

The main idea is the use of PSL (probabilistic soft logic) as a framework to map partial estimates from multiple feedforward algorithms, along with domain-specific logical rules, to parse visual diagrams from physics texts. Specifically, the pipelines use feature extractors for lines, arcs, corners, text elements, and object elements (e.g., blocks in physics diagrams). These are combined with human-specified rules for groupings, high-level elements, and text/figure labeling schemes, along with the inference engine, to produce the parse into a formal logical language. Experiments illustrate how the learned system: 1) is superior to the state-of-the-art diagram parsing scheme, 2) can utilize labelled as well as unlabelled data to achieve improved performance, 3) can handle various degrees of supervision in different parts of the pipeline and is robust, and 4) through integrative modeling of the stages in the pipeline prevents error propagation. Quality, clarity, originality, significance of the paper: The paper is well written and has extensive references to relevant literature and adequate experimentation.

data and domain knowledge, diagram, parsing physics problem, (9 more...)

Neural Information Processing Systems

Genre: Summary/Review (0.36)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.64)

Progressive distillation induces an implicit curriculum

Panigrahi, Abhishek, Liu, Bingbin, Malladi, Sadhika, Risteski, Andrej, Goel, Surbhi

arXiv.org Artificial Intelligence, Oct-7-2024

Knowledge distillation leverages a teacher model to improve the training of a student model. A persistent challenge is that a better teacher does not always yield a better student, to which a common mitigation is to use additional supervision from several "intermediate" teachers. One empirically validated variant of this principle is progressive distillation, where the student learns from successive intermediate checkpoints of the teacher. Using sparse parity as a sandbox, we identify an implicit curriculum as one mechanism through which progressive distillation accelerates the student's learning. This curriculum is available only through the intermediate checkpoints but not the final converged one, and imparts both empirical acceleration and a provable sample complexity benefit to the student. We then extend our investigation to Transformers trained on probabilistic context-free grammars (PCFGs) and real-world pre-training datasets (Wikipedia and Books). Through probing the teacher model, we identify an analogous implicit curriculum where the model progressively learns features that capture longer context. Our theoretical and empirical findings on sparse parity, complemented by empirical observations on more complex tasks, highlight the benefit of progressive distillation via implicit curriculum across setups.
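
To make the sparse parity sandbox and the idea of learning from successive teacher checkpoints concrete, here is a minimal sketch. It is not the authors' code: the function names, the fixed `steps_per_checkpoint` schedule, and the example support set are all assumptions chosen for illustration.

```python
import random


def sparse_parity(x, support):
    """Label of a +/-1 input vector: the product of the coordinates in `support`."""
    label = 1
    for i in support:
        label *= x[i]
    return label


def teacher_checkpoint_for_step(step, num_checkpoints, steps_per_checkpoint):
    """Hypothetical schedule: which saved teacher checkpoint the student imitates.

    The student moves to the next checkpoint every `steps_per_checkpoint`
    steps and stays on the final one afterwards.
    """
    return min(step // steps_per_checkpoint, num_checkpoints - 1)


# One random 8-dimensional +/-1 input with a hidden support of size 3 (assumed values).
rng = random.Random(0)
x = [rng.choice([-1, 1]) for _ in range(8)]
support = [1, 4, 6]
y = sparse_parity(x, support)

# Checkpoint index used at a few student training steps.
schedule = [
    teacher_checkpoint_for_step(s, num_checkpoints=3, steps_per_checkpoint=100)
    for s in (0, 99, 100, 250, 1000)
]
```

One-shot distillation corresponds to always using the last checkpoint index; the progressive variant described in the abstract differs only in exposing the earlier checkpoints first.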

checkpoint, distillation, progressive distillation, (14 more...)

arXiv.org Artificial Intelligence

2410.05464

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(7 more...)

Genre: Research Report > New Finding (0.67)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

Leveraging Grammar Induction for Language Understanding and Generation

Kai, Jushi, Hou, Shengyuan, Huang, Yusheng, Lin, Zhouhan

arXiv.org Artificial Intelligence, Oct-7-2024

Grammar induction has made significant progress in recent years. However, it is not clear how the application of induced grammar could enhance practical performance in downstream tasks. In this work, we introduce an unsupervised grammar induction method for language understanding and generation. We construct a grammar parser to induce constituency structures and dependency relations, which is simultaneously trained on downstream tasks without additional syntax annotations. The induced grammar features are subsequently incorporated into Transformer as a syntactic mask to guide self-attention. We evaluate and apply our method to multiple machine translation tasks and natural language understanding tasks. Our method demonstrates superior performance compared to the original Transformer and other models enhanced with external parsers. Experimental results indicate that our method is effective in both from-scratch and pre-trained scenarios. Additionally, our research highlights the contribution of explicitly modeling the grammatical structure of texts to neural network models.
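
As a rough illustration of what a syntactic mask guiding self-attention can look like, the sketch below builds a boolean mask from constituent spans and applies it to one row of attention scores. This is an assumption-laden toy, not the paper's method: the abstract does not specify how the mask is computed, and `span_mask`, `masked_softmax`, and the example spans are hypothetical.

```python
import math


def span_mask(n, spans):
    """mask[i][j] is True when tokens i and j share a span (end index exclusive).

    Every token may always attend to itself.
    """
    mask = [[i == j for j in range(n)] for i in range(n)]
    for start, end in spans:
        for i in range(start, end):
            for j in range(start, end):
                mask[i][j] = True
    return mask


def masked_softmax(scores, allowed):
    """Softmax over one row of scores; disallowed positions get zero weight."""
    kept = [s for s, ok in zip(scores, allowed) if ok]
    m = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - m) if ok else 0.0 for s, ok in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


# Four tokens split into two assumed constituents: [0, 2) and [2, 4).
mask = span_mask(4, [(0, 2), (2, 4)])
weights = masked_softmax([0.5, 1.0, 2.0, 0.1], mask[0])
```

Here token 0 distributes all of its attention over tokens 0 and 1, even though token 2 has the highest raw score. A hard mask is the simplest choice; a soft, learned mask would replace the boolean entries with real-valued weights.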

computational linguistic, parser, transformer, (15 more...)

arXiv.org Artificial Intelligence

2410.04878

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
(13 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

On Eliciting Syntax from Language Models via Hashing

Wang, Yiran, Utiyama, Masao

arXiv.org Artificial Intelligence, Oct-5-2024

Unsupervised parsing, also known as grammar induction, aims to infer syntactic structure from raw text. Recently, binary representation has exhibited remarkable information-preserving capabilities at both lexicon and syntax levels. In this paper, we explore the possibility of leveraging this capability to deduce parsing trees from raw text, relying solely on the implicitly induced grammars within models. To achieve this, we upgrade the bit-level CKY from zero-order to first-order to encode the lexicon and syntax in a unified binary representation space, switch training from supervised to unsupervised under the contrastive hashing framework, and introduce a novel loss function to impose stronger yet balanced alignment signals. Our model shows competitive performance on various datasets. We therefore claim that our method is effective and efficient enough to acquire high-quality parsing trees from pre-trained language models at a low cost.

computational linguistic, proceedings, span, (13 more...)

arXiv.org Artificial Intelligence

2410.04074

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(17 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On Uncertainty In Natural Language Processing

Ulmer, Dennis

arXiv.org Artificial Intelligence, Oct-4-2024

The last decade in deep learning has brought on increasingly capable systems that are deployed on a wide variety of applications. In natural language processing, the field has been transformed by a number of breakthroughs including large language models, which are used in increasingly many user-facing applications. In order to reap the benefits of this technology and reduce potential harms, it is important to quantify the reliability of model predictions and the uncertainties that shroud their development. This thesis studies how uncertainty in natural language processing can be characterized from a linguistic, statistical and neural perspective, and how it can be reduced and quantified through the design of the experimental pipeline. We further explore uncertainty quantification in modeling by theoretically and empirically investigating the effect of inductive model biases in text classification tasks. The corresponding experiments include data for three different languages (Danish, English and Finnish) and tasks as well as a large set of different uncertainty quantification approaches. Additionally, we propose a method for calibrated sampling in natural language generation based on non-exchangeable conformal prediction, which provides tighter token sets with better coverage of the actual continuation. Lastly, we develop an approach to quantify confidence in large black-box language models using auxiliary predictors, where the confidence is predicted from the input to and generated output text of the target model alone.
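
For readers unfamiliar with conformal prediction, the sketch below shows the standard split (exchangeable) variant applied to next-token probabilities. It is deliberately simpler than the non-exchangeable method the thesis proposes, which is not reproduced here; the function names, the nonconformity score `1 - p`, and all numbers are assumptions for illustration.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Threshold for split conformal prediction.

    `calibration_scores` are nonconformity scores (here 1 - probability of the
    true token) from held-out data. Returns the ceil((n + 1) * (1 - alpha))-th
    smallest score, or infinity when the calibration set is too small.
    """
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")
    return sorted(calibration_scores)[rank - 1]


def prediction_set(token_probs, threshold):
    """All tokens whose nonconformity score 1 - p does not exceed the threshold."""
    return {tok for tok, p in token_probs.items() if 1.0 - p <= threshold}


# Assumed calibration scores and a toy next-token distribution.
calibration = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
q = conformal_threshold(calibration, alpha=0.2)
candidates = prediction_set(
    {"cat": 0.5, "dog": 0.3, "eel": 0.15, "fox": 0.05}, q
)
```

Under exchangeability, sets built this way contain the true token with probability at least `1 - alpha`. Generated text violates that assumption, which is the motivation for the non-exchangeable variant mentioned in the abstract.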

artificial intelligence, chatbot, large language model, (24 more...)

arXiv.org Artificial Intelligence

2410.03446

Country:

North America > United States > California > Los Angeles County > Long Beach (0.14)
North America > United States > New York > New York County > New York City (0.13)
Europe > Denmark > Capital Region > Copenhagen (0.13)
(71 more...)

Genre:

Summary/Review (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
(3 more...)

Industry:

Transportation (1.00)
Law (1.00)
Information Technology (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(11 more...)

Should Cross-Lingual AMR Parsing go Meta? An Empirical Assessment of Meta-Learning and Joint Learning AMR Parsing

Kang, Jeongwoo, Coavoux, Maximin, Lopez, Cédric, Schwab, Didier

arXiv.org Artificial Intelligence, Oct-4-2024

Cross-lingual AMR parsing is the task of predicting AMR graphs in a target language when training data is available only in a source language. Due to the small size of AMR training data and evaluation data, cross-lingual AMR parsing has only been explored in a small set of languages such as English, Spanish, German, Chinese, and Italian. Taking inspiration from Langedijk et al. (2022), who apply meta-learning to tackle cross-lingual syntactic parsing, we investigate the use of meta-learning for cross-lingual AMR parsing. We evaluate our models in k-shot scenarios (including 0-shot) and assess their effectiveness in Croatian, Farsi, Korean, Chinese, and French. Notably, Korean and Croatian test sets are developed as part of our work, based on the existing English AMR corpus of The Little Prince, and made publicly available. We empirically study our method by comparing it to classical joint learning. Our findings suggest that while the meta-learning model performs slightly better in 0-shot evaluation for certain languages, the performance gain is minimal or absent when k is higher than 0.

computational linguistic, proceedings, training data, (15 more...)

arXiv.org Artificial Intelligence

2410.03357

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Online Dynamic Programming

Holakou Rahmanian, Manfred K. K. Warmuth

Neural Information Processing Systems, Oct-3-2024, 17:11:09 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, dynamic programming problem, polytope, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Russia (0.04)
(2 more...)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.55)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.46)

IndicSentEval: How Effectively do Multilingual Transformer Models encode Linguistic Properties for Indic Languages?

Aravapalli, Akhilesh, Marreddy, Mounika, Oota, Subba Reddy, Mamidi, Radhika, Gupta, Manish

arXiv.org Artificial Intelligence, Oct-3-2024

Transformer-based models have revolutionized the field of natural language processing. To understand why they perform so well and to assess their reliability, several studies have focused on questions such as: Which linguistic properties are encoded by these models, and to what extent? How robust are these models in encoding linguistic properties when faced with perturbations in the input text? However, these studies have mainly focused on BERT and the English language. In this paper, we investigate similar questions regarding encoding capability and robustness for 8 linguistic properties across 13 different perturbations in 6 Indic languages, using 9 multilingual Transformer models (7 universal and 2 Indic-specific). To conduct this study, we introduce a novel multilingual benchmark dataset, IndicSentEval, containing approximately 47K sentences. Surprisingly, our probing analysis of surface, syntactic, and semantic properties reveals that while almost all multilingual models demonstrate consistent encoding performance for English, they show mixed results for Indic languages. As expected, Indic-specific multilingual models capture linguistic properties in Indic languages better than universal models. Intriguingly, universal models broadly exhibit better robustness compared to Indic-specific models, particularly under perturbations such as dropping both nouns and verbs, dropping only verbs, or keeping only nouns. Overall, this study provides valuable insights into probing and perturbation-specific strengths and weaknesses of popular multilingual Transformer-based models for different Indic languages. We make our code and dataset publicly available [https://tinyurl.com/IndicSentEval].

indic language, multilingual model, perturbation, (16 more...)

arXiv.org Artificial Intelligence

2410.02611

Country:

Asia > India > Telangana > Hyderabad (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)