AITopics | Grammars & Parsing

Collaborating Authors

Grammars & Parsing

News Overviews Instructional Materials AI-Alerts Classics

Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Experimental Setups Matter

Blaschke, Verena, Fedzechkina, Masha, ter Hoeve, Maartje

arXiv.org Artificial IntelligenceJan-24-2025

Cross-lingual transfer is a popular approach to increase the amount of training data for NLP tasks in a low-resource context. However, the best strategy to decide which cross-lingual data to include is unclear. Prior research often focuses on a small set of languages from a few language families and/or a single task. It is still an open question how these findings extend to a wider variety of languages and tasks. In this work, we analyze cross-lingual transfer for 266 languages from a wide variety of language families. Moreover, we include three popular NLP tasks: POS tagging, dependency parsing, and topic classification. Our findings indicate that the effect of linguistic similarity on transfer performance depends on a range of factors: the NLP task, the (mono- or multilingual) input representations, and the definition of linguistic similarity.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.14491

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.14)
(26 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Review for NeurIPS paper: Compositional Generalization via Neural-Symbolic Stack Machines

Neural Information Processing SystemsJan-21-2025, 20:19:07 GMT

The paper proposes a new method for compositional generalization in sequence-to-sequence tasks. The basic idea is to have a symbolic stack machine (capable of compositionally manipulating sequences) that is controlled by a neural network. The method gets perfect accuracy on an existing compositional generalization dataset, a small-scale English-French machine translation task, and a grammar parsing task. Pros: Novel architecture Attractive way of providing inductive bias without hardcoding too much knowledge The paper is well-written Strong experimental results in the domains considered Cons: The paper could do more by the way of providing insights about why the model works. The reviewers appreciated the clarifications provided in the author feedback.

artificial intelligence, natural language, neural-symbolic stack machine, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.67)

Add feedback

Reviews: Learning to Exploit Stability for 3D Scene Parsing

Neural Information Processing SystemsJan-20-2025, 04:01:30 GMT

The goal of this paper is to output a set of 3D bounding boxes and set of dominant planes for a scene depicted in a single image. The key insight is to incorporate stability constraints in the 3D layout, i.e., the reconstructed 3D boxes should not move too far under simulation (in Bullet) with physical forces (gravity, friction). Parameters for 3D boxes are regressed using a modified R-CNN training loss and dominant planes for the walls and floors are regressed via a RNN. A stability criterion is used to update the output 3D scene (via REINFORCE) where the predicted 3D layout is run through Bullet simulator and 3D displacements are checked. Results are shown on synthetic (SUNCG, SceneNet RGB-D) and real (SUN RGB-D) datasets, out-performing the factored 3D approach of [Tulsiani18].

exploit stability, layout, scene parsing, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.53)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.40)

Add feedback

BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing

Neural Information Processing SystemsJan-19-2025, 16:46:47 GMT

Recent work has shown that generation from a prompted or fine-tuned language model can perform well at semantic parsing when the output is constrained to be a valid semantic representation. We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing, that includes context-free grammars for seven semantic parsing datasets and two syntactic parsing datasets with varied output meaning representations, as well as a constrained decoding interface to generate only valid outputs covered by these grammars. We provide low, medium, and high resource splits for each dataset, allowing accurate comparison of various language models under different data regimes. Our benchmark supports evaluation of language models using prompt-based learning as well as fine-tuning. Our experiments show that encoder-decoder pretrained language models can achieve similar performance or even surpass state-of-the-art methods for both syntactic and semantic parsing when the model output is constrained to be valid.

benchclamp, language model, syntactic and semantic parsing, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective

Neural Information Processing SystemsJan-19-2025, 10:51:31 GMT

We focus on the weakly-supervised audio-visual video parsing task (AVVP), which aims to identify and locate all the events in audio/visual modalities. Previous works only concentrate on video-level overall label denoising across modalities, but overlook the segment-level label noise, where adjacent video segments (i.e., 1-second video clips) may contain different events. However, recognizing events on the segment is challenging because its label could be any combination of events that occur in the video. To address this issue, we consider tackling AVVP from the language perspective, since language could freely describe how various events appear in each segment beyond fixed labels. Specifically, we design language prompts to describe all cases of event appearance for each video.

language perspective, language prompt, revisit weakly-supervised audio-visual video parsing, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.65)

Add feedback

Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing

Neural Information Processing SystemsJan-19-2025, 02:51:12 GMT

The audio-visual video parsing task aims to parse a video into modality- and category-aware temporal segments. Previous work mainly focuses on weakly-supervised approaches, which learn from video-level event labels. During training, they do not know which modality perceives and meanwhile which temporal segment contains the video event. Since there is no explicit grouping in the existing frameworks, the modality and temporal uncertainties make these methods suffer from false predictions. For instance, segments in the same category could be predicted in different event classes.

multi-modal grouping network, temporal segment, weakly-supervised audio-visual video parsing, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)

Add feedback

QATCH: Benchmarking SQL-centric tasks with Table Representation Learning Models on Your Data

Neural Information Processing SystemsJan-18-2025, 20:21:19 GMT

Table Representation Learning (TRL) models are commonly pre-trained on large open-domain datasets comprising millions of tables and then used to address downstream tasks. Choosing the right TRL model to use on proprietary data can be challenging, as the best results depend on the content domain, schema, and data quality. Our purpose is to support end-users in testing TRL models on proprietary data in two established SQL-centric tasks, i.e., Question Answering (QA) and Semantic Parsing (SP). We present QATCH (Query-Aided TRL Checklist), a toolbox to highlight TRL models' strengths and weaknesses on relational tables unseen at training time. For an input table, QATCH automatically generates a testing checklist tailored to QA and SP.

benchmarking sql-centric task, table representation learning model, trl model, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.63)

Add feedback

MOMA: Multi-Object Multi-Actor Activity Parsing

Neural Information Processing SystemsJan-17-2025, 17:14:07 GMT

Complex activities often involve multiple humans utilizing different objects to complete actions (e.g., in healthcare settings, physicians, nurses, and patients interact with each other and various medical devices). Recognizing activities poses a challenge that requires a detailed understanding of actors' roles, objects' affordances, and their associated relationships. Furthermore, these purposeful activities are composed of multiple achievable steps, including sub-activities and atomic actions, which jointly define a hierarchy of action parts. This paper introduces Activity Parsing as the overarching task of temporal segmentation and classification of activities, sub-activities, atomic actions, along with an instance-level understanding of actors, objects, and their relationships in videos. Involving multiple entities (actors and objects), we argue that traditional pair-wise relationships, often used in scene or action graphs, do not appropriately represent the dynamics between them.

activity parsing, moma, multi-object multi-actor activity parsing, (3 more...)

Neural Information Processing Systems

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.73)

Add feedback

Natural Language Processing of Privacy Policies: A Survey

Adhikari, Andrick, Das, Sanchari, Dewri, Rinku

arXiv.org Artificial IntelligenceJan-17-2025

Natural Language Processing (NLP) is an essential subset of artificial intelligence. It has become effective in several domains, such as healthcare, finance, and media, to identify perceptions, opinions, and misuse, among others. Privacy is no exception, and initiatives have been taken to address the challenges of usable privacy notifications to users with the help of NLP. To this aid, we conduct a literature review by analyzing 109 papers at the intersection of NLP and privacy policies. First, we provide a brief introduction to privacy policies and discuss various facets of associated problems, which necessitate the application of NLP to elevate the current state of privacy notices and disclosures to users. Subsequently, we a) provide an overview of the implementation and effectiveness of NLP approaches for better privacy policy communication; b) identify the methodologies that can be further enhanced to provide robust privacy policies; and c) identify the gaps in the current state-of-the-art research. Our systematic analysis reveals that several research papers focus on annotating and classifying privacy texts for analysis but need to adequately dwell on other aspects of NLP applications, such as summarization. More specifically, ample research opportunities exist in this domain, covering aspects such as corpus generation, summarization vectors, contextualized word embedding, identification of privacy-relevant statement categories, fine-grained classification, and domain-specific model tuning.

information retrieval, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2501.10319

Country:

North America > United States > Colorado > Denver County > Denver (0.04)
Oceania > Palau (0.04)
North America > United States > Virginia > Fairfax County > Fairfax (0.04)
(6 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(5 more...)

Add feedback

BBPOS: BERT-based Part-of-Speech Tagging for Uzbek

Bobojonova, Latofat, Akhundjanova, Arofat, Ostheimer, Phil, Fellenz, Sophie

arXiv.org Artificial IntelligenceJan-17-2025

This paper advances NLP research for the low-resource Uzbek language by evaluating two previously untested monolingual Uzbek BERT models on the part-of-speech (POS) tagging task and introducing the first publicly available UPOS-tagged benchmark dataset for Uzbek. Our fine-tuned models achieve 91% average accuracy, outperforming the baseline multi-lingual BERT as well as the rule-based tagger. Notably, these models capture intermediate POS changes through affixes and demonstrate context sensitivity, unlike existing rule-based taggers.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.10107

Country:

Europe (0.47)
Asia (0.28)
North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback