AITopics | Grammars & Parsing

Grammars & Parsing

Imitation of Life: A Search Engine for Biologically Inspired Design

Emuna, Hen, Borenstein, Nadav, Qian, Xin, Kang, Hyeonsu, Chan, Joel, Kittur, Aniket, Shahaf, Dafna

arXiv.org Artificial Intelligence, Dec-19-2023

Biologically Inspired Design (BID), or Biomimicry, is a problem-solving methodology that applies analogies from nature to solve engineering challenges. For example, Speedo engineers designed swimsuits based on shark skin. Finding relevant biological solutions for real-world problems poses significant challenges, both due to the limited biological knowledge engineers and designers typically possess and to the limited BID resources. Existing BID datasets are hand-curated and small, and scaling them up requires costly human annotations. In this paper, we introduce BARcode (Biological Analogy Retriever), a search engine for automatically mining bio-inspirations from the web at scale. Using advances in natural language understanding and data programming, BARcode identifies potential inspirations for engineering challenges. Our experiments demonstrate that BARcode can retrieve inspirations that are valuable to engineers and designers tackling real-world problems, as well as recover famous historical BID examples. We release data and code; we view BARcode as a step towards addressing the challenges that have historically hindered the practical application of BID to engineering innovation.

bar code, query, right attr, (16 more...)

arXiv:2312.12681

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New York > New York County > New York City (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.70)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.61)
(2 more...)

Dynamic Syntax Mapping: A New Approach to Unsupervised Syntax Parsing

Gohsh, Buvarp, Ali, Woods, Michael, Anders

arXiv.org Artificial Intelligence, Dec-18-2023

The advancement of large pretrained language models (LLMs) such as BERT [1] has been a catalyst in enhancing performance across a spectrum of natural language processing (NLP) tasks. This progress has spurred investigations into the mechanics of natural language comprehension within these models, particularly concerning their linguistic capabilities. Building on this foundation, our study tests the hypothesis that LLMs inherently contain syntactic dependencies that can be extracted without the need for additional parameters or external guidance. Existing literature in the realm of syntax has delved into two key areas: (1) the examination of syntactically dependent behaviors in language models, such as their ability to handle complex subject-verb agreement scenarios [2, 3, 4, 5, 6, 7, 8], and (2) the feasibility of retrieving syntactic structures from internal representations or mechanisms within the models [11, 12, 13]. The former approach, while agnostic to specific syntactic theories, lacks the clarity offered by the explicit induction of syntactic structures, which is the focus of our work. In an exemplary study, Hewitt and Manning [11] developed a probe to map word representations into a new vector space, facilitating the application of a maximum spanning tree algorithm (MST) for syntactic parsing.
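
As a rough illustration of the decoding step mentioned at the end of the abstract, the sketch below builds a maximum spanning tree over a symmetric matrix of pairwise word scores using Prim's algorithm. The score matrix, the function name `maximum_spanning_tree`, and the toy values are assumptions made for illustration; this is not the probe of Hewitt and Manning [11] nor the method of this paper.

```python
from typing import List, Tuple


def maximum_spanning_tree(scores: List[List[float]]) -> List[Tuple[int, int]]:
    """Return undirected edges (i, j) of a maximum spanning tree (Prim's algorithm).

    scores[i][j] is a hypothetical symmetric affinity between words i and j.
    """
    n = len(scores)
    if n == 0:
        return []
    in_tree = {0}          # start growing the tree from word 0
    edges = []
    while len(in_tree) < n:
        best = None
        # Pick the highest-scoring edge that leaves the current tree.
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                if best is None or scores[i][j] > best[0]:
                    best = (scores[i][j], i, j)
        _, i, j = best
        edges.append((i, j))
        in_tree.add(j)
    return edges


# Toy example: four words with made-up pairwise affinities.
words = ["the", "dog", "chased", "cats"]
toy_scores = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.8, 0.3],
    [0.2, 0.8, 0.0, 0.7],
    [0.1, 0.3, 0.7, 0.0],
]
tree = maximum_spanning_tree(toy_scores)
```

The resulting tree is undirected; choosing a root and orienting the edges away from it would be an additional step that this sketch leaves out.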

computational linguistic, dynamic syntax mapping, proceedings, (14 more...)

arXiv:2312.14966

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(9 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning

Zhao, Wenting, Liu, Ye, Wan, Yao, Wang, Yibo, Wu, Qingyang, Deng, Zhongfen, Du, Jiangshu, Liu, Shuaiqi, Xu, Yunlong, Yu, Philip S.

arXiv.org Artificial Intelligence, Dec-17-2023

Task-Oriented Parsing (TOP) enables conversational assistants to interpret user commands expressed in natural language, transforming them into structured outputs that combine elements of both natural language and intent/slot tags. Recently, Large Language Models (LLMs) have achieved impressive performance in synthesizing computer programs based on a natural language prompt, mitigating the gap between natural language and structured programs. Our paper focuses on harnessing the capabilities of LLMs for semantic parsing tasks, addressing the following three key research questions: 1) How can LLMs be effectively utilized for semantic parsing tasks? 2) What defines an effective prompt? and 3) How can LLMs overcome the length constraint and streamline prompt design by including all examples as prompts? We introduce k Nearest Neighbor In-Context Learning (kNN-ICL), which simplifies prompt engineering by allowing it to be built on top of any design strategy while providing access to all demo examples. Extensive experiments show that: 1) Simple ICL without kNN search can achieve performance comparable to strong supervised models on the TOP tasks, and 2) kNN-ICL significantly improves the comprehension of complex requests by seamlessly integrating ICL with a nearest-neighbor approach. Notably, this enhancement is achieved without the need for additional data or specialized prompts.
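
The general idea of pairing in-context learning with nearest-neighbor retrieval can be sketched as follows: for each query, pick the k most similar demonstration examples and place them in the prompt. The sketch uses bag-of-words cosine similarity as a stand-in for whatever similarity measure the paper uses; the demo pool, the bracketed parse strings, the prompt template, and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_demos(query: str, demos: List[Tuple[str, str]], k: int) -> List[Tuple[str, str]]:
    """Return the k demos whose utterance is most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(demos, key=lambda d: cosine(q, Counter(d[0].lower().split())), reverse=True)
    return ranked[:k]


def build_prompt(query: str, demos: List[Tuple[str, str]], k: int = 2) -> str:
    """Assemble an in-context prompt from the k nearest demos plus the query."""
    lines = []
    for utterance, parse in nearest_demos(query, demos, k):
        lines.append(f"Utterance: {utterance}\nParse: {parse}\n")
    lines.append(f"Utterance: {query}\nParse:")
    return "\n".join(lines)


# Hypothetical demo pool of (utterance, structured parse) pairs.
demo_pool = [
    ("set an alarm for 7 am", "[IN:CREATE_ALARM [SL:DATE_TIME for 7 am ] ]"),
    ("play some jazz music", "[IN:PLAY_MUSIC [SL:MUSIC_GENRE jazz ] ]"),
    ("set a timer for ten minutes", "[IN:CREATE_TIMER [SL:DATE_TIME for ten minutes ] ]"),
]
prompt = build_prompt("set an alarm for 6 am", demo_pool, k=2)
```

Because only the k retrieved demos enter the prompt, the full demo pool can be arbitrarily large without exceeding the model's context length, which is the constraint raised in the third research question.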

arxiv preprint arxiv, language model, llm, (14 more...)

arXiv:2312.10771

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York > Broome County > Binghamton (0.04)
(9 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

From Dialogue to Diagram: Task and Relationship Extraction from Natural Language for Accelerated Business Process Prototyping

Qayyum, Sara, Asghar, Muhammad Moiz, Yaseen, Muhammad Fouzan

arXiv.org Artificial Intelligence, Dec-16-2023

The automatic transformation of verbose, natural language descriptions into structured process models remains a challenge of significant complexity. This paper introduces a contemporary solution; central to our approach is the use of dependency parsing and Named Entity Recognition (NER) for extracting key elements from textual descriptions. Additionally, we utilize Subject-Verb-Object (SVO) constructs for identifying action relationships and integrate semantic analysis tools, including WordNet, for enriched contextual understanding. A novel aspect of our system is the application of neural coreference resolution, integrated with the spaCy framework, enhancing the precision of entity linkage and anaphoric references. Furthermore, the system adeptly handles data transformation and visualization, converting extracted information into BPMN (Business Process Model and Notation) diagrams. This methodology not only streamlines the process of capturing and representing business workflows but also significantly reduces the manual effort and potential for error inherent in traditional modeling approaches.
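
To make the Subject-Verb-Object step concrete, here is a minimal sketch that reads SVO triples off a dependency parse. To stay self-contained it works on a hand-written parse rather than calling a parser; the `Token` class, the `extract_svo` function, and the dependency labels in the example are assumptions for illustration, not the authors' pipeline.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    text: str
    dep: str   # dependency label, e.g. "nsubj", "dobj", "ROOT"
    head: int  # index of the head token in the sentence


def extract_svo(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Collect (subject, verb, object) triples that share the same head verb."""
    triples = []
    for subj in tokens:
        if subj.dep != "nsubj":
            continue
        verb_idx = subj.head
        for obj in tokens:
            # Accept both common label names for a direct object.
            if obj.head == verb_idx and obj.dep in ("dobj", "obj"):
                triples.append((subj.text, tokens[verb_idx].text, obj.text))
    return triples


# Hand-written parse of "The clerk approves the invoice"
# (labels are assumed here, not produced by a parser).
sentence = [
    Token("The", "det", 1),
    Token("clerk", "nsubj", 2),
    Token("approves", "ROOT", 2),
    Token("the", "det", 4),
    Token("invoice", "dobj", 2),
]
triples = extract_svo(sentence)
```

Each triple could then be mapped to a task node (the verb and object) performed by an actor (the subject) in a process diagram; that mapping is not shown here.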

business process, modeling, process description, (14 more...)

arXiv:2312.10432

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Review of Unsupervised POS Tagging and Its Implications on Language Acquisition

Dickson, Niels

arXiv.org Artificial Intelligence, Dec-15-2023

An ability that underlies human syntactic knowledge is determining which words can appear in similar structures (i.e., grouping words by their syntactic categories). These groupings enable humans to combine structures in order to communicate complex meanings. A foundational question is how children acquire this ability underlying syntactic knowledge. In exploring this process, we will review various engineering approaches whose goal is similar to that of a child: without prior syntactic knowledge, correctly identify the parts of speech (POS) of the words in a sample of text. In reviewing these unsupervised tagging efforts, we will discuss common themes that support the advances in the models and their relevance for language acquisition. For example, we discuss how each model judges success (evaluation metrics), the "additional information" that constrains the POS learning (such as orthographic information), and the context used to determine POS (only previous word, words before and after the target, etc.). The identified themes pave the way for future investigations into the cognitive processes that underpin the acquisition of syntactic categories and provide a useful layout of current state-of-the-art unsupervised POS tagging models.
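
The notion of "context used to determine POS" can be illustrated with a small sketch that records, for every word type, which words occur immediately before and after it. Words with overlapping context profiles are candidates for the same syntactic category. The function name, the boundary markers `<s>` and `</s>`, and the toy corpus are assumptions; this does not reproduce any specific model covered by the review.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def context_profiles(sentences: List[List[str]]) -> Dict[str, Counter]:
    """Map each word type to counts of its left and right neighbours."""
    profiles: Dict[str, Counter] = defaultdict(Counter)
    for sent in sentences:
        # Pad with boundary markers so the first and last words also have two neighbours.
        padded = ["<s>"] + sent + ["</s>"]
        for i in range(1, len(padded) - 1):
            profiles[padded[i]][("prev", padded[i - 1])] += 1
            profiles[padded[i]][("next", padded[i + 1])] += 1
    return dict(profiles)


corpus = [
    ["the", "dog", "runs"],
    ["the", "cat", "runs"],
    ["a", "dog", "sleeps"],
]
profiles = context_profiles(corpus)
# Context features that "dog" and "cat" have in common.
shared = set(profiles["dog"]) & set(profiles["cat"])
```

In this toy corpus "dog" and "cat" share both a left and a right context, which is the kind of distributional evidence a clustering step could use to group them together.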

category, information, syntactic category, (15 more...)

arXiv:2312.10169

Country:

North America > United States > Massachusetts > Middlesex County > Somerville (0.04)
North America > United States > New Jersey > Bergen County > Mahwah (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
(3 more...)

Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective

Liu, Tianyu, Amini, Afra, Sachan, Mrinmaya, Cotterell, Ryan

arXiv.org Artificial Intelligence, Dec-12-2023

Tasks that model the relation between pairs of tokens in a string are a vital part of understanding natural language. Such tasks, in general, require exhaustive pair-wise comparisons of tokens, thus having a quadratic runtime complexity in the length of the string. We show that these exhaustive comparisons can be avoided, and, moreover, the complexity of such tasks can be reduced to linear by casting the relation between tokens as a partial order over the string. Our method predicts real numbers for each token in a string in parallel and sorts the tokens accordingly, resulting in total orders of the tokens in the string. Each total order implies a set of arcs oriented from smaller to greater tokens, sorted by their predicted numbers. The intersection of total orders results in a partial order over the set of tokens in the string, which is then decoded into a directed graph representing the desired linguistic structure. Our experiments on dependency parsing and coreference resolution show that our method achieves state-of-the-art or comparable performance. Moreover, the linear complexity and parallelism of our method double the speed of graph-based coreference resolution models, and bring a 10-times speed-up over graph-based dependency parsers.
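
The relationship between total orders and the resulting partial order can be sketched in a few lines: each vector of per-token scores induces a total order, and an arc from token i to token j survives only if i precedes j in every order. The function name and the score values are made up for illustration. Note that this sketch enumerates all token pairs, so it is quadratic and only shows which arcs the intersection contains; it does not reproduce the linear-time procedure the paper describes.

```python
from typing import List, Set, Tuple


def partial_order_arcs(score_sets: List[List[float]]) -> Set[Tuple[int, int]]:
    """Arcs (i, j) such that token i precedes token j in every total order."""
    n = len(score_sets[0])
    arcs = set()
    for i in range(n):
        for j in range(n):
            # Keep the arc only if every score vector ranks i strictly below j.
            if i != j and all(s[i] < s[j] for s in score_sets):
                arcs.add((i, j))
    return arcs


# Two made-up score vectors over three tokens; each induces a total order by sorting.
order_a = [0.1, 0.5, 0.9]   # token 0 < token 1 < token 2
order_b = [0.2, 0.8, 0.4]   # token 0 < token 2 < token 1
arcs = partial_order_arcs([order_a, order_b])
```

Tokens 1 and 2 are ordered differently by the two vectors, so no arc connects them, while token 0 precedes both in each order.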

computational linguistic, dependency, linguistic, (16 more...)

arXiv:2305.15057

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(24 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Proceedings of the 2023 XCSP3 Competition

Audemard, Gilles, Lecoutre, Christophe, Lonca, Emmanuel

arXiv.org Artificial Intelligence, Dec-10-2023

This short paper gives an overview of the XCSP3 solver implemented in Picat. Picat provides several constraint modules, and the Picat XCSP3 solver uses the sat module. The XCSP3 solver mainly consists of a parser implemented in Picat, which converts constraints from XCSP3 format to Picat. The solver demonstrates the strengths of Picat, a logic-based language, in parsing, modeling, and encoding constraints into SAT. The solver submitted to the 2022 XCSP competition is based on the one that won the 2019 XCSP competition.

competition, constraint, vararray, (16 more...)

arXiv:2312.05877

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(30 more...)

Genre:

Research Report (1.00)
Overview (0.74)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
(4 more...)

Proceedings of the 2022 XCSP3 Competition

Audemard, Gilles, Lecoutre, Christophe, Lonca, Emmanuel

arXiv.org Artificial Intelligence, Dec-10-2023

competition, constraint, vararray, (16 more...)

arXiv:2209.00917

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
(28 more...)

Genre: Research Report (1.00)

Industry: Government > Military (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
(5 more...)

JADE: A Linguistics-based Safety Evaluation Platform for Large Language Models

Zhang, Mi, Pan, Xudong, Yang, Min

arXiv.org Artificial Intelligence, Dec-10-2023

In this paper, we present JADE, a targeted linguistic fuzzing platform which strengthens the linguistic complexity of seed questions to simultaneously and consistently break a wide range of widely-used LLMs categorized in three groups: eight open-sourced Chinese, six commercial Chinese and four commercial English LLMs. JADE generates three safety benchmarks for the three groups of LLMs, which contain unsafe questions that are highly threatening: the questions simultaneously trigger harmful generation of multiple LLMs, with an average unsafe generation ratio of 70% (please see the table below), while remaining natural, fluent questions that preserve the core unsafe semantics. We release the benchmark demos generated for commercial English LLMs and open-sourced English LLMs in the following link: https://github.com/whitzard-ai/jade-db. For readers who are interested in evaluating on more questions generated by JADE, please contact us. JADE is based on Noam Chomsky's seminal theory of transformational-generative grammar. Given a seed question with unsafe intention, JADE invokes a sequence of generative and transformational rules to increment the complexity of the syntactic structure of the original question, until the safety guardrail is broken. Our key insight is that, due to the complexity of human language, most of the current best LLMs can hardly recognize the invariant evil from the infinite number of different syntactic structures, which form an unbounded example space that can never be fully covered. Technically, the generative/transformative rules are constructed by native speakers of the languages, and, once developed, can be used to automatically grow and transform the parse tree of a given question, until the guardrail is broken. For more evaluation results and demo, please check our website: https://whitzard-ai.github.io/jade.html.

jade, linguistics-based safety evaluation platform, llm, (14 more...)

arXiv:2311.00286

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Augmenty: A Python Library for Structured Text Augmentation

Enevoldsen, Kenneth

arXiv.org Artificial Intelligence, Dec-9-2023

Text augmentation is a useful tool for training (Wei and Zou 2019) and evaluating (Ribeiro et al. 2020) natural language processing models and systems. Despite its utility, existing libraries for text augmentation often exhibit limitations in terms of functionality and flexibility, being confined to basic tasks such as text classification or catering to specific downstream use cases such as estimating robustness (Goel et al. 2021). Recognizing these constraints, Augmenty is a tool for structured text augmentation of the text along with its annotations. Augmenty integrates seamlessly with the popular NLP library spaCy (Honnibal et al. 2020) and seeks to be compatible with all models and tasks supported by spaCy. Augmenty provides a wide range of augmenters which can be combined in a flexible manner to create complex augmentation pipelines. It also includes a set of primitives that can be used to create custom augmenters such as word replacement augmenters. This functionality allows for augmentations within a range of applications such as named entity recognition (NER), part-of-speech tagging, and dependency parsing.
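
What "augmenting the text along with its annotations" means can be shown with a small standalone sketch: a word replacement changes the length of the text, so character-level entity offsets have to be recomputed to stay aligned. The code below does not use the Augmenty or spaCy APIs; the function names, the token-index span format, and the example sentence are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

TokenSpan = Tuple[int, int, str]  # (start_token, end_token_exclusive, label)
CharSpan = Tuple[int, int, str]   # (start_char, end_char_exclusive, label)


def replace_words(tokens: List[str], replacements: Dict[str, str]) -> List[str]:
    """Swap tokens one-for-one, so token-level spans remain valid."""
    return [replacements.get(t, t) for t in tokens]


def char_spans(tokens: List[str], spans: List[TokenSpan]) -> List[CharSpan]:
    """Convert token-level spans to character offsets in the space-joined text."""
    starts = []
    pos = 0
    for t in tokens:
        starts.append(pos)
        pos += len(t) + 1  # +1 for the joining space
    return [(starts[s], starts[e - 1] + len(tokens[e - 1]), label) for s, e, label in spans]


tokens = ["Anna", "lives", "in", "Tartu"]
entities = [(0, 1, "PERSON"), (3, 4, "GPE")]

# Replace a name with a longer one; the annotations follow the new text.
new_tokens = replace_words(tokens, {"Anna": "Katrin"})
new_text = " ".join(new_tokens)
new_entities = char_spans(new_tokens, entities)
```

Keeping spans at the token level and deriving character offsets afterwards is one simple way to keep annotations consistent under one-for-one replacements; insertions or deletions of tokens would need the token spans themselves to be shifted.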

annotation, augmentation, augmenty, (10 more...)

arXiv:2312.0552

Country:

North America > United States > New Jersey (0.05)
Europe > Faroe Islands > Streymoy > Tórshavn (0.05)
Europe > Estonia > Tartu County > Tartu (0.05)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.71)
