AITopics | Grammars & Parsing

Grammars & Parsing

A Preliminary Study for Literary Rhyme Generation based on Neuronal Representation, Semantics and Shallow Parsing

Moreno-Jiménez, Luis-Gil, Torres-Moreno, Juan-Manuel, Wedemann, Roseli S.

arXiv.org Artificial Intelligence, Dec-25-2021

For many years, research in Artificial Intelligence (AI) has directed efforts towards automating processes to perform specific academic, industrial or economic tasks for society. However, the investigation and development of procedures for the automation of human artistic and creative processes has not had as much attention due to the complexities involved in these activities. Procedures developed for these purposes involve mathematical-computational methods designed to process and learn from a large quantity of digital data, so as to detect patterns in order to simulate the creative process (CP), as explained by Boden in [3]. In this paper, we introduce a model for the generation of rhymes with literary components. Our proposal is based on findings detailed in [11], where Automatic Text Generation (ATG) techniques are combined with neural network (NN) based models, such as the Word2vec algorithm [9], for the generation of literary texts.

megalite-es corpus, relation, rhyme, (13 more...)

arXiv.org Artificial Intelligence

2112.13241

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > Mexico (0.05)
Europe > France (0.05)
(11 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.66)

Machine Learning and the Challenge of Predicting Fake News

#artificialintelligence, Dec-19-2021, 04:26:08 GMT

Many Natural Language Processing (NLP) techniques exist for detecting "fake news". Multi-stage algorithms using decision trees, gradient boosting, and other methods have been applied by various researchers and organizations, with varying results. One study from researchers at Rensselaer Polytechnic Institute reported 83% accuracy in predicting whether a news article is from a reliable or unreliable source [1], while Facebook's 2019 attempt at developing an algorithm failed badly, with some users experiencing a "maelstrom" of fake news [2]. A new study, published in the November 2021 issue of the Journal of Emerging Technologies and Innovative Research [3], analyzes the efficacy of a wide range of AI models and finds that they generally perform poorly, with accuracy ranging from 60% to 77%. Separating fake news from real news is a challenge even for the most sophisticated AI. Simple content-based programs and shallow part-of-speech (POS) tagging fail to consider contextual information and cannot accurately classify news stories as real or fake unless combined with more sophisticated algorithms.

accuracy, algorithm, machine learning, (10 more...)

#artificialintelligence

Country: Asia > India (0.05)

Genre: Research Report > New Finding (0.53)

Industry: Media > News (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.36)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.35)

ALP: Data Augmentation using Lexicalized PCFGs for Few-Shot Text Classification

Kim, Hazel, Woo, Daecheol, Oh, Seong Joon, Cha, Jeong-Won, Han, Yo-Sub

arXiv.org Artificial Intelligence, Dec-16-2021

Data augmentation has been an important ingredient for boosting performances of learned models. Prior data augmentation methods for few-shot text classification have led to great performance boosts. However, they have not been designed to capture the intricate compositional structure of natural language. As a result, they fail to generate samples with plausible and diverse sentence structures. Motivated by this, we present the data Augmentation using Lexicalized Probabilistic context-free grammars (ALP) that generates augmented samples with diverse syntactic structures with plausible grammar. The lexicalized PCFG parse trees consider both the constituents and dependencies to produce a syntactic frame that maximizes a variety of word choices in a syntactically preservable manner without specific domain experts. Experiments on few-shot text classification tasks demonstrate that ALP enhances many state-of-the-art classification methods. As a second contribution, we delve into the train-val splitting methodologies when a data augmentation method comes into play. We argue empirically that the traditional splitting of training and validation sets is sub-optimal compared to our novel augmentation-based splitting strategies that further expand the training split with the same number of labeled data. Taken together, our contributions on the data augmentation strategies yield a strong training recipe for few-shot text classification tasks.
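
As a rough illustration of how a probabilistic context-free grammar can yield sentences with varied but grammatical structure, the sketch below samples from a toy grammar. It is a plain PCFG, not a lexicalized one, and it is not the ALP method; the grammar, its probabilities, and the `sample` helper are all hypothetical.

```python
import random

# Hypothetical toy grammar: nonterminal -> list of (right-hand side, probability).
GRAMMAR = {
    "S": [(("NP", "VP"), 1.0)],
    "NP": [(("the", "N"), 0.7), (("a", "N"), 0.3)],
    "VP": [(("V", "NP"), 0.6), (("V",), 0.4)],
    "N": [(("movie",), 0.5), (("critic",), 0.5)],
    "V": [(("praised",), 0.5), (("slept",), 0.5)],
}

def sample(symbol, rng):
    """Expand a symbol top-down, choosing each rule by its probability."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word: emit as-is
    rules = GRAMMAR[symbol]
    rhs = rng.choices([r for r, _ in rules], weights=[p for _, p in rules])[0]
    words = []
    for child in rhs:
        words.extend(sample(child, rng))
    return words

rng = random.Random(0)  # fixed seed so runs are repeatable
sentences = [" ".join(sample("S", rng)) for _ in range(3)]
print(sentences)
```

Every sampled sentence follows the same grammar, so the structure stays plausible while the word choices and the verb phrase shape vary.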

augmentation, augmentation method, data augmentation, (13 more...)

arXiv.org Artificial Intelligence

2112.11916

Country:

Asia > South Korea > Gyeongsangnam-do > Changwon (0.04)
North America > United States > New York (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.50)

Industry:

Education (0.93)
Media > Film (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Two-view Graph Neural Networks for Knowledge Graph Completion

Tong, Vinh, Nguyen, Dai Quoc, Phung, Dinh, Nguyen, Dat Quoc

arXiv.org Artificial Intelligence, Dec-16-2021

A knowledge graph (KG) is a network of entity nodes and relationship edges, which can be represented as a collection of triples in the form of (h, r, t), wherein each triple (h, r, t) represents a relation r between a head entity h and a tail entity t. Here, entities are real-world things or objects such as music tracks, movies, persons, organizations, places and the like, while each relation type determines a certain relationship between entities. KGs are used in a number of commercial applications, e.g. in such search engines as Google, Microsoft's Bing and Facebook's Graph search. They also are useful resources for many natural language processing tasks such as co-reference resolution ([1], [2]), semantic parsing ([3], [4]) and question answering ([5], [6]). However, an issue is that KGs are often incomplete, i.e., missing a lot of valid triples [7].

To this end, we propose a new KG embedding model, named WGE, to leverage GNNs to capture entity-focused graph structure and relation-focused graph structure for KG completion. In particular, WGE transforms a given KG into two views. The first view, a single undirected entity-focused graph, only includes entities as nodes to provide the entity neighborhood information. The second view, a single undirected relation-focused graph, considers both entities and relations as nodes, constructed from constraints (subjective relation, predicate entity, objective relation), to attain the potential dependence between two neighborhood relations. Then WGE introduces a new encoder module of adopting two vanilla GNNs directly on these two graph views to better update entity and relation embeddings, followed by the decoder module using a weighted score function. In summary, our contributions are as follows:
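
To make the (h, r, t) triple representation and the entity-focused view concrete, here is a minimal sketch that builds an undirected, entity-only adjacency map from a handful of triples. The triples and the `entity_view` helper are hypothetical and only illustrate the data structure; this is not the WGE implementation, and the relation-focused view is not attempted here.

```python
from collections import defaultdict

# Hypothetical toy triples (h, r, t); not taken from the paper.
triples = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("acme", "located_in", "paris"),
]

def entity_view(triples):
    """Build an undirected, entity-only adjacency map from (h, r, t) triples."""
    neighbors = defaultdict(set)
    for h, _r, t in triples:
        # The relation label is dropped: only entity neighborhoods are kept.
        neighbors[h].add(t)
        neighbors[t].add(h)
    return dict(neighbors)

graph = entity_view(triples)
print(sorted(graph["acme"]))  # ['alice', 'bob', 'paris']
```

A missing triple such as ("alice", "lives_in", "paris") is exactly the kind of fact that KG completion tries to predict from neighborhoods like these.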

graph, relation, vector representation, (15 more...)

arXiv.org Artificial Intelligence

2112.09231

Country:

Oceania > Australia (0.04)
North America > United States > New York (0.04)
Asia > India > West Bengal > Kolkata (0.04)
Asia > Vietnam (0.04)

Genre: Research Report (0.50)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Pay More Attention to History: A Context Modeling Strategy for Conversational Text-to-SQL

Li, Yuntao, Zhang, Hanchu, Li, Yutian, Wang, Sirui, Wu, Wei, Zhang, Yan

arXiv.org Artificial Intelligence, Dec-16-2021

Conversational text-to-SQL aims at converting multi-turn natural language queries into their corresponding SQL representations. One of the most intractable problems of conversational text-to-SQL is modeling the semantics of multi-turn queries and gathering the information required for the current query. This paper shows that explicitly modeling the semantic changes introduced by each added turn, together with a summarization of the whole context, can bring better performance on converting conversational queries into SQL. In particular, we propose two conversational modeling tasks, at turn grain and at conversation grain. These two tasks simply work as auxiliary training tasks to help with multi-turn conversational semantic parsing. We conducted empirical studies and achieved new state-of-the-art results on a large-scale open-domain conversational text-to-SQL dataset. The results demonstrate that the proposed mechanism significantly improves the performance of multi-turn semantic parsing.

arxiv preprint arxiv, database schema, query, (13 more...)

arXiv.org Artificial Intelligence

2112.08735

Country: North America (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

The Rediscovery Hypothesis: Language Models Need to Meet Linguistics

Journal of Artificial Intelligence Research, Dec-15-2021

There is an ongoing debate in the NLP community whether modern language models contain linguistic knowledge, recovered through so-called probes. In this paper, we study whether linguistic knowledge is a necessary condition for the good performance of modern language models, which we call the rediscovery hypothesis. In the first place, we show that language models that are significantly compressed but perform well on their pretraining objectives retain good scores when probed for linguistic structures. This result supports the rediscovery hypothesis and leads to the second contribution of our paper: an information-theoretic framework that relates language modeling objectives with linguistic information. This framework also provides a metric to measure the impact of linguistic information on the word prediction task. We reinforce our analytical results with various experiments, both on synthetic and on real NLP tasks in English.

computational linguistic, information, representation, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.12788

AI Access Foundation

12788

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
(24 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Interscript: A dataset for interactive learning of scripts through error feedback

Tandon, Niket, Madaan, Aman, Clark, Peter, Sakaguchi, Keisuke, Yang, Yiming

arXiv.org Artificial Intelligence, Dec-15-2021

How can an end-user provide feedback if a deployed structured prediction model generates inconsistent output, ignoring the structural complexity of human language? This is an emerging topic with recent progress in synthetic or constrained settings, and the next big leap would require testing and tuning models in real-world settings. We present a new dataset, Interscript, containing user feedback on a deployed model that generates complex everyday tasks. Interscript contains 8,466 data points: the input is a possibly erroneous script together with user feedback, and the output is a modified script. We posit two use-cases of Interscript that might significantly advance the state-of-the-art in interactive learning. The dataset is available at: https://github.com/allenai/interscript.

computational linguistic, dataset, interactive learning, (13 more...)

arXiv.org Artificial Intelligence

2112.07867

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Education > Educational Setting > Online (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.47)

Maximum Bayes Smatch Ensemble Distillation for AMR Parsing

Lee, Young-Suk, Astudillo, Ramon Fernandez, Hoang, Thanh Lam, Naseem, Tahira, Florian, Radu, Roukos, Salim

arXiv.org Artificial Intelligence, Dec-14-2021

AMR parsing has experienced an unprecedented increase in performance in the last three years, due to a mixture of effects including architecture improvements and transfer learning. Self-learning techniques have also played a role in pushing performance forward. However, for the most recent high-performing parsers, the effect of self-learning and silver data generation seems to be fading. In this paper we show that it is possible to overcome these diminishing returns of silver data by combining Smatch-based ensembling techniques with ensemble distillation. In an extensive experimental setup, we push single-model English parser performance above 85 Smatch for the first time and return to substantial gains. We also attain a new state-of-the-art for cross-lingual AMR parsing for Chinese, German, Italian and Spanish. Finally, we explore the impact of the proposed distillation technique on domain adaptation, and show that it can produce gains rivaling those of human-annotated data for QALD-9 and achieve a new state-of-the-art for BioAMR.

distillation, parser, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2112.0779

Country:

Europe > Bulgaria > Sofia City Province > Sofia (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Text to SQL Queries

#artificialintelligence, Dec-8-2021, 02:08:35 GMT

WikiSQL is one of the most popular benchmarks in semantic parsing. It is a supervised text-to-SQL dataset, hand-annotated by workers on Amazon Mechanical Turk. Some of the early works on WikiSQL modeled the task as a sequence generation problem using seq2seq, but we are moving away from that approach. The text has to be cleaned before it is passed to the model: contractions are expanded, stop words are removed, and non-alphanumeric characters are stripped from the corpus. Because the dataset consists of SQL queries and table headers, we featurize the text using a tokenizer from the nltk library and then concatenate the query and headers.
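
The cleaning and concatenation steps described above might look roughly like the following sketch. To stay self-contained it uses a simple regex and whitespace split in place of the nltk tokenizer; the contraction list, stop-word list, `[SEP]` separator, and function names are illustrative assumptions rather than the article's actual code.

```python
import re

# Small illustrative lists; a real pipeline would use much larger ones.
CONTRACTIONS = {"what's": "what is", "don't": "do not", "isn't": "is not"}
STOP_WORDS = {"the", "a", "an", "of", "is", "do"}

def clean(text):
    """Lowercase, expand contractions, strip non-alphanumerics, drop stop words."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]

def featurize(question, headers, sep="[SEP]"):
    """Concatenate the cleaned question tokens with each cleaned table header."""
    tokens = clean(question)
    for header in headers:
        tokens.append(sep)
        tokens.extend(clean(header))
    return tokens

print(featurize("What's the capital of France?", ["Country", "Capital"]))
# ['what', 'capital', 'france', '[SEP]', 'country', '[SEP]', 'capital']
```

Which stop words are safe to remove is a design choice: words such as "not" or "most" can change the meaning of the target SQL query, so a short, conservative list is used here.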

query, sequence, sql query, (13 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.57)
Information Technology > Artificial Intelligence > Machine Learning (0.35)

Natural Answer Generation: From Factoid Answer to Full-length Answer using Grammar Correction

Jain, Manas, Saha, Sriparna, Bhattacharyya, Pushpak, Chinnadurai, Gladvin, Vatsa, Manish Kumar

arXiv.org Artificial Intelligence, Dec-7-2021

Question Answering systems these days typically use template-based language generation. Though adequate for a domain-specific task, these systems are too restrictive and predefined for domain-independent systems. This paper proposes a system that outputs a full-length answer given a question and the extracted factoid answer (short spans such as named entities) as the input. Our system uses constituency and dependency parse trees of questions. A transformer-based Grammar Error Correction model, GECToR (2020), is used as a post-processing step for better fluency. We compare our system with (i) Modified Pointer Generator (SOTA) and (ii) Fine-tuned DialoGPT for factoid questions. We also test our approach on existential (yes-no) questions with better results. Our model generates more accurate and fluent answers than the state-of-the-art (SOTA) approaches. The evaluation is done on the NewsQA and SQuAD datasets, with an increment of 0.4 and 0.9 percentage points in ROUGE-1 score respectively. Inference time is also reduced by 85% compared to the SOTA. The improved datasets used for our evaluation will be released as part of the research contribution.

computational linguistic, dataset, factoid answer, (14 more...)

arXiv.org Artificial Intelligence

2112.03849

Country:

North America > United States > Arizona > Maricopa County > Phoenix (0.05)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(9 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
