AITopics | Grammars & Parsing

Grammars & Parsing

Focused Contrastive Training for Test-based Constituency Analysis

arXiv.org Artificial Intelligence, Sep-30-2021

We propose a scheme for self-training of grammaticality models for constituency analysis based on linguistic tests. A pre-trained language model is fine-tuned by contrastive estimation of grammatical sentences from a corpus and ungrammatical sentences that were perturbed by a syntactic test, a transformation that is motivated by constituency theory. We show that consistent gains can be achieved if only certain positive instances are chosen for training, depending on whether they could be the result of a test transformation. This way, the positives and negatives exhibit similar characteristics, which makes the objective more challenging for the language model and also allows for additional markup that indicates the position of the test application within the sentence.
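The contrastive setup described in this abstract can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the fronting transformation stands in for a generic syntactic test, the bracket tokens stand in for the position markup, and the hinge-style margin loss is a simple substitute for whatever contrastive objective the authors actually use. The names `front_span` and `margin_loss` are hypothetical.

```python
from typing import List


def front_span(tokens: List[str], start: int, end: int) -> List[str]:
    """Move tokens[start:end] to the front and wrap it in marker tokens.

    Assumption: a fronting operation stands in for a syntactic test, and
    the "[" / "]" tokens stand in for markup showing where it was applied.
    """
    if not 0 <= start < end <= len(tokens):
        raise ValueError("span out of range")
    span = tokens[start:end]
    rest = tokens[:start] + tokens[end:]
    return ["[", *span, "]", *rest]


def margin_loss(pos_score: float, neg_score: float, margin: float = 1.0) -> float:
    """Hinge-style loss: zero once the positive outscores the negative by margin.

    Assumption: this is a generic contrastive loss, not the paper's objective.
    """
    return max(0.0, margin - (pos_score - neg_score))


# A corpus sentence serves as the positive; the perturbed copy is the negative.
tokens = "I drink coffee with friends".split()
negative = front_span(tokens, 1, 3)  # front the non-constituent "drink coffee"
```

During fine-tuning, a grammaticality model would score both versions, and a loss of this kind would push the score of the corpus sentence above that of the perturbed one.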

computational linguistic, grammaticality model, transformation, (14 more...)

arXiv.org Artificial Intelligence

2109.15159

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Austria > Vienna (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
(4 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.47)

Visually Grounded Concept Composition

Zhang, Bowen, Hu, Hexiang, Qiu, Linlu, Shaw, Peter, Sha, Fei

arXiv.org Artificial Intelligence, Sep-28-2021

We investigate ways to compose complex concepts in texts from primitive ones while grounding them in images. We propose Concept and Relation Graph (CRG), which builds on top of constituency analysis and consists of recursively combined concepts with predicate functions. Meanwhile, we propose a concept composition neural network called Composer to leverage the CRG for visually grounded concept learning. Specifically, we learn the grounding of both primitive and all composed concepts by aligning them to images and show that learning to compose leads to more robust grounding results, measured in text-to-image matching accuracy. Notably, our model can model grounded concepts forming at both the finer-grained sentence level and the coarser-grained intermediate level (or word-level). Composer leads to pronounced improvement in matching accuracy when the evaluation data has significant compound divergence from the training data.

composer, predicate, transformer, (16 more...)

arXiv.org Artificial Intelligence

2109.14115

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Sports (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Importance of Resume Parsing in Candidate Screening Stage of Hiring

#artificialintelligence, Sep-26-2021, 07:05:06 GMT

Resume parsing is a tool that analyzes a CV or a resume document and converts it into structured information for reporting, storing, analyzing, and screening. For a long time, resumes have been screened and shortlisted manually. Recruiters had to look through each resume separately and screen it based on skills, experience, education, etc. This process of shortlisting candidates took an immense amount of time and increased the cost of hire. It also caused companies to lose quality candidates, as recruiters usually go through thousands of resumes and keep shortlisting until one candidate is selected.

candidate information, information, resume parser, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.84)

The limitations of limited context for constituency parsing

AIHub, Sep-24-2021, 13:03:24 GMT

Compare the two sentences "I drink coffee with milk" and "I drink coffee with friends". They differ only in their very last words, but their parses differ at earlier places, too. Now imagine parsing sentences like these while reading them. This might be a daunting task when the sentences get longer and their structures more complex. In our work, we show that this task is also difficult for some leading machine learning models for parsing.
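The difference between the two parses can be made concrete with a small sketch. The bracketings below are illustrative assumptions based on the usual prepositional-phrase attachment analysis ("with milk" modifies "coffee", "with friends" modifies "drink"); they are not output from any parser discussed in the article, and the helper `spans` is a hypothetical name.

```python
# Trees are nested tuples: (label, child, child, ...); leaves are plain strings.
# Both bracketings are illustrative assumptions, not output from any parser.
MILK = ("S", ("NP", "I"),
        ("VP", ("V", "drink"),
               ("NP", ("NP", "coffee"),
                      ("PP", ("P", "with"), ("NP", "milk")))))

FRIENDS = ("S", ("NP", "I"),
           ("VP", ("V", "drink"),
                  ("NP", "coffee"),
                  ("PP", ("P", "with"), ("NP", "friends"))))


def spans(tree, start=0):
    """Return (end, spans) where spans holds (label, start, end) per constituent."""
    if isinstance(tree, str):
        return start + 1, set()  # a leaf covers exactly one word
    label, children = tree[0], tree[1:]
    pos, found = start, set()
    for child in children:
        pos, child_spans = spans(child, pos)
        found |= child_spans
    found.add((label, start, pos))
    return pos, found


_, milk_spans = spans(MILK)
_, friends_spans = spans(FRIENDS)

# Constituents that appear in one analysis but not in the other.
only_milk = sorted(milk_spans - friends_spans)
only_friends = sorted(friends_spans - milk_spans)
```

Under these assumed bracketings, the only constituent unique to the "milk" analysis is the noun phrase covering words 2 to 4 ("coffee with milk"). It begins at "coffee", two words before the point where the sentences diverge, so a parser cannot settle the structure around "coffee" until it has seen the final word.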

pcfg, representational power, syntactic distance, (16 more...)

AIHub

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Pushing on Text Readability Assessment: A Transformer Meets Handcrafted Linguistic Features

Lee, Bruce W., Jang, Yoo Sung, Lee, Jason Hyung-Jong

arXiv.org Artificial Intelligence, Sep-24-2021

We report two essential improvements in readability assessment: 1. three novel features in advanced semantics and 2. the timely evidence that traditional ML models (e.g. Random Forest, using handcrafted features) can combine with transformers (e.g. RoBERTa) to augment model performance. First, we explore suitable transformers and traditional ML models. Then, we extract 255 handcrafted linguistic features using self-developed extraction software. Finally, we assemble those to create several hybrid models, achieving state-of-the-art (SOTA) accuracy on popular datasets in readability assessment. The use of handcrafted features helps model performance on smaller datasets. Notably, our RoBERTA-RF-T1 hybrid achieves the near-perfect classification accuracy of 99%, a 20.3% increase from the previous SOTA.

average count, dataset, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2109.12258

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry:

Education (0.67)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.67)

Sparse Fuzzy Attention for Structured Sentiment Analysis

Peng, Letian, Li, Zuchao, Zhao, Hai

arXiv.org Artificial Intelligence, Sep-24-2021

Attention scorers have achieved success in parsing tasks like semantic and syntactic dependency parsing. However, in tasks modeled as parsing, like structured sentiment analysis, "dependency edges" are very sparse, which hinders parser performance. Thus we propose a sparse and fuzzy attention scorer with pooling layers, which improves parser performance and sets a new state of the art on structured sentiment analysis. We further explore the parsing formulation of structured sentiment analysis with second-order parsing and introduce a novel sparse second-order edge building procedure that leads to significant improvement in parsing performance.

computational linguistic, dependency, sentiment analysis, (15 more...)

arXiv.org Artificial Intelligence

2109.06719

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Shanghai > Shanghai (0.05)
Europe > Italy > Tuscany > Florence (0.05)
(10 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

What is NLP technology and why use it? Isahit, Ethical Data Labeling Platform for AI & Data Processing

#artificialintelligence, Sep-23-2021, 19:25:02 GMT

Natural language processing (NLP) is a revolutionary technology that enables machines to understand and communicate with human language, using AI. How can a computer, which usually understands a precise, marked out and structured programming language, understand the imprecise and ambiguous human language? For François Yvon, a researcher specialising in NLP, this technology means "all research and development aimed at modelling and reproducing, with the help of machines, the human capacity to produce and understand linguistic statements for communication purposes". The ability to understand human language is not a new concept; as early as 1950, the famous mathematician Alan Turing proposed a test that assessed the intelligence of a machine by its ability to hold a human conversation. It was during the 1950s that the first NLP tests were carried out.

ai & data processing, human language, language processing, (10 more...)

#artificialintelligence

Industry: Information Technology > Software (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.32)

CSAGN: Conversational Structure Aware Graph Network for Conversational Semantic Role Labeling

Wu, Han, Xu, Kun, Song, Linqi

arXiv.org Artificial Intelligence, Sep-23-2021

Conversational semantic role labeling (CSRL) is believed to be a crucial step towards dialogue understanding. However, it remains a major challenge for existing CSRL parsers to handle conversational structural information. In this paper, we present a simple and effective architecture for CSRL which aims to address this problem. Our model is based on a conversational structure-aware graph network which explicitly encodes the speaker-dependent information. We also propose a multi-task learning method to further improve the model. Experimental results on benchmark datasets show that our model with our proposed training objectives significantly outperforms previous baselines.

dataset, representation, utterance, (15 more...)

arXiv.org Artificial Intelligence

2109.11541

Country:

Asia > China > Hong Kong (0.05)
Asia > China > Guangdong Province > Shenzhen (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
(7 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.71)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)

Finding a Balanced Degree of Automation for Summary Evaluation

Zhang, Shiyue, Bansal, Mohit

arXiv.org Artificial Intelligence, Sep-23-2021

Human evaluation for summarization tasks is reliable but brings in issues of reproducibility and high costs. Automatic metrics are cheap and reproducible but sometimes poorly correlated with human judgment. In this work, we propose flexible semi-automatic to automatic summary evaluation metrics, following the Pyramid human evaluation method. Semi-automatic Lite2Pyramid retains the reusable human-labeled Summary Content Units (SCUs) for reference(s) but replaces the manual work of judging SCUs' presence in system summaries with a natural language inference (NLI) model. Fully automatic Lite3Pyramid further substitutes SCUs with automatically extracted Semantic Triplet Units (STUs) via a semantic role labeling (SRL) model. Finally, we propose in-between metrics, Lite2.xPyramid, where we use a simple regressor to predict how well the STUs can simulate SCUs and retain SCUs that are more difficult to simulate, which provides a smooth transition and balance between automation and manual evaluation. Comparing against 15 existing metrics, we evaluate human-metric correlations on 3 existing meta-evaluation datasets and our newly collected PyrXSum (with 100/10 XSum examples/systems). The results show that Lite2Pyramid consistently has the best summary-level correlations; Lite3Pyramid works better than or comparably to other automatic metrics; Lite2.xPyramid trades off small correlation drops for larger manual effort reduction, which can reduce costs for future data collection. Our code and data are publicly available at: https://github.com/ZhangShiyue/Lite2-3Pyramid
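The idea behind the semi-automatic metric can be sketched in a few lines. This is only a reading of the abstract under stated assumptions: the score is taken to be the fraction of reference SCUs judged present in the system summary, and the token-overlap function `toy_entails` is a crude placeholder for a real NLI model. The names `lite2pyramid_score` and `toy_entails` are hypothetical and are not taken from the authors' repository.

```python
from typing import Callable, List


def lite2pyramid_score(
    summary: str,
    scus: List[str],
    entails: Callable[[str, str], bool],
) -> float:
    """Fraction of SCUs that the judge says are entailed by the summary.

    Assumption: averaging binary presence judgments; the abstract only
    states that an NLI model replaces the manual presence judgment.
    """
    if not scus:
        raise ValueError("at least one SCU is required")
    hits = sum(1 for scu in scus if entails(summary, scu))
    return hits / len(scus)


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Placeholder judge: every hypothesis token must occur in the premise.

    A real setup would call an NLI model here; this stub only keeps the
    example self-contained.
    """
    premise_tokens = set(premise.lower().split())
    return all(tok in premise_tokens for tok in hypothesis.lower().split())


summary = "the court convicted the defendant in 2000"
scus = ["the court convicted the defendant", "the defendant appealed"]
score = lite2pyramid_score(summary, scus, toy_entails)  # one of two SCUs found
```

Passing the judge in as a function keeps the scoring logic independent of the NLI model, so the placeholder can be swapped for a real entailment classifier without touching the aggregation.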

catherine nevin, lite 2, pyramid, (15 more...)

arXiv.org Artificial Intelligence

2109.11503

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Netherlands (0.04)
North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)
(6 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.67)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Exploring Decomposition for Table-based Fact Verification

Yang, Xiaoyu, Zhu, Xiaodan

arXiv.org Artificial Intelligence, Sep-22-2021

Fact verification based on structured data is challenging as it requires models to understand both natural language and symbolic operations performed over tables. Although pre-trained language models have demonstrated a strong capability in verifying simple statements, they struggle with complex statements that involve multiple operations. In this paper, we improve fact verification by decomposing complex statements into simpler subproblems. Leveraging the programs synthesized by a weakly supervised semantic parser, we propose a program-guided approach to constructing a pseudo dataset for decomposition model training. The subproblems, together with their predicted answers, serve as the intermediate evidence to enhance our fact verification model. Experiments show that our proposed approach achieves the new state-of-the-art performance, an 82.7% accuracy, on the TabFact benchmark.

computational linguistic, decomposition, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2109.1102

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
Asia > China > Hong Kong (0.05)
(12 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.48)