AITopics | Grammars & Parsing

Grammars & Parsing

NaïveRole: Author-Contribution Extraction and Parsing from Biomedical Manuscripts

Tkaczyk, Dominika, Collins, Andrew, Beel, Joeran

arXiv.org Machine Learning, Dec-15-2019

Information about the contributions of individual authors to scientific publications is important for assessing authors' achievements. Some biomedical publications have a short section that describes authors' roles and contributions. It is usually written in natural language and hence author contributions cannot be trivially extracted in a machine-readable format. In this paper, we present 1) A statistical analysis of roles in author contributions sections, and 2) NaïveRole, a novel approach to extract structured authors' roles from author contribution sections. For the first part, we used co-clustering techniques, as well as Open Information Extraction, to semi-automatically discover the popular roles within a corpus of 2,000 contributions sections from PubMed Central. The discovered roles were used to automatically build a training set for NaïveRole, our role extractor approach, based on Naïve Bayes. NaïveRole extracts roles with a micro-averaged precision of 0.68, recall of 0.48 and F1 of 0.57. It is, to the best of our knowledge, the first attempt to automatically extract author roles from research papers. This paper is an extended version of a previous poster published at JCDL 2018.
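
The abstract above reports micro-averaged precision, recall, and F1. As a minimal sketch of how such scores are typically computed for a multi-label task, the following pools true positives, false positives, and false negatives over all items before dividing. The function name `micro_prf` and the toy role labels are hypothetical and are not taken from the paper; the paper's own evaluation setup may differ.

```python
from typing import List, Set, Tuple


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 for multi-label predictions.

    Counts are pooled over all items first, so frequent labels
    contribute more to the final score than rare ones.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in the gold set
        fn += len(g - p)   # gold labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data (illustrative only): one set of role labels per author mention.
gold = [{"writing", "analysis"}, {"design"}, {"writing"}]
pred = [{"writing"}, {"design", "analysis"}, set()]

precision, recall, f1 = micro_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

On the toy data the pooled counts are 2 true positives, 1 false positive, and 2 false negatives.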

corpus, manuscript, role mention, (15 more...)

arXiv.org Machine Learning

1912.1017

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

An Unsupervised Domain-Independent Framework for Automated Detection of Persuasion Tactics in Text

Iyer, Rahul Radhakrishnan, Sycara, Katia

arXiv.org Artificial Intelligence, Dec-13-2019

With the increasing growth of social media, people have started relying heavily on the information shared therein to form opinions and make decisions. While such a reliance is motivation for a variety of parties to promote information, it also makes people vulnerable to exploitation by slander, misinformation, terroristic and predatorial advances. In this work, we aim to understand and detect such attempts at persuasion. Existing works on detecting persuasion in text make use of lexical features for detecting persuasive tactics, without taking advantage of the possible structures inherent in the tactics used. We formulate the task as a multi-class classification problem and propose an unsupervised, domain-independent machine learning framework for detecting the type of persuasion used in text, which exploits the inherent sentence structure present in the different persuasion tactics. Our work shows promising results as compared to existing work.

argument, category, unsupervised domain-independent framework, (12 more...)

arXiv.org Artificial Intelligence

1912.06745

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Monterey County > Monterey (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Media > News (0.34)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.73)

salesforce/decaNLP

#artificialintelligence, Dec-5-2019, 04:10:26 GMT

The Natural Language Decathlon is a multitask challenge that spans ten tasks: question answering (SQuAD), machine translation (IWSLT), summarization (CNN/DM), natural language inference (MNLI), sentiment analysis (SST), semantic role labeling (QA-SRL), zero-shot relation extraction (QA-ZRE), goal-oriented dialogue (WOZ), semantic parsing (WikiSQL), and commonsense reasoning (MWSC). Each task is cast as question answering, which makes it possible to use our new Multitask Question Answering Network (MQAN). This model jointly learns all tasks in decaNLP without any task-specific modules or parameters in the multitask setting. For a more thorough introduction to decaNLP and the tasks, see the main website, our blog post, or the paper. While the research direction associated with this repository focused on multitask learning, the framework itself is designed in a way that should make single-task training, transfer learning, and zero-shot evaluation simple.
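
To illustrate what "each task is cast as question answering" can look like in practice, here is a minimal sketch that wraps two different tasks in the same (question, context, answer) shape. The `QAExample` class, the helper functions, and the question wordings are assumptions made for illustration; they are not the decaNLP repository's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAExample:
    """One training example in a shared (question, context, answer) format."""
    question: str
    context: str
    answer: str


def sentiment_as_qa(review: str, label: str) -> QAExample:
    # The task is named by the question; the input text becomes the context.
    return QAExample("Is this review negative or positive?", review, label)


def translation_as_qa(source: str, target: str,
                      src_lang: str = "English", tgt_lang: str = "German") -> QAExample:
    # Same shape as above: only the question and the answer space change.
    question = f"What is the translation from {src_lang} to {tgt_lang}?"
    return QAExample(question, source, target)


examples = [
    sentiment_as_qa("A warm, funny and moving film.", "positive"),
    translation_as_qa("Good morning.", "Guten Morgen."),
]

for ex in examples:
    print(f"Q: {ex.question}\nC: {ex.context}\nA: {ex.answer}\n")
```

Because every example has the same three fields, a single model can consume all tasks without task-specific input or output layers, which is the property the description above relies on.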

decanlp, metric, natural language decathlon, (11 more...)

#artificialintelligence

Industry: Information Technology > Software (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.58)

Design and implementation of an open source Greek POS Tagger and Entity Recognizer using spaCy

Partalidou, Eleni, Spyromitros-Xioufis, Eleftherios, Doropoulos, Stavros, Vologiannidis, Stavros, Diamantaras, Konstantinos I.

arXiv.org Machine Learning, Dec-5-2019

This paper proposes a machine learning approach to part-of-speech tagging and named entity recognition for Greek, focusing on the extraction of morphological features and classification of tokens into a small set of classes for named entities. The architecture model that was used is introduced. The Greek version of the spaCy platform was added into the source code, a feature that did not exist before our contribution, and was used for building the models. Additionally, a part-of-speech tagger was trained that can detect the morphology of the tokens and outperforms state-of-the-art results when classifying only the part of speech. For named entity recognition using spaCy, a model that extends the standard ENAMEX type (organization, location, person) was built. Certain experiments that were conducted indicate the need for flexibility in handling out-of-vocabulary words, and an effort to resolve this issue is under way. Finally, the evaluation results are discussed.

dataset, experiment, vector, (12 more...)

arXiv.org Machine Learning

1912.10162

Country:

Europe > Greece > Central Macedonia > Thessaloniki (0.05)
North America > United States (0.04)
North America > Canada > British Columbia (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction

Guillou, Liane, Hardmeier, Christian, Nakov, Preslav, Stymne, Sara, Tiedemann, Jörg, Versley, Yannick, Cettolo, Mauro, Webber, Bonnie, Popescu-Belis, Andrei

arXiv.org Artificial Intelligence, Nov-27-2019

We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We provided four subtasks, for the English-French and English-German language pairs, in both directions. Eleven teams participated in the shared task; nine for the English-French subtask, five for French-English, nine for English-German, and six for German-English. Most of the submissions outperformed two strong language-model-based baseline systems, with systems using deep recurrent neural networks outperforming those using other architectures for most language pairs.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

1911.12091

Country:

Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.14)
Europe > Germany > Berlin (0.05)
Europe > Sweden > Uppsala County > Uppsala (0.04)
(28 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(4 more...)

Feature-Rich Part-of-speech Tagging for Morphologically Complex Languages: Application to Bulgarian

Georgiev, Georgi, Zhikov, Valentin, Osenova, Petya, Simov, Kiril, Nakov, Preslav

arXiv.org Artificial Intelligence, Nov-26-2019

Unlike most previous work, which has used a small number of grammatical categories, we work with 680 morpho-syntactic tags. We combine a large morphological lexicon with prior linguistic knowledge and guided learning from a POS-annotated corpus, achieving accuracy of 97.98%, which is a significant improvement over the state-of-the-art for Bulgarian.

accuracy, lexicon, proceedings, (14 more...)

arXiv.org Artificial Intelligence

1911.11503

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Czechia > Prague (0.05)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)
(21 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

SemEval-2015 Task 3: Answer Selection in Community Question Answering

Nakov, Preslav, Màrquez, Lluís, Magdy, Walid, Moschitti, Alessandro, Glass, James, Randeree, Bilal

arXiv.org Artificial Intelligence, Nov-26-2019

Community Question Answering (cQA) provides new interesting research directions to the traditional Question Answering (QA) field, e.g., the exploitation of the interaction between users and the structure of related posts. In this context, we organized SemEval-2015 Task 3 on "Answer Selection in cQA", which included two subtasks: (a) classifying answers as "good", "bad", or "potentially relevant" with respect to the question, and (b) answering a YES/NO question with "yes", "no", or "unsure", based on the list of all answers. We set subtask A for Arabic and English on two relatively different cQA domains, i.e., the Qatar Living website for English, and a Quran-related website for Arabic. We used crowdsourcing on Amazon Mechanical Turk to label a large English training dataset, which we released to the research community. Thirteen teams participated in the challenge with a total of 61 submissions: 24 primary and 37 contrastive. The best systems achieved an official score (macro-averaged F1) of 57.19 and 63.7 for the English subtasks A and B, and 78.55 for the Arabic subtask A.
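
The official score mentioned above is a macro-averaged F1. As a rough sketch of that metric, the code below computes F1 separately for each class and then takes the unweighted mean, so a rare class counts as much as a frequent one. The label names, the toy data, and the `macro_f1` helper are illustrative assumptions; the task's official scorer may handle label sets and edge cases differently.

```python
from typing import List, Sequence


def macro_f1(gold: Sequence[str], pred: Sequence[str], labels: List[str]) -> float:
    """Unweighted mean of per-class F1 over the given label list."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy answer labels (illustrative only).
labels = ["good", "bad", "potential"]
gold = ["good", "bad", "good", "potential", "bad", "good"]
pred = ["good", "good", "good", "bad", "bad", "potential"]

score = macro_f1(gold, pred, labels)
print(f"macro-F1 = {score:.4f}")
```

In this toy case the per-class F1 values are 2/3 for "good", 1/2 for "bad", and 0 for "potential", which shows how one poorly predicted minority class pulls the macro average down.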

accuracy, proceedings, subtask, (15 more...)

arXiv.org Artificial Intelligence

1911.11403

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Colorado > Denver County > Denver (0.05)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
(16 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.91)
(2 more...)

Filling Conversation Ellipsis for Better Social Dialog Understanding

Zhang, Xiyuan, Li, Chengxi, Yu, Dian, Davidson, Samuel, Yu, Zhou

arXiv.org Artificial Intelligence, Nov-25-2019

The phenomenon of ellipsis is prevalent in social conversations. Ellipsis increases the difficulty of a series of downstream language understanding tasks, such as dialog act prediction and semantic role labeling. We propose to resolve ellipsis through automatic sentence completion to improve language understanding. However, automatic ellipsis completion can result in output which does not accurately reflect user intent. To address this issue, we propose a method which considers both the original utterance that has ellipsis and the automatically completed utterance in dialog act and semantic role labeling tasks. Specifically, we first complete user utterances to resolve ellipsis using an end-to-end pointer network model. We then train a prediction model using both utterances containing ellipsis and our automatically completed utterances. Finally, we combine the prediction results from these two utterances using a selection model that is guided by expert knowledge. Our approach improves dialog act prediction and semantic role labeling by 1.3% and 2.5% in F1 score respectively in social conversations. We also present an open-domain human-machine conversation dataset with manually completed user utterances and annotated semantic role labeling after manual completion.

Introduction: Ellipsis, in which a speaker omits words that are understood from context, is a frequent phenomenon in human conversation. Although natural to humans, ellipsis poses a challenge for language understanding in spoken dialog systems.

ellipsis, semantic role, utterance, (13 more...)

arXiv.org Artificial Intelligence

1911.10776

Country:

North America > Canada (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Yolo County > Davis (0.04)
Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

NLP News Cypher 11.24.19

#artificialintelligence, Nov-24-2019, 23:03:22 GMT

The French RoBERTa, aka CamemBERT, is now part of Hugging Face's transformer library. The transformer achieves state-of-the-art (SOTA) results on several NLP downstream tasks: part-of-speech tagging, dependency parsing, named-entity recognition, and natural language inference in French. Need a cheat-sheet for data science or ML? Thanks to this fellow, the biggest payload of cheat sheets in the galaxy, covering several programming languages and use cases, is easily accessible on GitHub. One of the biggest retailers on planet Earth is turning to Conversational AI. This past week Walmart announced its partnership with Apple's Siri for peeps looking to buy groceries online -- the service is called Walmart Voice Order.

audience, nlp news cypher 11

#artificialintelligence

Industry: Retail (0.79)

Technology: