Question Generation by Transformers

Kriangchaivech, Kettip, Wangperawong, Artit

arXiv.org Artificial Intelligence 

Kettip Kriangchaivech (1) and Artit Wangperawong (2)
(1) kettipk@gmail.com
(2) artit.wangperawong@usbank.com
U.S. Bank, 1095 Avenue of the Americas, New York, NY 10036

Abstract

A machine learning model was developed to automatically generate questions from Wikipedia passages using transformers, an attention-based architecture that eschews the paradigm of existing recurrent neural networks (RNNs). The model was trained on the inverted Stanford Question Answering Dataset (SQuAD), a reading comprehension dataset consisting of 100,000 questions posed by crowdworkers on a set of Wikipedia articles. After training, the question generation model is able to generate simple questions relevant to unseen passages and answers, averaging 8 words per question. The word error rate (WER) was used as a metric to compare the similarity between SQuAD questions and the model-generated questions. Although the high average WER suggests that the generated questions differ from the original SQuAD questions, the generated questions are mostly grammatically correct and plausible in their own right.

Introduction

Existing question generation systems reported in the literature involve human-generated templates, including cloze-type (Hermann et al. 2015), rule-based (Mitkov and Ha 2003; Rus et al. 2010), or semi-automatic questions (Alvaro and Alvaro 2010; Rey et al. 2012; Liu and Lin 2014). On the other hand, machine-learned models developed recently have used recurrent neural networks (RNNs) to perform sequence transduction, i.e., sequence-to-sequence learning (Du, Shao, and Cardie 2017; Kim et al. 2019). In this work, we investigated an automatic question generation system based on a machine learning model that uses transformers instead of RNNs (Vaswani et al. 2017; Wangperawong 2018).
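"Inverting" SQuAD means swapping the roles of question and answer: instead of mapping (passage, question) to an answer, each training pair maps the passage and a marked answer to the original question. The paper does not spell out its exact input encoding, so the "answer: ... context: ..." source format below is an illustrative assumption, not the authors' scheme; the SQuAD JSON field names, however, are standard.

```python
def invert_squad(squad):
    """Yield (source, target) training pairs for question generation
    from a parsed SQuAD JSON dict.

    source: the answer plus its passage, concatenated into one string
            (the "answer: ... context: ..." encoding is an assumption,
            not taken from the paper)
    target: the original crowdworker question
    """
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                if not qa["answers"]:
                    continue  # skip unanswerable entries
                answer = qa["answers"][0]["text"]
                source = f"answer: {answer} context: {context}"
                yield source, qa["question"]
```

A sequence-to-sequence transformer can then be trained on these (source, target) pairs exactly as in ordinary machine translation.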
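The WER metric used for evaluation is word-level edit distance (insertions, deletions, substitutions) normalized by the reference length. A minimal sketch, assuming whitespace tokenization (the paper does not specify its exact WER implementation):

```python
def word_error_rate(reference, hypothesis):
    """WER: word-level Levenshtein distance between a reference question
    and a generated question, divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + sub)  # substitution/match
    return d[len(ref)][len(hyp)] / len(ref)
```

Note that a generated question can score a high WER against its SQuAD reference while still being grammatical and plausible, which is why the paper treats WER as a similarity measure rather than a correctness measure.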
