The Generation of Textual Entailment with NLML in an Intelligent Dialogue System for Language Learning CSIEC

arXiv.org Artificial Intelligence

This research report introduces the generation of textual entailment within the CSIEC (Computer Simulation in Educational Communication) project, an interactive web-based human-computer dialogue system with natural language for English instruction. The generation of textual entailment (GTE) is critical to the further improvement of the CSIEC project. To date we have found little literature related to GTE. Simulating the process by which a human being learns English as a foreign language, we explore a naive approach to the GTE problem and its algorithm within the framework of CSIEC, i.e., rule annotation in NLML, pattern recognition (matching), and entailment transformation. The time and space complexity of our algorithm is tested with some entailment examples. Further work includes rule annotation based on English textbooks and a GUI that lets ordinary users edit the entailment rules.
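The abstract names three stages: rule annotation in NLML, pattern recognition (matching), and entailment transformation. Below is a minimal sketch of the matching and transformation stages, assuming a simplified rule format; the actual NLML markup and the project's rule set are not shown in the abstract.

```python
import re

# Hypothetical stand-ins for NLML-annotated entailment rules: each rule pairs
# a surface pattern with a template for the sentence it entails.
ENTAILMENT_RULES = [
    # "X bought Y" entails "X owns Y"
    (re.compile(r"^(?P<subj>\w+) bought (?P<obj>.+)$"), "{subj} owns {obj}"),
    # "X is taller than Y" entails "Y is shorter than X"
    (re.compile(r"^(?P<x>\w+) is taller than (?P<y>\w+)$"), "{y} is shorter than {x}"),
]

def generate_entailments(sentence: str) -> list[str]:
    """Pattern recognition (matching) followed by entailment transformation."""
    entailed = []
    for pattern, template in ENTAILMENT_RULES:
        match = pattern.match(sentence)
        if match:
            entailed.append(template.format(**match.groupdict()))
    return entailed

print(generate_entailments("Tom bought a bicycle"))      # ['Tom owns a bicycle']
print(generate_entailments("John is taller than Mary"))  # ['Mary is shorter than John']
```

In a scheme like this, each input is matched against every rule, so runtime grows linearly with the size of the rule set, which is presumably what the report's time and space complexity tests measure.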


Criterion SM Online Essay Evaluation: An Application for Automated Evaluation of Student Essays

AAAI Conferences

This paper describes the Criterion Online Essay Evaluation Service, a web-based system that provides automated scoring and evaluation of student essays. Criterion has two complementary applications: E-rater, an automated essay scoring system, and Critique Writing Analysis Tools, a suite of programs that detect errors in grammar, usage, and mechanics; identify discourse elements in the essay; and recognize elements of undesirable style. These evaluation capabilities provide students with feedback specific to their own writing, to help them improve their writing skills. Both applications employ natural language processing and machine learning techniques. All of these capabilities outperform baseline algorithms, and some of the tools agree with human judges as often as two judges agree with each other.
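The abstract does not describe the detectors' internals. As a toy illustration of the kind of usage feedback such a tool returns (not ETS's actual, machine-learned implementation), a confusable-word check might look like this:

```python
import re

# Toy illustration only: Criterion's real detectors are ML-based and far
# richer than this hand-written confusable-word list.
CONFUSABLES = {
    r"\btheir is\b": "there is",
    r"\bcould of\b": "could have",
    r"\balot\b": "a lot",
}

def detect_usage_errors(essay: str) -> list[tuple[str, str]]:
    """Return (matched text, suggested correction) pairs."""
    findings = []
    for pattern, suggestion in CONFUSABLES.items():
        for match in re.finditer(pattern, essay, flags=re.IGNORECASE):
            findings.append((match.group(0), suggestion))
    return findings

print(detect_usage_errors("Their is alot to learn."))
# [('Their is', 'there is'), ('alot', 'a lot')]
```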


Improving Argument Mining in Student Essays by Learning and Exploiting Argument Indicators versus Essay Topics

AAAI Conferences

Argument mining systems for student essays need to identify argument components reliably, independently of any particular essay topic. Thus, in addition to features that model argumentation through topic-independent linguistic indicators such as discourse markers, features that abstract over the lexical signals of particular essay topics might also help improve performance. Prior argument mining studies have focused on persuasive essays and proposed a variety of largely lexicalized features. Our current study examines the utility of such features, proposes new features that abstract over the domain topics of essays, and conducts evaluations using both 10-fold cross-validation and cross-topic validation. Experimental results show that our proposed features significantly improve argument mining performance in both evaluation settings. Feature ablation studies further shed light on relative feature utility.
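To make the contrast the abstract draws concrete, a feature extractor might combine topic-independent discourse indicators with features that abstract over topic-specific words. The feature names and marker list below are assumptions for illustration; the paper's exact feature set is not reproduced here.

```python
# Assumed marker inventory; papers in this area typically use curated lists.
DISCOURSE_MARKERS = {"therefore", "because", "however", "consequently", "thus"}

def extract_features(sentence: str, topic_words: set[str]) -> dict:
    tokens = sentence.lower().split()
    return {
        # Topic-independent linguistic indicators of argumentation.
        "num_discourse_markers": sum(t in DISCOURSE_MARKERS for t in tokens),
        "starts_with_marker": bool(tokens) and tokens[0] in DISCOURSE_MARKERS,
        # Abstract over topic-specific lexical signals by counting masked
        # topic words, so the feature transfers across essay topics.
        "num_topic_mentions": sum(t in topic_words for t in tokens),
    }

topic = {"smoking", "ban", "restaurants"}
print(extract_features("Therefore smoking should be banned in restaurants", topic))
# {'num_discourse_markers': 1, 'starts_with_marker': True, 'num_topic_mentions': 2}
```

Because the topic-word feature fires regardless of which words fill it, a classifier trained on one set of essay prompts can reuse it on unseen topics, which is what the cross-topic validation setting tests.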


Grammatical Error Detection for Corrective Feedback Provision in Oral Conversations

AAAI Conferences

The demand for computer-assisted language learning systems that can provide corrective feedback on language learners' speaking has increased. However, detecting grammatical errors in oral conversations is not a trivial task because of the unavoidable errors of automatic speech recognition (ASR) systems. To provide corrective feedback, we propose a novel method for detecting grammatical errors in speaking performance. The method consists of two sub-models: a grammaticality-checking model and an error-type classification model. We automatically generate grammatical errors that learners are likely to commit and construct error patterns from these articulated errors. When a speech pattern is recognized, the grammaticality-checking model performs a binary classification based on the similarity between the error patterns and the recognition result, using the recognizer's confidence score. The error-type classification model chooses the error type based on the most similar error pattern and error frequencies extracted from a learner corpus. The grammaticality-checking model outperformed the two comparative models by 56.36% and 42.61% in F-score while keeping the false positive rate very low, and the error-type classification model achieved a very high accuracy of 99.6%. Because high precision and a low false positive rate are important criteria in a language-tutoring setting, the proposed method should be helpful for intelligent computer-assisted language learning systems.
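Under assumed thresholds and toy error patterns (the paper generates its patterns automatically from a learner corpus, and its exact decision rule is not given in the abstract), the two sub-models might be sketched as follows:

```python
from difflib import SequenceMatcher

# Hypothetical (pattern, error type) pairs standing in for the automatically
# generated error patterns; frequency weighting from the learner corpus is omitted.
ERROR_PATTERNS = [
    ("he go to school", "subject-verb agreement"),
    ("I am agree with you", "verb form"),
    ("she is more taller", "comparative"),
]

def similarity(a: str, b: str) -> float:
    """Token-level similarity between an ASR hypothesis and an error pattern."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def check_grammaticality(asr_hypothesis: str, asr_confidence: float,
                         conf_threshold: float = 0.8) -> tuple[bool, str | None]:
    """Binary grammaticality decision plus an error type.

    Low-confidence ASR output is not flagged, keeping the false positive
    rate low; the specific thresholds here are assumptions.
    """
    if asr_confidence < conf_threshold:
        return True, None  # too uncertain to risk a false alarm
    best_pattern, best_type = max(
        ERROR_PATTERNS, key=lambda p: similarity(asr_hypothesis, p[0]))
    if similarity(asr_hypothesis, best_pattern) > 0.7:
        return False, best_type  # ungrammatical; type of the most similar pattern
    return True, None

print(check_grammaticality("he go to school every day", 0.92))
# (False, 'subject-verb agreement')
```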


How Silicon Valley is teaching language to machines

#artificialintelligence

The dream of building computers or robots that communicate like humans has been with us for many decades now. And if market trends and investment levels are any guide, it's something we would really like to have. MarketsandMarkets says the natural language processing (NLP) industry will be worth $16.07 billion by 2021, growing at a rate of 16.1 percent, and deep learning is estimated to reach $1.7 billion by 2022, growing at a CAGR of 65.3 percent between 2016 and 2022. Of course, if you've played with any chatbots, you will know that it's a promise that is yet to be fulfilled. There's an "uncanny valley" where, at one end, we sense we're not talking to a real person and, at the other end, the machine just doesn't "get" what we mean.