

Modeling and Analyzing Scorer Preferences in Short-Answer Math Questions

arXiv.org Artificial Intelligence

Automated scoring of student responses to open-ended questions, including short-answer questions, has great potential to scale to a large number of responses. Recent approaches to automated scoring rely on supervised learning, i.e., training classifiers or fine-tuning language models on a small number of responses with human-provided score labels. However, since scoring is a subjective process, these human scores are noisy and can vary considerably from scorer to scorer. In this paper, we investigate a collection of models that account for the individual preferences and tendencies of each human scorer in the automated scoring task. We apply these models to a short-answer math response dataset where each response is scored (often differently) by multiple human scorers. We conduct quantitative experiments to show that our scorer models improve automated scoring accuracy, and use further experiments and case studies to analyze the individual preferences and tendencies of scorers. We find that scorers fall into several clear clusters, each with distinct grading features, which we analyze in detail.
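
To make the idea concrete, here is a minimal sketch (in PyTorch, and not the paper's exact architecture) of scorer-aware scoring: the model combines a response representation with a learned per-scorer embedding, so each scorer's systematic grading tendencies can be absorbed into the prediction. The linear encoder, feature dimension, and score scale are illustrative assumptions.

```python
# Sketch of scorer-aware automated scoring: a per-scorer embedding lets the
# model account for individual grading tendencies. The response encoder is
# stubbed as a linear layer over precomputed features (a fine-tuned language
# model encoder would be used in practice).
import torch
import torch.nn as nn

class ScorerAwareGrader(nn.Module):
    def __init__(self, feat_dim: int, num_scorers: int, num_scores: int, d: int = 64):
        super().__init__()
        self.encode = nn.Linear(feat_dim, d)            # stand-in for an LM encoder
        self.scorer_emb = nn.Embedding(num_scorers, d)  # one vector per human scorer
        self.head = nn.Linear(2 * d, num_scores)        # logits over discrete score labels

    def forward(self, feats: torch.Tensor, scorer_ids: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.encode(feats))
        s = self.scorer_emb(scorer_ids)
        return self.head(torch.cat([h, s], dim=-1))

# Toy usage: 8 responses with 32-dim features, graded by 3 scorers on a 0-4 scale.
model = ScorerAwareGrader(feat_dim=32, num_scorers=3, num_scores=5)
logits = model(torch.randn(8, 32), torch.randint(0, 3, (8,)))
loss = nn.functional.cross_entropy(logits, torch.randint(0, 5, (8,)))
```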


Interpretable Math Word Problem Solution Generation Via Step-by-step Planning

arXiv.org Artificial Intelligence

Solutions to math word problems (MWPs) with step-by-step explanations are valuable, especially in education, for helping students better comprehend problem-solving strategies. Most existing approaches focus only on obtaining the final correct answer. A few recent approaches leverage intermediate solution steps to improve final answer correctness but often cannot generate coherent steps with a clear solution strategy. In contrast to existing work, we focus on improving the correctness and coherence of the intermediate solution steps. We propose a step-by-step planning approach for intermediate solution generation, which strategically plans the generation of the next solution step based on the MWP and the previous solution steps. Our approach first plans the next step by predicting the math operation needed to proceed given the steps so far, then generates the step token-by-token by prompting a language model with the predicted math operation. Experiments on the GSM8K dataset demonstrate that our approach improves the accuracy and interpretability of solutions under both automatic metrics and human evaluation.
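
Below is a minimal sketch, under assumed interfaces, of the plan-then-generate loop the abstract describes: a planner picks the next math operation from the problem and the steps so far, and a generator produces the step text conditioned on that operation. `predict_operation` and `generate_step` are hypothetical stubs standing in for the paper's trained models.

```python
# Sketch of step-by-step planned generation: plan an operation, then realize
# it as a solution step, repeating until an answer step is produced.
OPERATIONS = ["add", "subtract", "multiply", "divide", "answer"]

def predict_operation(problem: str, steps: list[str]) -> str:
    """Hypothetical planner: choose the next operation given the history."""
    return "answer" if len(steps) >= 3 else OPERATIONS[len(steps) % 4]

def generate_step(problem: str, steps: list[str], operation: str) -> str:
    """Hypothetical generator: prompt a language model with the planned operation."""
    prompt = f"{problem}\n" + "\n".join(steps) + f"\n[{operation}]"
    return f"Step {len(steps) + 1}: apply '{operation}' (prompt of {len(prompt)} chars)"

def solve(problem: str, max_steps: int = 8) -> list[str]:
    steps: list[str] = []
    while len(steps) < max_steps:
        op = predict_operation(problem, steps)           # plan the next operation
        steps.append(generate_step(problem, steps, op))  # realize it as text
        if op == "answer":
            break
    return steps

print("\n".join(solve("Ann has 3 apples and buys 4 more. How many does she have?")))
```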


Algebra Error Classification with Large Language Models

arXiv.org Artificial Intelligence

Automated feedback as students answer open-ended math questions has significant potential to improve learning outcomes at large scale. A key part of automated feedback systems is an error classification component, which identifies student errors and enables appropriate, predefined feedback to be deployed. Most existing approaches to error classification use rule-based methods, which have limited capacity to generalize. Existing data-driven methods avoid these limitations but require the mathematical expressions in student responses to be parsed into syntax trees. This requirement is itself a limitation, since student responses are not always syntactically valid and thus cannot always be converted into trees. In this work, we introduce a flexible method for error classification using pre-trained large language models. We demonstrate that our method outperforms existing methods in algebra error classification and is able to classify a larger set of student responses. Additionally, we analyze common classification errors made by our method and discuss the limitations of automated error classification.
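
A minimal sketch of the core idea follows: treat error classification as plain text classification with a pre-trained language model, sidestepping any parsing of responses into syntax trees. The model name and label set below are illustrative assumptions, not the paper's.

```python
# Sketch of LLM-based error classification: pair the question with the raw
# response text, so even syntactically invalid responses are classifiable.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

ERROR_LABELS = ["sign_error", "distribution_error", "combining_terms_error", "other"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(ERROR_LABELS)
)  # would be fine-tuned on (question, response) pairs labeled with error types

def classify_error(question: str, response: str) -> str:
    inputs = tokenizer(question, response, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return ERROR_LABELS[int(logits.argmax(dim=-1))]

print(classify_error("Expand 3(x + 2).", "3x + 2"))
```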


Evaluating the Performance of Reinforcement Learning Algorithms

arXiv.org Machine Learning

Performance evaluations are critical for quantifying algorithmic advances in reinforcement learning. Recent reproducibility analyses have shown that reported performance results are often inconsistent and difficult to replicate. In this work, we argue that the inconsistency of performance stems from the use of flawed evaluation metrics. Taking a step towards ensuring that reported results are consistent, we propose a new comprehensive evaluation methodology for reinforcement learning algorithms that produces reliable measurements of performance both on a single environment and when aggregated across environments. We demonstrate this method by evaluating a broad class of reinforcement learning algorithms on standard benchmark tasks.
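
As a concrete illustration of one ingredient of such a methodology (a sketch, not the paper's exact procedure): normalize scores per environment, aggregate over runs and environments, and report a bootstrap confidence interval rather than a single point estimate. The normalization bounds and toy data below are illustrative assumptions.

```python
# Sketch of reliable aggregate evaluation with bootstrap confidence intervals.
import numpy as np

rng = np.random.default_rng(0)
# scores[i, j]: return of run j of the algorithm on environment i (toy data).
scores = rng.normal(loc=[[100.0], [0.7], [-50.0]],
                    scale=[[15.0], [0.1], [10.0]], size=(3, 20))
env_min = np.array([0.0, 0.0, -100.0])  # per-environment normalization bounds
env_max = np.array([200.0, 1.0, 0.0])

normalized = (scores - env_min[:, None]) / (env_max - env_min)[:, None]

def aggregate(sample: np.ndarray) -> float:
    # Mean over runs within each environment, then mean across environments.
    return float(sample.mean(axis=1).mean())

# Bootstrap over runs to quantify uncertainty in the aggregate score.
boot = [aggregate(normalized[:, rng.integers(0, 20, size=20)]) for _ in range(10_000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"aggregate performance: {aggregate(normalized):.3f}  (95% CI [{lo:.3f}, {hi:.3f}])")
```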


An Improved Algorithm for Learning to Perform Exception-Tolerant Abduction

AAAI Conferences

Inference from an observed or hypothesized condition to a plausible cause or explanation for that condition is known as abduction. For many tasks, acquiring the necessary knowledge via machine learning has been found to be highly effective. However, the semantics of learned knowledge are weaker than the usual classical semantics, and this necessitates new formulations of many tasks. We focus on a recently introduced formulation of the abductive inference task that is adapted to the semantics of machine learning. A key problem is that we cannot expect our causes or explanations to be perfect; they must tolerate some error due to the world being more complicated than our formalization allows. This is a version of the qualification problem, and in machine learning it is known as agnostic learning. In the work by Juba that introduced the task of learning to make abductive inferences, an algorithm is given for producing k-DNF explanations that tolerates such exceptions: if the best possible k-DNF explanation fails to justify the condition with probability ε, then the algorithm is promised to find a k-DNF explanation that fails to justify the condition with probability at most O(n^k ε), where n is the number of propositional attributes used to describe the domain. Here, we present an improved algorithm for this task. When the best k-DNF fails with probability ε, our algorithm finds a k-DNF that fails with probability at most Õ(n^{k/2} ε) (i.e., suppressing logarithmic factors in n and 1/ε). We also examine the empirical advantage of this new algorithm over the previous one in two test domains: one of explaining conditions generated by a "noisy" k-DNF rule, and another of explaining conditions generated by a linear threshold rule.
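
For intuition, here is a simplified sketch in the spirit of the original k-DNF abduction approach (closer to Juba's basic algorithm than to the improved one analyzed above): enumerate all terms of at most k literals, keep those that rarely hold when the condition fails, and output their disjunction as the explanation. The tolerance threshold and toy data are illustrative assumptions.

```python
# Simplified sketch of exception-tolerant k-DNF explanation learning.
from itertools import combinations, product

def learn_kdnf(examples, condition, k=2, tol=0.05):
    """examples: list of boolean tuples; condition: list of booleans."""
    n = len(examples[0])
    # A literal is (index, polarity); a term is a conjunction of literals.
    literals = [(i, b) for i in range(n) for b in (True, False)]
    terms = []
    for size in range(1, k + 1):
        for term in combinations(literals, size):
            if len({i for i, _ in term}) < size:
                continue  # skip terms that reuse a variable
            holds = [all(x[i] == b for i, b in term) for x in examples]
            # Keep the term if it almost never holds while the condition is false.
            bad = sum(h and not c for h, c in zip(holds, condition))
            if bad / len(examples) <= tol:
                terms.append(term)
    return terms  # the k-DNF explanation is the disjunction of these terms

# Toy usage: the condition is x0 AND x1 over all 4-bit examples.
examples = list(product([False, True], repeat=4))
condition = [x[0] and x[1] for x in examples]
print(learn_kdnf(examples, condition, k=2))
```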