Assessment & Standards

An interdisciplinary approach to artificial intelligence testing - JAXenter


JAXenter: The term 'intelligence' is not easy to understand. What's the best way to explain it and how can we apply it to machines? Marisa Tschopp: Human intelligence has been a very controversial topic and has undergone dramatic changes since its beginnings in the early 19th century. Intelligence gained importance especially in the educational context, as these "mental abilities" were the best predictors of success in school and were used to place students into the right classes. There are various very elaborate theories that define human intelligence.

c-rater: Automatic Content Scoring for Short Constructed Responses

AAAI Conferences

The education community is moving towards constructed or free-text responses and computer-based assessment. At the same time, progress in natural language processing and knowledge representation has made it possible to consider free-text or constructed responses without having to fully understand the text.

Want to take a Mensa intelligence test? Here are four practice questions

Popular Science

Enrollment in the society, founded in 1946, is open only to individuals who score in the 98th percentile or higher on a pre-approved intelligence test. To join this clever club, hopefuls must demonstrate excellence in verbal-, spatial-, and mathematical-reasoning skills. These four problems are what today's aspirants might see on a typical IQ assessment.

Neural Automated Essay Scoring and Coherence Modeling for Adversarially Crafted Input

Artificial Intelligence

We demonstrate that current state-of-the-art approaches to Automated Essay Scoring (AES) are not well-suited to capturing adversarially crafted input of grammatical but incoherent sequences of sentences. We develop a neural model of local coherence that can effectively learn connectedness features between sentences, and propose a framework for integrating and jointly training the local coherence model with a state-of-the-art AES model. We evaluate our approach against a number of baselines and experimentally demonstrate its effectiveness on both the AES task and the task of flagging adversarial input, further contributing to the development of an approach that strengthens the validity of neural essay scoring models.

Trustworthy Automated Essay Scoring without Explicit Construct Validity

AAAI Conferences

Automated essay scoring (AES) is a broadly used application of machine learning, with a long history of real-world use that impacts high-stakes decision-making for students. However, defensibility arguments in this space have typically been rooted in hand-crafted features and psychometrics research, which are a poor fit for recent advances in AI research and more formative classroom use of the technology. This paper proposes a framework for evaluating automated essay scoring models trained with more modern algorithms, used in a classroom setting; that framework is then applied to evaluate an existing product, Turnitin Revision Assistant.

Measuring Cognitive Bias in Spoken Interaction and Conversation:

AAAI Conferences

The present approach aims to assist decision-making by identifying and bypassing the cognitive bias of speaker-participants and evaluators with regard to spoken texts.

Microsoft is using machine reading to create a 'literate machine'


If you asked most people, they'd probably say that computers and other gadgets are pretty good at communicating information to us, whether it's by providing directions to an important business meeting or finding the best recipe for gluten-free apple pie. And yet, computers still don't communicate with us nearly as intuitively as we communicate with each other. If you type a query into a search engine, for example, chances are you'll get a list of websites to click on. But if you ask a person a question, she'll respond with an answer, or perhaps ask another question to get more information before answering. Microsoft is hoping to improve how well computers can communicate information to us.

Accenture launches artificial intelligence testing services


IT services and consulting company Accenture is launching new services for testing artificial intelligence systems, to help companies build their own AI-driven products and services hosted locally or in the cloud.

Accenture Launches New Artificial Intelligence Testing Services


Powered by a "Teach and Test" methodology, the new services help companies validate the safety, reliability and transparency of their artificial intelligence systems. NEW YORK; Feb. 20, 2018 – Accenture (NYSE: ACN) has launched new services for testing artificial intelligence (AI) systems, powered by a unique "Teach and Test" methodology designed to help companies build, monitor and measure reliable AI systems within their own infrastructure or in the cloud. Accenture's "Teach and Test" methodology ensures that AI systems are producing the right decisions in two phases. The "Teach" phase focuses on the choice of data, models and algorithms that are used to train machine learning. This phase experiments with and statistically evaluates different models to select the best-performing model to be deployed into production, while avoiding gender, ethnic and other biases, as well as ethical and compliance risks. During the "Test" phase, AI system outputs are compared to key performance indicators, and assessed for whether the system can explain how a decision or outcome was determined.