AITopics | subtest

subtest

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Machine Learning-Based Framework to Shorten the Questionnaire for Assessing Autism Intervention

Dong, Audrey, Xu, Claire, Guo, Samuel R., Yang, Kevin, Kong, Xue-Jun

arXiv.org Artificial Intelligence | Nov-3-2025

Caregivers of individuals with autism spectrum disorder (ASD) often find the 77-item Autism Treatment Evaluation Checklist (ATEC) burdensome, limiting its use for routine monitoring. This study introduces a generalizable machine learning framework that seeks to shorten assessments while maintaining evaluative accuracy. Using longitudinal ATEC data from 60 autistic children receiving therapy, we applied feature selection and cross-validation techniques to identify the most predictive items across two assessment goals: longitudinal therapy tracking and point-in-time severity estimation. For progress monitoring, the framework identified 16 items (21% of the original questionnaire) that retained strong correlation with total score change and full subdomain coverage. We also generated smaller subsets (1-7 items) for efficient approximations. For point-in-time severity assessment, our model achieved over 80% classification accuracy using just 13 items (17% of the original set). While demonstrated on ATEC, the methodology (based on subset optimization, model interpretability, and statistical rigor) is broadly applicable to other high-dimensional psychometric tools. The resulting framework could potentially enable more accessible, frequent, and scalable assessments and offer a data-driven approach for AI-supported interventions across neurodevelopmental and psychiatric contexts.

artificial intelligence, machine learning, questionnaire, (13 more...)

arXiv.org Artificial Intelligence

2510.26808

Country: North America > United States > Texas (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.54)

Efficiency Without Cognitive Change: Evidence from Human Interaction with Narrow AI Systems

Benítez, María Angélica, Ceballos, Rocío Candela, Molina, Karina Del Valle, Araujo, Sofía Mundo, Villaroel, Sofía Evangelina Victorio, Justel, Nadia

arXiv.org Artificial Intelligence | Oct-30-2025

The growing integration of artificial intelligence (AI) into human cognition raises a fundamental question: does AI merely improve efficiency, or does it alter how we think? This study experimentally tested whether short-term exposure to narrow AI tools enhances core cognitive abilities or simply optimizes task performance. Thirty young adults completed standardized neuropsychological assessments embedded in a seven-week protocol with a four-week online intervention involving problem-solving and verbal comprehension tasks, either with or without AI support (ChatGPT). While AI-assisted participants completed several tasks faster and more accurately, no significant pre-post differences emerged in standardized measures of problem solving or verbal comprehension. These results demonstrate efficiency gains without cognitive change, suggesting that current narrow AI systems serve as cognitive scaffolds extending performance without transforming underlying mental capacities. The findings highlight the need for ethical and educational frameworks that promote critical and autonomous thinking in an increasingly AI-augmented cognitive ecology.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.24893

Country: Europe (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Efficient Ensemble Conditional Independence Test Framework for Causal Discovery

Guan, Zhengkang, Kuang, Kun

arXiv.org Machine Learning | Sep-26-2025

Constraint-based causal discovery relies on numerous conditional independence tests (CITs), but its practical applicability is severely constrained by the prohibitive computational cost, especially as CITs themselves have high time complexity with respect to the sample size. To address this key bottleneck, we introduce the Ensemble Conditional Independence Test (E-CIT), a general and plug-and-play framework. E-CIT operates on an intuitive divide-and-aggregate strategy: it partitions the data into subsets, applies a given base CIT independently to each subset, and aggregates the resulting p-values using a novel method grounded in the properties of stable distributions. This framework reduces the computational complexity of a base CIT to linear in the sample size when the subset size is fixed. Moreover, our tailored p-value combination method offers theoretical consistency guarantees under mild conditions on the subtests. Experimental results demonstrate that E-CIT not only significantly reduces the computational burden of CITs and causal discovery but also achieves competitive performance. Notably, it exhibits an improvement in complex testing scenarios, particularly on real-world datasets.
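The divide-and-aggregate idea described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the authors' aggregation rule, so this sketch swaps in the Cauchy combination test (the Cauchy distribution is one stable distribution) purely as a stand-in, and uses a Fisher z partial-correlation test with a single conditioning variable as a hypothetical base CIT. The function names, the subset size, and the synthetic data are all assumptions made for illustration, not the paper's implementation.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def fisher_z_cit(x, y, z):
    """Stand-in base CIT (assumption): p-value for 'X independent of Y given Z'
    using partial correlation and the Fisher z-transform, one conditioning variable."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    r = (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    r = max(-0.999999, min(0.999999, r))  # keep atanh finite
    stat = math.atanh(r) * math.sqrt(len(x) - 1 - 3)
    return math.erfc(abs(stat) / math.sqrt(2))  # two-sided normal p-value


def cauchy_combine(p_values):
    """Cauchy combination of p-values (a swapped-in aggregation rule,
    not necessarily the one proposed in the paper)."""
    t = sum(math.tan((0.5 - p) * math.pi) for p in p_values) / len(p_values)
    return max(0.0, min(1.0, 0.5 - math.atan(t) / math.pi))


def ensemble_cit(x, y, z, subset_size, base_cit=fisher_z_cit):
    """Partition the sample into consecutive subsets, test each, then aggregate.
    Leftover rows that do not fill a whole subset are dropped."""
    if subset_size > len(x):
        raise ValueError("subset_size exceeds the sample size")
    p_values = []
    for start in range(0, len(x) - subset_size + 1, subset_size):
        end = start + subset_size
        p_values.append(base_cit(x[start:end], y[start:end], z[start:end]))
    return cauchy_combine(p_values)


# Synthetic example: Y depends on X directly, so conditioning on Z
# should not remove the dependence and the combined p-value should be small.
rng = random.Random(0)
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [zi + rng.gauss(0, 1) for zi in z]
y_dep = [xi + rng.gauss(0, 1) for xi in x]
p_dep = ensemble_cit(x, y_dep, z, subset_size=200)
```

With a fixed subset size, the number of base-test calls grows linearly with the sample size, which is the source of the linear cost mentioned in the abstract.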

ensemble, stable distribution, type, (14 more...)

arXiv.org Machine Learning

2509.21021

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Washington > King County > Bellevue (0.04)
North America > Canada > Quebec > Montreal (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (0.72)
Research Report > New Finding (0.66)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)

Exploring the Potential of Large Language Models for Estimating the Reading Comprehension Question Difficulty

Jain, Yoshee, Hollander, John, He, Amber, Tang, Sunny, Zhang, Liang, Sabatini, John

arXiv.org Artificial Intelligence | Feb-24-2025

Reading comprehension is key to individual success, yet the assessment of question difficulty remains challenging due to the extensive human annotation and large-scale testing required by traditional methods such as linguistic analysis and Item Response Theory (IRT). While these robust approaches provide valuable insights, their scalability is limited. There is potential for Large Language Models (LLMs) to automate question difficulty estimation; however, this area remains underexplored. Our study investigates the effectiveness of LLMs, specifically OpenAI's GPT-4o and o1, in estimating the difficulty of reading comprehension questions using the Study Aid and Reading Assessment (SARA) dataset. We evaluated both the accuracy of the models in answering comprehension questions and their ability to classify difficulty levels as defined by IRT. The results indicate that, while the models yield difficulty estimates that align meaningfully with derived IRT parameters, there are notable differences in their sensitivity to extreme item characteristics. These findings suggest that LLMs can serve as a scalable method for automated difficulty assessment, particularly in dynamic interactions between learners and Adaptive Instructional Systems (AIS), bridging the gap between traditional psychometric techniques and modern AIS for reading comprehension and paving the way for more adaptive and personalized educational assessments. The manuscript has been accepted for presentation at the 27th International Conference on Human-Computer Interaction in Gothenburg, Sweden, from June 22-27, 2025.

assessment, language model, question difficulty, (12 more...)

arXiv.org Artificial Intelligence

2502.17785

Country:

Europe > Sweden > Vaestra Goetaland > Gothenburg (0.24)
North America > United States > Illinois > Champaign County > Urbana (0.14)
North America > United States > Tennessee > Shelby County > Memphis (0.05)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education > Assessment & Standards > Student Performance (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.57)

The Cognitive Capabilities of Generative AI: A Comparative Analysis with Human Benchmarks

Galatzer-Levy, Isaac R., Munday, David, McGiffin, Jed, Liu, Xin, Karmon, Danny, Labzovsky, Ilia, Moroshko, Rivka, Zait, Amir, McDuff, Daniel

arXiv.org Artificial Intelligence | Oct-9-2024

There is increasing interest in tracking the capabilities of general intelligence foundation models. This study benchmarks leading large language models and vision language models against human performance on the Wechsler Adult Intelligence Scale (WAIS-IV), a comprehensive, population-normed assessment of underlying human cognition and intellectual abilities, with a focus on the domains of Verbal Comprehension (VCI), Working Memory (WMI), and Perceptual Reasoning (PRI). Most models demonstrated exceptional capabilities in the storage, retrieval, and manipulation of tokens such as arbitrary sequences of letters and numbers, with performance on the Working Memory Index (WMI) greater than or equal to the 99.5th percentile when compared to human population normative ability. Performance on the Verbal Comprehension Index (VCI), which measures retrieval of acquired information and linguistic understanding of the meaning of words and their relationships to each other, was also consistently at or above the 98th percentile. Despite these broad strengths, we observed consistently poor performance on the Perceptual Reasoning Index (PRI; range 0.1-10th percentile) from multimodal models, indicating a profound inability to interpret and reason on visual information. Smaller and older model versions consistently performed worse, indicating that training data, parameter count, and advances in tuning are resulting in significant advances in cognitive ability.

cognitive capability, comparative analysis, generative ai, (12 more...)

arXiv.org Artificial Intelligence

2410.07391

Country:

North America > United States (0.28)
Asia (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.67)

Early Stopping Based on Repeated Significance

Bax, Eric, Sarkar, Arundhyoti, Shtoff, Alex

arXiv.org Artificial Intelligence | Aug-1-2024

For a bucket test with a single criterion for success and a fixed number of samples or testing period, requiring a $p$-value less than a specified value of $\alpha$ for the success criterion produces statistical confidence at level $1 - \alpha$. For multiple criteria, a Bonferroni correction that partitions $\alpha$ among the criteria produces statistical confidence, at the cost of requiring lower $p$-values for each criterion. The same concept can be applied to decisions about early stopping, but that can lead to strict requirements for $p$-values. We show how to address that challenge by requiring criteria to be successful at multiple decision points.
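The baseline this abstract starts from, a Bonferroni correction that partitions $\alpha$, can be sketched in a few lines. The sketch assumes an equal split of $\alpha$ across criteria and, when early stopping is allowed, across decision points as well; the function names are hypothetical. It shows only why the per-test threshold becomes strict, not the repeated-significance method the paper proposes.

```python
def bonferroni_threshold(alpha, n_criteria, n_decision_points=1):
    """Per-test p-value threshold when alpha is split equally (an assumption)
    across every criterion at every decision point."""
    if n_criteria < 1 or n_decision_points < 1:
        raise ValueError("counts must be positive")
    return alpha / (n_criteria * n_decision_points)


def all_criteria_pass(p_values, alpha, n_decision_points=1):
    """True if every criterion's p-value clears the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values), n_decision_points)
    return all(p < threshold for p in p_values)


# Two criteria, alpha = 0.05: a single final decision needs p < 0.025 each,
# while five possible stopping points tighten this to p < 0.005 each.
single_look = all_criteria_pass([0.01, 0.02], alpha=0.05)
five_looks = all_criteria_pass([0.01, 0.02], alpha=0.05, n_decision_points=5)
```

In this example the same p-values pass with one decision point and fail with five, which is the strictness the abstract refers to.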

criteria, decision point, probability, (16 more...)

arXiv.org Artificial Intelligence

2408.00908

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Unveiling the General Intelligence Factor in Language Models: A Psychometric Approach

Ilić, David

arXiv.org Artificial Intelligence | Dec-6-2023

This study uncovers the factor of general intelligence, or g, in language models, extending the psychometric theory traditionally applied to humans and certain animal species. Utilizing factor analysis on two extensive datasets - Open LLM Leaderboard with 1,232 models and General Language Understanding Evaluation (GLUE) Leaderboard with 88 models - we find compelling evidence for a unidimensional, highly stable g factor that accounts for 85% of the variance in model performance. The study also finds a moderate correlation of .49 between model size and g. The discovery of g in language models offers a unified metric for model evaluation and opens new avenues for more robust, g-based model ability assessment. These findings lay the foundation for understanding and future research on artificial general intelligence from a psychometric perspective and have practical implications for model evaluation and development.
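To make "a single factor that accounts for a share of the variance" concrete, here is a rough sketch that computes the share of variance carried by the first principal component of a correlation matrix, using power iteration. This is a simplification: principal components stand in for the factor analysis used in the study, and the 3x3 correlation matrix is invented for illustration rather than taken from either leaderboard.

```python
import math


def top_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric positive semi-definite matrix
    (such as a correlation matrix) by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n  # assumes the start vector is not orthogonal to the top eigenvector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            raise ValueError("matrix maps the iterate to zero")
        v = [c / norm for c in w]
        eigenvalue = norm  # for a unit vector v, |Av| converges to the top eigenvalue
    return eigenvalue


def first_component_share(corr):
    """Fraction of total variance on the first component.
    The trace of a correlation matrix equals the number of variables."""
    return top_eigenvalue(corr) / len(corr)


# Invented example: three benchmark scores that all correlate at 0.8.
corr = [
    [1.0, 0.8, 0.8],
    [0.8, 1.0, 0.8],
    [0.8, 0.8, 1.0],
]
share = first_component_share(corr)
```

For this made-up matrix the first component carries 2.6 of the 3 units of variance, so uniformly high correlations between tests translate directly into a dominant single dimension.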

intelligence, language model, subtest, (11 more...)

arXiv.org Artificial Intelligence

2310.11616

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.37)

Detection of developmental language disorder in Cypriot Greek children using a machine learning neural network algorithm

Georgiou, Georgios P., Theodorou, Elena

arXiv.org Artificial Intelligence | Nov-25-2023

Children with developmental language disorder (DLD) encounter difficulties in acquiring various language structures. Early identification and intervention are crucial to prevent negative long-term outcomes impacting the academic, social, and emotional development of children. The study aims to develop an automated method for the identification of DLD using artificial intelligence, specifically a neural network machine learning algorithm. This protocol is applied for the first time in a Cypriot Greek child population with DLD. The neural network model was trained using perceptual and production data elicited from 15 children with DLD and 15 healthy controls in the age range of 7;10 - 10;4. The k-fold technique was used to cross-validate the algorithm. The performance of the model was evaluated using metrics such as accuracy, precision, recall, F1 score, and ROC/AUC curve to assess its ability to make accurate predictions on a set of unseen data. The results demonstrated high classification values for all metrics, indicating the high accuracy of the neural model in classifying children with DLD. Additionally, the variable importance analysis revealed that the language production skills of children had a more significant impact on the performance of the model compared to perception skills. Machine learning paradigms provide effective discrimination between children with DLD and those with typical development (TD), with the potential to enhance clinical assessment and facilitate earlier and more efficient detection of the disorder.
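The evaluation procedure named in this abstract (k-fold cross-validation scored with accuracy, precision, recall, and F1) can be sketched as follows. The abstract does not state the value of k or how folds were formed, so k = 5 and the unshuffled interleaved split are assumptions, as are the function names and the toy labels. Only the sample size of 30 comes from the abstract.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once.
    Folds are interleaved and unshuffled, a simplification of common practice."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    for fold in range(k):
        test_idx = indices[fold::k]
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, test_idx


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 30 children as in the abstract; k = 5 is an assumed value.
folds = list(k_fold_indices(30, 5))

# Toy labels (invented): one positive case is missed by the predictions.
scores = binary_metrics([1, 1, 0, 0], [1, 0, 0, 0])
```

With only 30 participants each fold holds out 6 children, so per-fold metrics are coarse and are usually averaged across folds before being reported.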

algorithm, developmental language disorder, disorder, (11 more...)

arXiv.org Artificial Intelligence

2311.15054

Country:

Europe > Austria > Vienna (0.14)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
Europe > Middle East > Cyprus > Limassol > Limassol (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Thinking in PolAR Pictures: Using Rotation-Friendly Mental Images to Solve Leiter-R Form Completion

Palmer, Joshua H. (Vanderbilt University) | Kunda, Maithilee (Vanderbilt University)

AAAI Conferences | Feb-8-2018

The Leiter International Performance Scale-Revised (Leiter-R) is a standardized cognitive test that seeks to "provide a nonverbal measure of general intelligence by sampling a wide variety of functions from memory to nonverbal reasoning." Understanding the computational building blocks of nonverbal cognition, as measured by the Leiter-R, is an important step towards understanding human nonverbal cognition, especially with respect to typical and atypical trajectories of child development. One subtest of the Leiter-R, Form Completion, involves synthesizing and localizing a visual figure from its constituent slices. Form Completion poses an interesting nonverbal problem that seems to combine several aspects of visual memory, mental rotation, and visual search. We describe a new computational cognitive model that addresses Form Completion using a novel, mental-rotation-friendly image representation that we call the Polar Augmented Resolution (PolAR) Picture, which enables high-fidelity mental rotation operations. We present preliminary results using actual Leiter-R test items and discuss directions for future work.

artificial intelligence, simulation of human behavior, subslice, (18 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.93)
Education (0.93)

Technology: Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.48)

An Approach to Evaluate AI Commonsense Reasoning Systems

Ohlsson, Stellan (University of Illinois at Chicago) | Sloan, Robert H. (University of Illinois at Chicago) | Turan, Gyorgy (University of Szeged) | Uber, Daniel (University of Illinois at Chicago) | Urasky, Aaron (University of Illinois at Chicago)

AAAI Conferences | May-20-2012

We propose and give a preliminary test of a new metric for the quality of the commonsense knowledge and reasoning of large AI databases: using the same measurement as is used for a four-year-old, namely, an IQ test for young children. We report on results obtained using test questions we wrote in the spirit of the questions of the Wechsler Preschool and Primary Scale of Intelligence, Third Edition (WPPSI-III) on the ConceptNet system, which were, on the whole, quite strong.

conceptnet, subtest, wppsi-iii, (14 more...)

AAAI Conferences

Twenty-Fifth International FLAIRS Conference

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Hungary > Csongrád-Csanád County > Szeged (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
