test session


Configurable multi-agent framework for scalable and realistic testing of LLM-based agents

Wang, Sai, Subramanian, Senthilnathan, Sahni, Mudit, Gone, Praneeth, Meng, Lingjie, Wang, Xiaochen, Bertoli, Nicolas Ferradas, Cheng, Tingxian, Xu, Jun

arXiv.org Artificial Intelligence

Large-language-model (LLM) agents exhibit complex, context-sensitive behaviour that quickly renders static benchmarks and ad-hoc manual testing obsolete. We present Neo, a configurable, multi-agent framework that automates realistic, multi-turn evaluation of LLM-based systems. Neo couples a Question Generation Agent and an Evaluation Agent through a shared context hub, allowing domain prompts, scenario controls and dynamic feedback to be composed modularly. Test inputs are sampled from a probabilistic state model spanning dialogue flow, user intent and emotional tone, enabling diverse, human-like conversations that adapt after every turn. Applied to a production-grade Seller Financial Assistant chatbot, Neo (i) uncovered edge-case failures across five attack categories with a 3.3% break rate, close to the 5.8% achieved by expert human red-teamers, and (ii) delivered 10-12x higher throughput, generating 180 coherent test questions in about 45 minutes versus 16 hours of human effort. Beyond security probing, Neo's stochastic policies balanced topic coverage and conversational depth, yielding broader behavioural exploration than manually crafted scripts. Neo therefore lays a foundation for scalable, self-evolving LLM QA: its agent interfaces, state controller and feedback loops are model-agnostic and extensible to richer factual-grounding and policy-compliance checks. We release the framework to facilitate reproducible, high-fidelity testing of emerging agentic systems.
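The per-turn sampling described above can be sketched as a small probabilistic state model. This is an illustrative reconstruction, not Neo's actual implementation: all state names, intents, tones and transition weights below are hypothetical.

```python
import random

# Hypothetical sketch of sampling test inputs from a probabilistic state
# model over dialogue flow, user intent and emotional tone. States and
# weights are invented for illustration; Neo's real model is richer.

FLOW = {  # dialogue state -> {next state: transition probability}
    "greet": {"ask": 0.7, "probe": 0.3},
    "ask":   {"probe": 0.4, "ask": 0.3, "end": 0.3},
    "probe": {"ask": 0.5, "end": 0.5},
}
INTENTS = ["account_info", "loan_terms", "fee_dispute"]
TONES = {"neutral": 0.6, "frustrated": 0.3, "urgent": 0.1}

def sample_dialogue(max_turns=10, rng=random):
    """Walk the flow model, emitting one (state, intent, tone) per turn."""
    state, turns = "greet", []
    while state != "end" and len(turns) < max_turns:
        tone = rng.choices(list(TONES), weights=TONES.values())[0]
        turns.append((state, rng.choice(INTENTS), tone))
        nxt = FLOW[state]
        state = rng.choices(list(nxt), weights=nxt.values())[0]
    return turns

turns = sample_dialogue()
```

Each sampled (state, intent, tone) triple would then condition the Question Generation Agent's next test question, so consecutive runs explore different conversational trajectories.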


AI-assisted Gaze Detection for Proctoring Online Exams

Shih, Yong-Siang, Zhao, Zach, Niu, Chenhao, Iberg, Bruce, Sharpnack, James, Baig, Mirza Basim

arXiv.org Artificial Intelligence

For high-stakes online exams, it is important to detect potential rule violations to ensure the security of the test. In this study, we investigate the task of detecting whether test takers are looking away from the screen, as such behavior could indicate that the test taker is consulting external resources. In asynchronous proctoring, exam videos are recorded and reviewed by proctors. However, when exams are long, it is tedious for proctors to watch entire videos to determine the exact moments when test takers look away. We present an AI-assisted gaze detection system that allows proctors to navigate between different video frames and discover frames where the test taker is looking in similar directions. The system enables proctors to work more effectively to identify suspicious moments in videos. An evaluation framework is proposed to compare the system against human-only and ML-only proctoring, and a user study is conducted to gather feedback from proctors, aiming to demonstrate the effectiveness of the system.
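The "frames where the test taker is looking in similar directions" navigation could be implemented as nearest-neighbor search over per-frame gaze vectors. A minimal sketch, assuming a gaze-estimation model has already produced a 2D direction vector per frame (the vectors and function names here are hypothetical):

```python
import math

# Hypothetical sketch: rank video frames by cosine similarity of their
# estimated gaze-direction vectors to a proctor-selected query frame.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def similar_frames(frames, query_idx, top_k=3):
    """Return indices of frames whose gaze direction best matches the query."""
    q = frames[query_idx]
    scored = [(i, cosine(q, v)) for i, v in enumerate(frames) if i != query_idx]
    scored.sort(key=lambda s: s[1], reverse=True)
    return [i for i, _ in scored[:top_k]]

# Toy gaze vectors: frames 0 and 2 look left; frames 1 and 3 face the screen.
frames = [(-1.0, 0.1), (0.0, 1.0), (-0.9, 0.2), (0.1, 0.9)]
```

Clicking a suspicious frame would then jump the proctor straight to other moments with a matching gaze direction, instead of scrubbing through the whole recording.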


Evaluating the capability of large language models to personalize science texts for diverse middle-school-age learners

Vaccaro, Michael Jr, Friday, Mikayla, Zaghi, Arash

arXiv.org Artificial Intelligence

Large language models (LLMs), including OpenAI's GPT-series, have made significant advancements in recent years. Known for their expertise across diverse subject areas and quick adaptability to user-provided prompts, LLMs hold unique potential as Personalized Learning (PL) tools. Despite this potential, their application in K-12 education remains largely unexplored. This paper presents one of the first randomized controlled trials (n = 23) to evaluate the effectiveness of GPT-4 in personalizing educational science texts for middle school students. In this study, GPT-4 was used to profile student learning preferences based on choices made during a training session. For the experimental group, GPT-4 rewrote science texts to align with each student's predicted profile, while texts for the control group were rewritten to contradict their learning preferences. The results of a Mann-Whitney U test showed that students significantly preferred (at the .10 level) the rewritten texts when they were aligned with their profile (p = .059). These findings suggest that GPT-4 can effectively interpret and tailor educational content to diverse learner preferences, marking a significant advancement in PL technology. The limitations of this study and ethical considerations for using artificial intelligence in education are also discussed.

Keywords: Large Language Models (LLMs), GPT-4, Personalized Learning, AI Generated Content (AIGC), Randomized Controlled Trial (RCT), K-12 Education

1 Introduction

In 2008, the National Academy of Engineering named advancements in Personalized Learning (PL) one of the fourteen grand challenges for the twenty-first century (National Academy of Engineering, 2008). Since then, PL has emerged as a prominent area of education research.
Through this work, PL has evolved into a broad term which now encompasses a vast number of interventions and programs (Shemshack and Spector, 2020; Walkington and Bernacki, 2020). The work presented in this paper aims to build on this existing research by investigating the potential of novel Large Language Models (LLMs) to foster highly adaptive PL environments.
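The study's one-sided Mann-Whitney U comparison can be reproduced in a few lines. The sketch below uses a normal approximation with midranks for ties and no tie correction; the ratings are fabricated for illustration, and only the test procedure mirrors the paper's analysis.

```python
import math

# Pure-Python Mann-Whitney U test (one-sided, normal approximation).
# Ties receive midranks; no tie correction is applied, so p-values with
# heavy ties are approximate.

def mann_whitney_u(x, y):
    combined = sorted((v, g) for g, vals in ((0, x), (1, y)) for v in vals)
    vals = [v for v, _ in combined]
    ranks, i = {}, 0
    while i < len(vals):  # assign midranks to tied blocks
        j = i
        while j < len(vals) and vals[j] == vals[i]:
            j += 1
        mid = (i + j + 1) / 2  # average of 1-based ranks i+1..j
        for k in range(i, j):
            ranks[k] = mid
        i = j
    r_x = sum(ranks[k] for k, (_, g) in enumerate(combined) if g == 0)
    n1, n2 = len(x), len(y)
    u = r_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p_one_sided = 0.5 * (1 - math.erf(z / math.sqrt(2)))  # P(U > u)
    return u, p_one_sided

aligned      = [5, 4, 4, 5, 3, 4, 5, 4]  # fabricated preference ratings
contradicted = [3, 2, 4, 3, 2, 3, 3, 4]
u, p = mann_whitney_u(aligned, contradicted)
```

With n = 23 and a .10 significance threshold, as in the study, a one-sided p of .059 would count as significant under this test.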


Human-in-the-Loop AI for Cheating Ring Detection

Shih, Yong-Siang, Liao, Manqian, Liu, Ruidong, Baig, Mirza Basim

arXiv.org Artificial Intelligence

Online exams have become popular in recent years due to their accessibility. However, concerns have been raised about the security of online exams, particularly regarding professional cheating services that help malicious test takers pass, forming so-called "cheating rings". In this paper, we introduce a human-in-the-loop AI cheating-ring detection system designed to detect and deter these rings. We outline the underlying logic of this human-in-the-loop AI system, exploring the design principles tailored to its objective of detecting cheaters. Moreover, we illustrate the methodologies used to evaluate its performance and fairness, aiming to mitigate the unintended risks associated with the AI system. The design and development of the system adhere to Responsible AI (RAI) standards, ensuring that ethical considerations are integrated throughout the entire development process.


Dynamic In-Context Learning from Nearest Neighbors for Bundle Generation

Sun, Zhu, Feng, Kaidong, Yang, Jie, Qu, Xinghua, Fang, Hui, Ong, Yew-Soon, Liu, Wenyuan

arXiv.org Artificial Intelligence

Product bundling has evolved into a crucial marketing strategy in e-commerce. However, current studies are limited to generating (1) fixed-size or single bundles, and most importantly, (2) bundles that do not reflect consistent user intents, thus being less intelligible or useful to users. This paper explores two interrelated tasks, i.e., personalized bundle generation and the underlying intent inference based on users' interactions in a session, leveraging the logical reasoning capability of large language models. We introduce a dynamic in-context learning paradigm, which enables ChatGPT to seek tailored and dynamic lessons from closely related sessions as demonstrations while performing tasks in the target session. Specifically, it first harnesses retrieval augmented generation to identify nearest neighbor sessions for each target session. Then, proper prompts are designed to guide ChatGPT to perform the two tasks on neighbor sessions. To enhance reliability and mitigate the hallucination issue, we develop (1) a self-correction strategy to foster mutual improvement in both tasks without supervision signals; and (2) an auto-feedback mechanism to recurrently offer dynamic supervision based on the distinct mistakes made by ChatGPT on various neighbor sessions. Thus, the target session can receive customized and dynamic lessons for improved performance by observing the demonstrations of its neighbor sessions. Finally, experimental results on three real-world datasets verify the effectiveness of our methods on both tasks. Additionally, the inferred intents can prove beneficial for other intriguing downstream tasks, such as crafting appealing bundle names.
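The retrieval step of this dynamic in-context learning paradigm can be sketched as nearest-neighbor search over session embeddings, with the retrieved sessions supplying demonstrations for the prompt. Everything below is hypothetical: the catalog, the bag-of-items embedding, and the prompt template stand in for the paper's retrieval-augmented setup.

```python
import math

# Illustrative retrieval for dynamic in-context learning: embed each
# session (here, a toy bag-of-items vector), find the k nearest neighbor
# sessions, and splice them into the prompt as demonstrations.

CATALOG = ["tent", "stove", "lantern", "novel", "bookmark", "mug"]

def embed(session):
    return [1.0 if item in session else 0.0 for item in CATALOG]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nearest_sessions(target, pool, k=2):
    """Return the k pool sessions most similar to the target session."""
    t = embed(target)
    return sorted(pool, key=lambda s: cosine(t, embed(s)), reverse=True)[:k]

pool = [["tent", "stove"], ["novel", "bookmark"], ["stove", "lantern"], ["mug"]]
neighbors = nearest_sessions(["tent", "lantern"], pool)
prompt = ("Demonstrations:\n"
          + "\n".join(str(s) for s in neighbors)
          + "\nTarget session: ['tent', 'lantern'] -> infer intent and bundle:")
```

In the paper's full pipeline, the LLM first solves the two tasks on these neighbor sessions, and its corrected outputs (via self-correction and auto-feedback) become the demonstrations for the target session.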