AITopics

2510.17916

Country: North America > United States (0.93)

Genre:

Research Report (0.50)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Dakshit, Sagnik, Roy, Sushmita Sinha

Interpretability Framework for LLMs in Undergraduate Calculus

arXiv.org Artificial Intelligence, Oct-22-2025

Large Language Models (LLMs) are increasingly being used in education, yet their correctness alone does not capture the quality, reliability, or pedagogical validity of their problem-solving behavior, especially in mathematics, where multistep logic, symbolic reasoning, and conceptual clarity are critical. Conventional evaluation methods largely focus on final answer accuracy and overlook the reasoning process. To address this gap, we introduce a novel interpretability framework for analyzing LLM-generated solutions using undergraduate calculus problems as a representative domain. Our approach combines reasoning flow extraction, which decomposes solutions into semantically labeled operations and concepts, with prompt ablation analysis to assess input salience and output stability. Using structured metrics such as reasoning complexity, phrase sensitivity, and robustness, we evaluated model behavior on real Calculus I to III university exams. Our findings revealed that LLMs often produce syntactically fluent yet conceptually flawed solutions, with reasoning patterns sensitive to prompt phrasing and input variation. This framework enables fine-grained diagnosis of reasoning failures, supports curriculum alignment, and informs the design of interpretable AI-assisted feedback tools. This is the first study to offer a structured, quantitative, and pedagogically grounded framework for interpreting LLM reasoning in mathematics education, laying the foundation for the transparent and responsible deployment of AI in STEM learning environments.
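
As a rough illustration of prompt ablation analysis, the sketch below removes one phrase at a time from a prompt and scores how far the output moves from the unablated baseline. Everything here is an assumption for illustration: `toy_model` is a hypothetical stand-in for an LLM call, and the token-set Jaccard distance is only one possible sensitivity score, not the metric defined in the paper.

```python
from typing import Callable, Dict, List


def jaccard_distance(a: str, b: str) -> float:
    """One minus the overlap ratio of whitespace tokens; 0.0 means identical token sets."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def phrase_sensitivity(phrases: List[str], model: Callable[[str], str]) -> Dict[str, float]:
    """Ablate each phrase in turn and score how far the output moves from the baseline."""
    baseline = model(" ".join(phrases))
    scores = {}
    for i, phrase in enumerate(phrases):
        ablated = " ".join(p for j, p in enumerate(phrases) if j != i)
        scores[phrase] = jaccard_distance(baseline, model(ablated))
    return scores


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: its answer depends only on the word 'derivative'."""
    return "apply power rule" if "derivative" in prompt else "cannot determine task"


scores = phrase_sensitivity(
    ["Find the derivative", "of x^2", "with respect to x"], toy_model
)
for phrase, score in scores.items():
    print(f"{score:.2f}  {phrase}")
```

With this toy model, only the phrase containing "derivative" changes the output when removed, so it is the only phrase with a nonzero score.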

large language model, machine learning, natural language, (19 more...)

2510.1791

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (0.93)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Education > Educational Setting > Higher Education (0.93)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Taguchi, Chihiro, Sproat, Richard

IASC: Interactive Agentic System for ConLangs

arXiv.org Artificial Intelligence, Oct-22-2025

We present a system that uses LLMs as a tool in the development of Constructed Languages. The system is modular in that one first creates a target phonology for the language using an agentic approach that refines its output at each step with commentary feedback on its previous attempt. Next, a set of sentences is 'translated' from their English original into a morphosyntactic markup that reflects the word order and morphosyntactic feature specifications of the desired target language, with affixes represented as morphosyntactic feature bundles. From this translated corpus, a lexicon is constructed using the phonological model and the set of morphemes (stems and affixes) extracted from the 'translated' sentences. The system is then instructed to provide an orthography for the language, using an existing script such as Latin or Cyrillic. Finally, the system writes a brief grammatical handbook of the language. The system can also translate further sentences into the target language. Our goal is twofold. First, we hope that these tools will be fun to use for creating artificially constructed languages. Second, we are interested in exploring what LLMs 'know' about language: not what they know about any particular language or linguistic phenomenon, but how much they know about and understand language and linguistic concepts. As we shall see, there is a fairly wide gulf in capabilities both among different LLMs and among different linguistic specifications, with it being notably easier for systems to deal with more common patterns than rarer ones. An additional avenue that we explore is the application of our approach to translating from high-resource into low-resource languages. While the results so far are mostly negative, we provide some evidence that an improved version of the present system could afford some real gains in such tasks. https://github.com/SakanaAI/IASC

large language model, machine learning, natural language, (20 more...)

2510.07591

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Massachusetts (0.27)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Nouraei, Farnaz, Yong, Zhuorui, Bickmore, Timothy

HealthDial: A No-Code LLM-Assisted Dialogue Authoring Tool for Healthcare Virtual Agents

We introduce HealthDial, a dialogue authoring tool that helps healthcare providers and educators create virtual agents that deliver health education and counseling to patients over multiple conversations. HealthDial leverages large language models (LLMs) to automatically create an initial session-based plan and conversations for each session using text-based patient health education materials as input. Authored dialogue is output in the form of finite state machines for virtual agent delivery so that all content can be validated and no unsafe advice is provided resulting from LLM hallucinations. LLM-drafted dialogue structure and language can be edited by the author in a no-code user interface to ensure validity and optimize clarity and impact. We conducted a feasibility and usability study with counselors and students to test our approach with an authoring task for cancer screening education. Participants used HealthDial and then tested their resulting dialogue by interacting with a 3D-animated virtual agent delivering the dialogue. Through participants' evaluations of the task experience and final dialogues, we show that HealthDial provides a promising first step for counselors to ensure full coverage of their health education materials, while creating understandable and actionable virtual agent dialogue with patients.
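
To make the finite-state-machine idea concrete, here is a minimal sketch, assuming a dictionary-of-states representation. The `State` class, the `validate` and `run` helpers, and the example utterances are all hypothetical and are not HealthDial's actual format. The point it illustrates is that every utterance the agent can produce is fixed ahead of time, so the whole dialogue can be checked before delivery.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class State:
    """One dialogue state: a fixed agent utterance and the allowed user choices."""
    agent_says: str
    transitions: Dict[str, str] = field(default_factory=dict)  # user choice -> next state id


def validate(states: Dict[str, State], start: str) -> List[str]:
    """Return a list of problems: dangling transitions or unreachable states."""
    problems = []
    for sid, st in states.items():
        for choice, target in st.transitions.items():
            if target not in states:
                problems.append(f"{sid}: '{choice}' points to missing state '{target}'")
    # Depth-first walk from the start state to find everything reachable.
    seen, stack = set(), [start]
    while stack:
        sid = stack.pop()
        if sid in seen or sid not in states:
            continue
        seen.add(sid)
        stack.extend(states[sid].transitions.values())
    problems += [f"{sid}: unreachable" for sid in states if sid not in seen]
    return problems


def run(states: Dict[str, State], start: str, choices: List[str]) -> List[str]:
    """Replay a scripted sequence of user choices and collect the agent's utterances."""
    sid, transcript = start, []
    for choice in choices:
        transcript.append(states[sid].agent_says)
        sid = states[sid].transitions[choice]
    transcript.append(states[sid].agent_says)
    return transcript


# Hypothetical placeholder content, not taken from the paper.
states = {
    "greet": State("Hello. Would you like to review the screening material?",
                   {"yes": "info", "no": "bye"}),
    "info": State("Here is the approved screening material.", {"ok": "bye"}),
    "bye": State("Goodbye."),
}
problems = validate(states, "greet")
transcript = run(states, "greet", ["yes", "ok"])
```

Because the state graph is finite and explicit, a reviewer (or a check like `validate`) can inspect every reachable utterance, which is what allows content to be vetted before a patient sees it.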

healthdial, large language model, natural language, (17 more...)

2510.15898

Country:

Europe (0.67)
North America > United States (0.46)

Genre:

Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)
Instructional Material (1.00)
Research Report > New Finding (0.94)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Consumer Health (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

Han, Peixuan, Liu, Zijia, You, Jiaxuan

Large language models (LLMs) have shown promising potential in persuasion, but existing works on training LLM persuaders are still preliminary. Notably, while humans are skilled in modeling their opponent's thoughts and opinions proactively and dynamically, current LLMs struggle with such Theory of Mind (ToM) reasoning, resulting in limited diversity and opponent awareness. To address this limitation, we introduce Theory of Mind Augmented Persuader (ToMAP), a novel approach for building more flexible persuader agents by incorporating two theory of mind modules that enhance the persuader's awareness and analysis of the opponent's mental state. Specifically, we begin by prompting the persuader to consider possible objections to the target central claim, and then use a text encoder paired with a trained MLP classifier to predict the opponent's current stance on these counterclaims. Our carefully designed reinforcement learning schema enables the persuader to learn how to analyze opponent-related information and utilize it to generate more effective arguments. Experiments show that the ToMAP persuader, while containing only 3B parameters, outperforms much larger baselines, like GPT-4o, with a relative gain of 39.4% across multiple persuadee models and diverse corpora. Notably, ToMAP exhibits complex reasoning chains and reduced repetition during training, which leads to more diverse and effective arguments. The opponent-aware feature of ToMAP also makes it suitable for long conversations and enables it to employ more logical and opponent-aware strategies. These results underscore our method's effectiveness and highlight its potential for developing more persuasive language agents. Code is available at: https://github.com/ulab-uiuc/ToMAP.

large language model, machine learning, natural language, (17 more...)

2505.22961

Country: North America > United States (1.00)

Genre:

Research Report > Experimental Study (0.46)
Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)
Instructional Material > Course Syllabus & Notes (0.34)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Government (1.00)
Banking & Finance > Economy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Diaz-Bone, Leander, Bagatella, Marco, Hübotter, Jonas, Krause, Andreas

DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning

Sparse-reward reinforcement learning (RL) can model a wide range of highly complex tasks. Solving sparse-reward tasks is RL's core premise, requiring efficient exploration coupled with long-horizon credit assignment, and overcoming these challenges is key for building self-improving agents with superhuman ability. Prior work commonly explores with the objective of solving many sparse-reward tasks, making exploration of individual high-dimensional, long-horizon tasks intractable. We argue that solving such challenging tasks requires solving simpler tasks that are relevant to the target task, i.e., whose achievement will teach the agent skills required for solving the target task. We demonstrate that this sense of direction, necessary for effective exploration, can be extracted from existing RL algorithms, without leveraging any prior information. To this end, we propose a method for directed sparse-reward goal-conditioned very long-horizon RL (DISCOVER), which selects exploratory goals in the direction of the target task. We connect DISCOVER to principled exploration in bandits, formally bounding the time until the target task becomes achievable in terms of the agent's initial distance to the target, but independent of the volume of the space of all tasks. We then perform a thorough evaluation in high-dimensional environments. We find that the directed goal selection of DISCOVER solves exploration problems that are beyond the reach of prior state-of-the-art exploration methods in RL.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

2505.1985

Country: Europe > Switzerland (0.28)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.64)

Industry:

Leisure & Entertainment > Games (1.00)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Abramson, Corey M., Yuhan, Nian

The Cultural Mapping and Pattern Analysis (CMAP) Visualization Toolkit: Open Source Text Analysis for Qualitative and Computational Social Science

The CMAP (Cultural Mapping and Pattern Analysis) visualization toolkit is an open-source suite for analyzing and visualizing text data--from qualitative fieldnotes and in-depth interview transcripts to historical documents and web-scraped data such as message board posts or blogs. The toolkit is designed for scholars integrating pattern analysis, data visualization, and explanation in qualitative and/or computational social science (CSS). Despite the existence of off-the-shelf commercial qualitative data analysis software, there remains a shortage of highly scalable open-source options capable of handling large datasets and supporting advanced statistical and language modeling. The foundation of the toolkit is a pragmatic approach that aligns research tools with social science project goals--empirical explanation, theory-guided measurement, comparative design, or evidence-based recommendations--guided by the principle that research paradigms and questions should determine methods. Consequently, the CMAP visualization toolkit offers a wide range of possibilities through the adjustment of a relatively small number of parameters and allows seamless integration with other Python tools.

artificial intelligence, data mining, natural language, (17 more...)

2510.1614

Country: North America > United States > California > San Francisco County > San Francisco (0.28)

Genre:

Research Report (0.64)
Personal > Interview (0.54)
Instructional Material > Online (0.40)
Instructional Material > Course Syllabus & Notes (0.40)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.86)

Liow, Wei Ting, Khan, Sumbul, Ang, Lay Kee

Co-Designing Interdisciplinary Design Projects with AI

This work has been submitted to the IEEE for possible publication. ORCID: 0000-0003-2811-1194
Abstract: Creating interdisciplinary design projects is time-consuming and cognitively demanding for teachers, requiring curriculum alignment, cross-subject integration, and careful sequencing. This paper presents the Interdisciplinary Design Project Planner (IDPplanner), a GPT-based planning assistant grounded in Design Innovation principles, alignment with Singapore secondary school syllabuses, and 21st-century competencies. In a within-subject, counterbalanced workshop with 33 in-service teachers, participants produced two versions of the same project, one manual and one AI-assisted, followed by self- and peer-evaluations using a six-dimensional rubric. The AI-assisted version received higher scores for Curriculum Alignment, Design Thinking Application, and Coherence & Flow, with a marginal advantage for Assessment Strategies. Teacher reflections indicated that AI-assisted planning improved structure, sequencing, and idea generation, while contextualization to local syllabuses, class profiles, and student needs remained teacher-led. Contributions include (1) a purpose-built planning tool that organizes ideas into a ten-component flow with ready-to-adapt prompts, templates, and assessment suggestions; (2) an empirical, rubric-based comparison of planning quality; and (3) evidence that AI can function as a pedagogical planning partner. Recommendations emphasize hybrid teacher-AI workflows to enhance curriculum alignment and reduce planning complexity, and design suggestions for developers to strengthen contextual customization, iterative design support, and localized rubrics. Although instantiated with a Singapore-based curriculum, the planning flow and rubric are framework-agnostic and can be parameterized for other systems.
Interdisciplinary learning approaches have gained prominence globally, particularly as countries prioritize 21st-century competencies (21CC) such as creativity, problem-solving, collaboration, and adaptive thinking.

artificial intelligence, machine learning, natural language, (18 more...)

2510.16068

Country: Asia > Singapore (0.48)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Research Report > Experimental Study > Negative Result (0.68)

Industry:

Education > Educational Setting > Higher Education (1.00)
Education > Curriculum (1.00)
Education > Educational Setting > K-12 Education > Secondary School (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

FinFlowRL: An Imitation-Reinforcement Learning Framework for Adaptive Stochastic Control in Finance

Li, Yang, Chen, Zhi

Traditional stochastic control methods in finance struggle in real-world markets due to their reliance on simplifying assumptions and stylized frameworks. Such methods typically perform well in specific, well-defined environments but yield suboptimal results in changed, non-stationary ones. We introduce FinFlowRL, a novel framework for financial optimal stochastic control. The framework pretrains an adaptive meta-policy that learns from multiple expert strategies, then fine-tunes it through reinforcement learning in the noise space to optimize the generative process. By employing action chunking, that is, generating action sequences rather than single decisions, it addresses the non-Markovian nature of markets. FinFlowRL consistently outperforms individually optimized experts across diverse market conditions.
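
Action chunking can be sketched generically as follows: the policy is queried once per chunk and the whole chunk of actions is executed before the policy is queried again. This is only an illustration of the general idea under toy assumptions. `toy_policy`, `toy_step`, and the scalar state are hypothetical and have no connection to FinFlowRL's actual models or market dynamics.

```python
from typing import Callable, List, Tuple


def rollout_with_chunks(policy: Callable[[float], List[float]],
                        step: Callable[[float, float], float],
                        state: float, horizon: int) -> Tuple[float, int]:
    """Query the policy once per chunk, then execute the whole chunk before re-querying."""
    policy_calls, t = 0, 0
    while t < horizon:
        chunk = policy(state)
        policy_calls += 1
        for action in chunk:
            if t >= horizon:
                break  # truncate the last chunk if it overruns the horizon
            state = step(state, action)
            t += 1
    return state, policy_calls


def toy_policy(state: float, chunk_size: int = 4) -> List[float]:
    """Hypothetical policy: spread the move toward a target state of 0 over one chunk."""
    return [-state / chunk_size] * chunk_size


def toy_step(state: float, action: float) -> float:
    """Hypothetical deterministic dynamics: the action shifts the state directly."""
    return state + action


final_state, calls = rollout_with_chunks(toy_policy, toy_step, state=8.0, horizon=8)
print(final_state, calls)
```

Over a horizon of 8 steps with chunks of 4 actions, the policy is consulted twice rather than eight times; each decision commits to a short sequence instead of a single action.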

finflowrl, machine learning, reinforcement learning, (19 more...)

2510.15883

Country:

Europe (0.28)
North America > United States (0.28)

Genre:

Research Report > New Finding (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Girard, Samuel, Bibaut, Aurélien, Zenati, Houssam

Online Policy Learning via a Self-Normalized Maximal Inequality

arXiv.org Machine Learning, Oct-20-2025

Adaptive experiments produce dependent data that break i.i.d. assumptions that underlie classical concentration bounds and invalidate standard learning guarantees. In this paper, we develop a self-normalized maximal inequality for martingale empirical processes. Building on this, we first propose an adaptive sample-variance penalization procedure which balances empirical loss and sample variance, valid for general dependent data. Next, this allows us to derive a new variance-regularized pessimistic off-policy learning objective, for which we establish excess-risk guarantees. Subsequently, we show that, when combined with sequential updates and under standard complexity and margin conditions, the resulting estimator achieves fast convergence rates in both parametric and nonparametric regimes, improving over the usual $1/\sqrt{n}$ baseline. We complement our theoretical findings with numerical simulations that illustrate the practical gains of our approach.
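
The idea of balancing empirical loss against sample variance can be illustrated with the classical form of sample-variance penalization, which scores a candidate by its mean loss plus a multiple of sqrt(sample variance / n). The sketch below uses that classical form only; it is not the adaptive, martingale-valid procedure developed in the paper, and the candidate names, loss values, and penalty weight `lam` are made up for illustration.

```python
import math
from statistics import mean, variance
from typing import Dict, List


def svp_objective(losses: List[float], lam: float) -> float:
    """Empirical mean loss plus lam * sqrt(sample variance / n)."""
    n = len(losses)
    return mean(losses) + lam * math.sqrt(variance(losses) / n)


def select(candidates: Dict[str, List[float]], lam: float) -> str:
    """Pick the candidate with the smallest penalized objective."""
    return min(candidates, key=lambda name: svp_objective(candidates[name], lam))


# Made-up per-sample losses for two hypothetical candidates.
candidates = {
    "steady": [0.5, 0.5, 0.5, 0.5],   # mean 0.5, sample variance 0
    "erratic": [0.0, 0.0, 0.0, 1.6],  # mean 0.4, sample variance 0.64
}

print(select(candidates, lam=0.0))
print(select(candidates, lam=1.0))
```

With no penalty the lower-mean but high-variance candidate wins; with `lam=1.0` its objective rises to about 0.8, so the steadier candidate is preferred. This is the pessimism the penalty is meant to encode.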

artificial intelligence, data mining, machine learning, (18 more...)


2510.15483

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Sardinia (0.04)

Genre:

Research Report > New Finding (0.46)
Instructional Material > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.46)