Instructional Material
Self-Evidencing Through Hierarchical Gradient Decomposition: A Dissipative System That Maintains Non-Equilibrium Steady-State by Minimizing Variational Free Energy
The Free Energy Principle (FEP) states that self-organizing systems must minimize variational free energy to persist (Friston, 2010, 2019), but the path from principle to implementable algorithm has remained unclear. We present a constructive proof that the FEP can be realized through exact local credit assignment. The system decomposes gradient computation hierarchically: spatial credit via feedback alignment, temporal credit via eligibility traces, and structural credit via a Trophic Field Map (TFM) that estimates expected gradient magnitude for each connection block. We prove these mechanisms are exact at their respective levels and validate the central claim empirically: the TFM achieves 0.9693 Pearson correlation with oracle gradients. This exactness produces emergent capabilities including 98.6% retention after task interference, autonomous recovery from 75% structural damage, self-organized criticality (spectral radius ρ 1.0), and sample-efficient reinforcement learning on continuous control tasks without replay buffers. The architecture unifies Pri-gogine's dissipative structures (Prigogine, 1977), Fris-ton's free energy minimization (Friston, 2010), and Hopfield's attractor dynamics (Hopfield, 1982; Amit et al., 1985a,b), demonstrating that exact hierarchical inference over network topology can be implemented with local, biologically plausible rules.
Interpretability Framework for LLMs in Undergraduate Calculus
Dakshit, Sagnik, Roy, Sushmita Sinha
Large Language Models (LLMs) are increasingly being used in education, yet their correctness alone does not capture the quality, reliability, or pedagogical validity of their problem-solving behavior, especially in mathematics, where multistep logic, symbolic reasoning, and conceptual clarity are critical. Conventional evaluation methods largely focus on final answer accuracy and overlook the reasoning process. To address this gap, we introduce a novel interpretability framework for analyzing LLM-generated solutions using undergraduate calculus problems as a representative domain. Our approach combines reasoning flow extraction and decomposing solutions into semantically labeled operations and concepts with prompt ablation analysis to assess input salience and output stability. Using structured metrics such as reasoning complexity, phrase sensitivity, and robustness, we evaluated the model behavior on real Calculus I to III university exams. Our findings revealed that LLMs often produce syntactically fluent yet conceptually flawed solutions, with reasoning patterns sensitive to prompt phrasing and input variation. This framework enables fine-grained diagnosis of reasoning failures, supports curriculum alignment, and informs the design of interpretable AI-assisted feedback tools. This is the first study to offer a structured, quantitative, and pedagogically grounded framework for interpreting LLM reasoning in mathematics education, laying the foundation for the transparent and responsible deployment of AI in STEM learning environments.
IASC: Interactive Agentic System for ConLangs
Taguchi, Chihiro, Sproat, Richard
We present a system that uses LLMs as a tool in the development of Constructed Languages. The system is modular in that one first creates a target phonology for the language using an agentic approach that refines its output at each step with commentary feedback on its previous attempt. Next, a set of sentences is 'translated' from their English original into a morphosyntactic markup that reflects the word order and morphosyntactic feature specifications of the desired target language, with affixes represented as morphosyntactic feature bundles. From this translated corpus, a lexicon is constructed using the phonological model and the set of morphemes (stems and affixes) extracted from the 'translated' sentences. The system is then instructed to provide an orthography for the language, using an existing script such as Latin or Cyrillic. Finally, the system writes a brief grammatical handbook of the language. The system can also translate further sentences into the target language. Our goal is twofold. First, we hope that these tools will be fun to use for creating artificially constructed languages. Second, we are interested in exploring what LLMs 'know' about language-not what they know about any particular language or linguistic phenomenon, but how much they know about and understand language and linguistic concepts. As we shall see, there is a fairly wide gulf in capabilities both among different LLMs and among different linguistic specifications, with it being notably easier for systems to deal with more common patterns than rarer ones. An additional avenue that we explore is the application of our approach to translating from high-resource into low-resource languages. While the results so far are mostly negative, we provide some evidence that an improved version of the present system could afford some real gains in such tasks. https://github.com/SakanaAI/IASC
HealthDial: A No-Code LLM-Assisted Dialogue Authoring Tool for Healthcare Virtual Agents
Nouraei, Farnaz, Yong, Zhuorui, Bickmore, Timothy
We introduce HealthDial, a dialogue authoring tool that helps healthcare providers and educators create virtual agents that deliver health education and counseling to patients over multiple conversations. HealthDial leverages large language models (LLMs) to automatically create an initial session-based plan and conversations for each session using text-based patient health education materials as input. Authored dialogue is output in the form of finite state machines for virtual agent delivery so that all content can be validated and no unsafe advice is provided resulting from LLM hallucinations. LLM-drafted dialogue structure and language can be edited by the author in a no-code user interface to ensure validity and optimize clarity and impact. We conducted a feasibility and usability study with counselors and students to test our approach with an authoring task for cancer screening education. Participants used HealthDial and then tested their resulting dialogue by interacting with a 3D-animated virtual agent delivering the dialogue. Through participants' evaluations of the task experience and final dialogues, we show that HealthDial provides a promising first step for counselors to ensure full coverage of their health education materials, while creating understandable and actionable virtual agent dialogue with patients.
ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind
Han, Peixuan, Liu, Zijia, You, Jiaxuan
Large language models (LLMs) have shown promising potential in persuasion, but existing works on training LLM persuaders are still preliminary. Notably, while humans are skilled in modeling their opponent's thoughts and opinions proactively and dynamically, current LLMs struggle with such Theory of Mind (ToM) reasoning, resulting in limited diversity and opponent awareness. To address this limitation, we introduce Theory of Mind Augmented Persuader (ToMAP), a novel approach for building more flexible persuader agents by incorporating two theory of mind modules that enhance the persuader's awareness and analysis of the opponent's mental state. Specifically, we begin by prompting the persuader to consider possible objections to the target central claim, and then use a text encoder paired with a trained MLP classifier to predict the opponent's current stance on these counterclaims. Our carefully designed reinforcement learning schema enables the persuader learns how to analyze opponent-related information and utilize it to generate more effective arguments. Experiments show that the ToMAP persuader, while containing only 3B parameters, outperforms much larger baselines, like GPT-4o, with a relative gain of 39.4% across multiple persuadee models and diverse corpora. Notably, ToMAP exhibits complex reasoning chains and reduced repetition during training, which leads to more diverse and effective arguments. The opponent-aware feature of ToMAP also makes it suitable for long conversations and enables it to employ more logical and opponent-aware strategies. These results underscore our method's effectiveness and highlight its potential for developing more persuasive language agents. Code is available at: https://github.com/ulab-uiuc/ToMAP.
DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
Diaz-Bone, Leander, Bagatella, Marco, Hübotter, Jonas, Krause, Andreas
Sparse-reward reinforcement learning (RL) can model a wide range of highly complex tasks. Solving sparse-reward tasks is RL's core premise, requiring efficient exploration coupled with long-horizon credit assignment, and overcoming these challenges is key for building self-improving agents with superhuman ability. Prior work commonly explores with the objective of solving many sparse-reward tasks, making exploration of individual high-dimensional, long-horizon tasks intractable. We argue that solving such challenging tasks requires solving simpler tasks that are relevant to the target task, i.e., whose achieval will teach the agent skills required for solving the target task. We demonstrate that this sense of direction, necessary for effective exploration, can be extracted from existing RL algorithms, without leveraging any prior information. To this end, we propose a method for directed sparse-reward goal-conditioned very long-horizon RL (DISCOVER), which selects exploratory goals in the direction of the target task. We connect DISCOVER to principled exploration in bandits, formally bounding the time until the target task becomes achievable in terms of the agent's initial distance to the target, but independent of the volume of the space of all tasks. We then perform a thorough evaluation in high-dimensional environments. We find that the directed goal selection of DISCOVER solves exploration problems that are beyond the reach of prior state-of-the-art exploration methods in RL.
The Cultural Mapping and Pattern Analysis (CMAP) Visualization Toolkit: Open Source Text Analysis for Qualitative and Computational Social Science
Abramson, Corey M., Yuhan, null, Nian, null
The CMAP (Cultural Mapping and Pattern Analysis) visualization toolkit is an open-source suite for analyzing and visualizing text data--from qualitative fieldnotes and in-depth interview transcripts to historical documents and web-scraped data such as message board posts or blogs. The toolkit is designed for scholars integrating pattern analysis, data visualization, and explanation in qualitative and/or computational social science (CSS). Despite the existence of off-the-shelf commercial qualitative data analysis software, there remains a shortage of highly scalable open-source options capable of handling large datasets and supporting advanced statistical and language modeling. The foundation of the toolkit is a pragmatic approach that aligns research tools with social science project goals--empirical explanation, theory-guided measurement, comparative design, or evidence-based recommendations--guided by the principle that research paradigms and questions should determine methods. Consequently, the CMAP visualization toolkit offers a wide range of possibilities through the adjustment of a relatively small number of parameters and allows seamless integration with other Python tools.
Co-Designing Interdisciplinary Design Projects with AI
Liow, Wei Ting, Khan, Sumbul, Ang, Lay Kee
T his work has been submitted to the IEEE for possible publication. ORCID: 0000 -0003-2811-1194 Abstract --Creating interdisciplinary design projects is time-consuming and cognitively demanding for teachers, requiring curriculum alignment, cross -subject integration, and careful sequencing. This paper presents the Interdisciplinary Design Project Planner (IDPplanner), a GPT -based planning assistant grounded in Design Innovation principles, al ignment with Singapore secondary school's syllabuses, and 21st -century competencies. In a within -subject, counterbalanced workshop with 33 in -service teachers, participants produced two versions of the same project: manual and AI -assisted, followed by self - and peer-evaluations using a six -dimensional rubric. AI -assisted version received higher scores for Curriculum Alignment, Design Thinking Application, and Coherence & Flow, with a marginal advantage for Assessment Strategies. Teacher reflections indicated that AI -assisted planning improved structure, sequencing, and idea generation, while contextualization to local syllabuses, class profiles, and student needs remained teacher-led. Contributions include (1) a purpose-built planning tool that organizes ideas into a ten - component flow with ready-to -adapt prompts, templates, and assessment suggestions; (2) an empirical, rubric -based comparison of plan ning quality; and (3) evidence that AI can function as a pedagogical planning partner . Recommendations emphasize hybrid teacher-AI workflows to enhance curriculum alignment and reduce planning complexity, and design suggestions for developers to strengthen contextual customization, iterative design support, and l ocalized rubrics. Although instantiated with a Singapore -based curriculum, the planning flow and rubric are framework -agnostic and can be parameterized for other systems. Interdisciplinary learning approaches have gained prominence globally, particularly as countries prioritize 21st-century competencies (21CC) such as creativity, problem - solving, collaboration, and adaptive thinking.
FinFlowRL: An Imitation-Reinforcement Learning Framework for Adaptive Stochastic Control in Finance
Traditional stochastic control methods in finance struggle in real world markets due to their reliance on simplifying assumptions and stylized frameworks. Such methods typically perform well in specific, well defined environments but yield suboptimal results in changed, non stationary ones. We introduce FinFlowRL, a novel framework for financial optimal stochastic control. The framework pretrains an adaptive meta policy learning from multiple expert strategies, then finetunes through reinforcement learning in the noise space to optimize the generative process. By employing action chunking generating action sequences rather than single decisions, it addresses the non Markovian nature of markets. FinFlowRL consistently outperforms individually optimized experts across diverse market conditions.
Online Policy Learning via a Self-Normalized Maximal Inequality
Girard, Samuel, Bibaut, Aurélien, Zenati, Houssam
Adaptive experiments produce dependent data that break i.i.d. assumptions that underlie classical concentration bounds and invalidate standard learning guarantees. In this paper, we develop a self-normalized maximal inequality for martingale empirical processes. Building on this, we first propose an adaptive sample-variance penalization procedure which balances empirical loss and sample variance, valid for general dependent data. Next, this allows us to derive a new variance-regularized pessimistic off-policy learning objective, for which we establish excess-risk guarantees. Subsequently, we show that, when combined with sequential updates and under standard complexity and margin conditions, the resulting estimator achieves fast convergence rates in both parametric and nonparametric regimes, improving over the usual $1/\sqrt{n}$ baseline. We complement our theoretical findings with numerical simulations that illustrate the practical gains of our approach.