Instructional Material
Empa: An AI-Powered Virtual Mentor for Developing Global Collaboration Skills in HPC Education
Ashish, null, Jaiswal, Aparajita, Vhaduri, Sudip, Nerella, Niveditha, Jha, Shubham
High-performance computing (HPC) and parallel computing increasingly rely on global collaboration among diverse teams, yet traditional computing curricula inadequately prepare students for cross-cultural teamwork essential in modern computational research environments. This paper presents Empa, an AI-powered virtual mentor that integrates intercultural collaboration training into undergraduate computing education. Built using large language models and deployed through a progressive web application, Empa guides students through structured activities covering cultural dimensions, communication styles, and conflict resolution that are critical for effective multicultural teamwork. Our system addresses the growing need for culturally competent HPC professionals by helping computing students develop skills to collaborate effectively in international research teams, contribute to global computational projects, and navigate the cultural complexities inherent in distributed computing environments. Pilot preparation for deployment in computing courses demonstrates the feasibility of AI-mediated intercultural training and provides insights into scalable approaches for developing intercultural collaboration skills essential for HPC workforce development.
Embedding Generative AI into Systems Analysis and Design Curriculum: Framework, Case Study, and Cross-Campus Empirical Evidence
Systems analysis students increasingly use Generative AI, yet current pedagogy lacks systematic approaches for teaching responsible AI orchestration that fosters critical thinking whilst meeting educational outcomes. Students risk accepting AI suggestions blindly or uncritically without assessing alignment with user needs or contextual appropriateness. SAGE (Structured AI-Guided Education) addresses this gap by embedding GenAI into curriculum design, training students when to accept, modify, or reject AI contributions. Implementation with 18 student groups across four Australian universities revealed how orchestration skills develop. Most groups (84\%) moved beyond passive acceptance, showing selective judgment, yet none proactively identified gaps overlooked by both human and AI analysis, indicating a competency ceiling. Students strong at explaining decisions also performed well at integrating sources, and those with deep domain understanding consistently considered accessibility considerations. Accessibility awareness proved fragile. When writing requirements, 85\% of groups explicitly considered elderly users and cultural needs. Notably, 55\% of groups struggled identifying when AI misclassified system boundaries (what belongs inside versus outside the system), 45\% missed data management errors (how information is stored and updated), and 55\% overlooked missing exception handling. Three implications emerge for educators: (i) require students to document why they accepted, modified, or rejected each AI suggestion, making reasoning explicit; (ii) embed accessibility prompts at each development stage because awareness collapses without continuous scaffolding; and (iii) have students create their own specifications before using AI, then compare versions, and anchor to research or standards to identify gaps.
Beyond Benchmarks: Responsible AI in Education Needs Learning Sciences
Learning sciences emerged in the 1980s from roots in cognitive science and AI, as well as other social sciences such as linguistics and anthropology. LS distinctively investigates the process of learning, in contrast to research on teaching, instructional design, policies, and institutions. LS also focuses on research in authentic educational settings, which contrasts with psychology or brain science research conducted in a laboratory. Learning scientists use and develop a variety of methods and theories, leading to a large body of knowledge. The field also has blurry lines with neighboring fields such as AI in education.
The Download: how to fix a tractor, and living among conspiracy theorists
You live in a house you designed and built yourself. You rely on the sun for power, heat your home with a woodstove, and farm your own fish and vegetables. This is the life of Marcin Jakubowski, the 53-year-old founder of Open Source Ecology, an open collaborative of engineers, producers, and builders developing what they call the Global Village Construction Set (GVCS). It's a set of 50 machines--everything from a tractor to an oven to a circuit maker--that are capable of building civilization from scratch and can be reconfigured however you see fit. It's all part of his ethos that life-changing technology should be available to all, not controlled by a select few. What it's like to find yourself in the middle of a conspiracy theory Last week, we held a subscribers-only Roundtables discussion exploring how to cope in this new age of conspiracy theories.
Causal Synthetic Data Generation in Recruitment
Iommi, Andrea, Mastropietro, Antonio, Guidotti, Riccardo, Monreale, Anna, Ruggieri, Salvatore
The importance of Synthetic Data Generation (SDG) has increased significantly in domains where data quality is poor or access is limited due to privacy and regulatory constraints. One such domain is recruitment, where publicly available datasets are scarce due to the sensitive nature of information typically found in curricula vitae, such as gender, disability status, or age. This lack of accessible, representative data presents a significant obstacle to the development of fair and transparent machine learning models, particularly ranking algorithms that require large volumes of data to effectively learn how to recommend candidates. In the absence of such data, these models are prone to poor generalisation and may fail to perform reliably in real-world scenarios. Recent advances in Causal Generative Models (CGMs) offer a promising solution. CGMs enable the generation of synthetic datasets that preserve the underlying causal relationships within the data, providing greater control over fairness and interpretability in the data generation process. In this study, we present a specialised SDG method involving two CGMs: one modelling job offers and the other modelling curricula. Each model is structured according to a causal graph informed by domain expertise. We use these models to generate synthetic datasets and evaluate the fairness of candidate rankings under controlled scenarios that introduce specific biases.
RubiSCoT: A Framework for AI-Supported Academic Assessment
Frรถhlich, Thorsten, Schlippe, Tim
The evaluation of academic theses is a cornerstone of higher education, ensuring rigor and integrity. Traditional methods, though effective, are time-consuming and subject to evaluator variability. This paper presents RubiSCoT, an AI-supported framework designed to enhance thesis evaluation from proposal to final submission. Using advanced natural language processing techniques, including large language models, retrieval-augmented generation, and structured chain-of-thought prompting, RubiSCoT offers a consistent, scalable solution. The framework includes preliminary assessments, multidimensional assessments, content extraction, rubric-based scoring, and detailed reporting. We present the design and implementation of RubiSCoT, discussing its potential to optimize academic assessment processes through consistent, scalable, and transparent evaluation.
Estimating Bidirectional Causal Effects with Large Scale Online Kernel Learning
In this study, a scalable online kernel learning framework is proposed for estimating bidirectional causal effects in systems characterized by mutual dependence and heteroskedasticity. Traditional causal inference often focuses on unidirectional effects, overlooking the common bidirectional relationships in real-world phenomena. Building on heteroskedasticity-based identification, the proposed method integrates a quasi-maximum likelihood estimator for simultaneous equation models with large scale online kernel learning. It employs random Fourier feature approximations to flexibly model nonlinear conditional means and variances, while an adaptive online gradient descent algorithm ensures computational efficiency for streaming and high-dimensional data. Results from extensive simulations demonstrate that the proposed method achieves superior accuracy and stability than single equation and polynomial approximation baselines, exhibiting lower bias and root mean squared error across various data-generating processes. These results confirm that the proposed approach effectively captures complex bidirectional causal effects with near-linear computational scaling. By combining econometric identification with modern machine learning techniques, the proposed framework offers a practical, scalable, and theoretically grounded solution for large scale causal inference in natural/social science, policy making, business, and industrial applications.
Parameter-Free Online Learning via Model Selection
Dylan J. Foster, Satyen Kale, Mehryar Mohri, Karthik Sridharan
Finally, we generalize these results by providing oracle inequalities for arbitrary non-linear classes in the online supervised learning model. These results are all derived through a unified meta-algorithm scheme using a novel "multi-scale" algorithm for prediction with expert advice based on random playout, which may be of