Higher Education


Supervised Fine-Tuning LLMs to Behave as Pedagogical Agents in Programming Education

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly being explored in higher education, yet their effectiveness as teaching agents remains underexamined. In this paper, we present the development of GuideLM, a fine-tuned LLM designed for programming education. GuideLM has been integrated into the Debugging C Compiler (DCC), an educational C compiler that leverages LLMs to generate pedagogically sound error explanations. Previously, DCC relied on off-the-shelf OpenAI models, which, while accurate, often over-assisted students by directly providing solutions despite contrary prompting. To address this, we employed supervised fine-tuning (SFT) on a dataset of 528 student-question/teacher-answer pairs, creating two models: GuideLM and GuideLM-mini, fine-tuned on ChatGPT-4o and 4o-mini, respectively. We conducted an expert analysis of 400 responses per model, comparing their pedagogical effectiveness against base OpenAI models. Our evaluation, grounded in constructivism and cognitive load theory, assessed factors such as conceptual scaffolding, clarity, and Socratic guidance. Results indicate that GuideLM and GuideLM-mini improve pedagogical performance, with an 8% increase in Socratic guidance and a 58% improvement in economy of words compared to GPT-4o. However, this refinement comes at the cost of a slight reduction in general accuracy. While further work is needed, our findings suggest that fine-tuning LLMs with targeted datasets is a promising approach for developing models better suited to educational contexts.
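The supervised fine-tuning step described above maps onto OpenAI's standard chat fine-tuning workflow. Below is a minimal Python sketch of that workflow; the JSONL file name, system prompt, and example dialogue are illustrative assumptions rather than details from the paper.

```python
# Minimal sketch of supervised fine-tuning via the OpenAI API, assuming the
# 528 student-question/teacher-answer pairs are exported to a JSONL file.
# The file name, system prompt, and dialogue below are hypothetical.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One training example per line, in OpenAI's chat fine-tuning format.
example = {
    "messages": [
        {"role": "system",
         "content": "You are a pedagogical C tutor. Guide the student with "
                    "questions and hints; never hand over the full solution."},
        {"role": "user",
         "content": "Why does my program segfault on line 12?"},
        {"role": "assistant",
         "content": "What does the pointer on that line point to the first "
                    "time the loop body runs?"},
    ]
}
with open("guidelm_pairs.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")  # repeated for all 528 pairs

# Upload the dataset and launch the fine-tuning job.
training_file = client.files.create(
    file=open("guidelm_pairs.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # the paper also fine-tunes gpt-4o
)
print(job.id, job.status)
```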


Artificial Intelligence in Sports: Insights from a Quantitative Survey among Sports Students in Germany about their Perceptions, Expectations, and Concerns regarding the Use of AI Tools

arXiv.org Artificial Intelligence

Generative Artificial Intelligence (AI) tools such as ChatGPT, Copilot, or Gemini have a crucial impact on academic research and teaching. Empirical data on how students perceive the increasing influence of AI, which types of tools they use, what they expect from them in their daily academic tasks, and what concerns they have about the use of AI in their studies are still limited. The manuscript presents findings from a quantitative survey conducted among sports students of all semesters in Germany using an online questionnaire. It explores aspects such as students' usage behavior, motivational factors, and uncertainties regarding the impact of AI tools on academia in the future. Furthermore, the social climate in sports studies is investigated to provide a general overview of the current situation of students in Germany. Data collection took place between August and November 2023, addressing all sports departments at German universities, with a total of 262 students participating. Our findings indicate that students have a strong interest in using AI tools in their studies, expecting them to improve their overall academic performance, help them understand the complexity of scientific approaches, and save time. They express confidence that the proliferation of AI will not compromise their critical thinking skills. Moreover, students are positive about integrating more AI-related topics into the curriculum and about lecturers adopting more AI-based teaching methods. However, our findings also show that students have concerns about plagiarism, lecturer preparedness, and their own skills and future skill development.


Nearly all UK undergrads use AI in their studies, according to a new report

Engadget

Apparently almost all undergraduate students are using AI now, in one way or another. A new report from the UK's Higher Education Policy Institute (HEPI) found that 92 percent of students have used generative AI tools, such as ChatGPT, for their studies, and 88 percent have used them for assessments. These numbers are a tremendous increase from HEPI's February 2024 report, in which 66 percent and 53 percent of participants, respectively, reported such use. The top reasons students gave for using AI include saving time, improving the quality of their work and getting instant support.


UK universities warned to 'stress-test' assessments as 92% of students use AI

The Guardian

British universities have been warned to "stress-test" all assessments after new research revealed "almost all" undergraduates are using generative artificial intelligence (genAI) in their studies. A survey of 1,000 students – both domestic and international – found there had been an "explosive increase" in the use of genAI in the past 12 months. Almost nine out of 10 (88%) in the 2025 poll said they used tools such as ChatGPT for their assessments, up from 53% last year. The proportion using any AI tool surged from 66% in 2024 to 92% in 2025, meaning just 8% of students are not using AI, according to a report published by the Higher Education Policy Institute and Kortext, a digital etextbook provider. Josh Freeman, the report's author, said such dramatic changes in behaviour in just 12 months were almost unheard of, and warned: "Universities should take heed: generative AI is here to stay." "There are urgent lessons here for institutions," Freeman said. "Every assessment must be reviewed in case it can be completed easily using AI."


Cognitive networks highlight differences and similarities in the STEM mindsets of human and LLM-simulated trainees, experts and academics

arXiv.org Artificial Intelligence

Understanding attitudes towards STEM means quantifying the cognitive and emotional ways in which individuals, and potentially large language models too, conceptualise such subjects. This study uses behavioural forma mentis networks (BFMNs) to investigate the STEM-focused mindset, i.e. the ways of associating and perceiving ideas, of 177 human participants and 177 artificial humans simulated by GPT-3.5. Participants were split into three groups - trainees, experts and academics - to compare the influence of expertise level on their mindset. The results revealed that human forma mentis networks exhibited significantly higher clustering coefficients than GPT-3.5's, indicating that human mindsets displayed a tendency to form and close triads of conceptual associations while recollecting STEM ideas. Human experts, in particular, demonstrated robust clustering coefficients, reflecting better integration of STEM concepts into their cognitive networks. In contrast, GPT-3.5 produced sparser mindsets. Furthermore, both human and GPT mindsets framed mathematics in neutral or positive terms, unlike the STEM high schoolers, researchers and other large language models sampled in other works. This research contributes to understanding how mindset structure can provide cognitive insights about memory structure and machine limitations.
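The clustering-coefficient comparison at the core of the study can be reproduced on any association network. A minimal sketch using networkx follows; the two toy association lists are invented for illustration and are not the study's data.

```python
# Sketch: compare average clustering of two forma mentis networks.
# The association lists are toy examples, not the study's data.
import networkx as nx

human_assoc = [("math", "logic"), ("logic", "proof"), ("proof", "math"),
               ("math", "physics"), ("physics", "experiment")]
llm_assoc = [("math", "logic"), ("math", "proof"),
             ("math", "physics"), ("math", "experiment")]  # star: no triads

human_net = nx.Graph(human_assoc)
llm_net = nx.Graph(llm_assoc)

# Average clustering measures how often two neighbours of a concept are
# themselves associated, i.e. how many conceptual triads close.
print(nx.average_clustering(human_net))  # > 0: math-logic-proof closes a triad
print(nx.average_clustering(llm_net))    # 0.0: a pure star closes none
```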


Scaffolding Empathy: Training Counselors with Simulated Patients and Utterance-level Performance Visualizations

arXiv.org Artificial Intelligence

Learning therapeutic counseling involves significant role-play experience with mock patients, with current manual training methods providing only intermittent granular feedback. We seek to accelerate and optimize counselor training by providing frequent, detailed feedback to trainees as they interact with a simulated patient. Our first application domain involves training motivational interviewing skills for counselors. Motivational interviewing is a collaborative counseling style in which patients are guided to talk about changing their behavior, with empathetic counseling an essential ingredient. We developed and evaluated an LLM-powered training system that features a simulated patient and visualizations of turn-by-turn performance feedback tailored to the needs of counselors learning motivational interviewing. We conducted an evaluation study with professional and student counselors, demonstrating high usability and satisfaction with the system. We present design implications for the development of automated systems that train users in counseling skills and their generalizability to other types of social skills training.
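The architecture implied here, a simulated patient plus an utterance-level scorer, can be outlined in a short loop. The sketch below is a hypothetical reconstruction: the motivational-interviewing categories, the placeholder patient reply, and the toy open-question heuristic all stand in for the paper's LLM components.

```python
# Hypothetical outline of the trainer's turn loop: the trainee speaks,
# an LLM-backed patient replies, and each utterance is scored to drive
# the turn-by-turn feedback visualization. LLM calls are stubbed out.
from dataclasses import dataclass, field

@dataclass
class TurnFeedback:
    utterance: str
    category: str   # e.g. "open question" or "reflection" (assumed labels)
    empathy: float  # 0..1, higher reads as more empathetic

@dataclass
class TrainingSession:
    history: list = field(default_factory=list)

    def patient_reply(self, utterance: str) -> str:
        # Stand-in for an LLM call that role-plays the patient,
        # conditioned on a persona prompt and self.history.
        return "I know I should cut back, but it helps me relax after work."

    def score_turn(self, utterance: str) -> TurnFeedback:
        # Stand-in for an LLM or classifier labelling the utterance;
        # here a toy heuristic marks question-shaped turns as open questions.
        is_open = (utterance.strip().endswith("?")
                   and utterance.lower().startswith(("what", "how", "tell me")))
        return TurnFeedback(utterance,
                            "open question" if is_open else "reflection",
                            empathy=0.7)

session = TrainingSession()
trainee = "What does drinking do for you on a stressful day?"
feedback = session.score_turn(trainee)   # feeds the per-utterance visualization
reply = session.patient_reply(trainee)
session.history += [("counselor", trainee), ("patient", reply)]
print(feedback.category, feedback.empathy)
```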


Rapidly Built Medical Crash Cart! Lessons Learned and Impacts on High-Stakes Team Collaboration in the Emergency Room

arXiv.org Artificial Intelligence

Designing robots to support high-stakes teamwork in emergency settings presents unique challenges, including seamless integration into fast-paced environments, facilitating effective communication among team members, and adapting to rapidly changing situations. While teleoperated robots have been successfully used in high-stakes domains such as firefighting and space exploration, autonomous robots that aid high-stakes teamwork remain underexplored. To address this gap, we conducted a rapid prototyping process to develop a series of seemingly autonomous robots designed to assist clinical teams in the Emergency Room. We transformed a standard crash cart, which stores medical equipment and emergency supplies, into a medical robotic crash cart (MCCR). The MCCR was evaluated through field deployments to assess its impact on team workload and usability; we identified taxonomies of failure and refined the MCCR in collaboration with healthcare professionals. By publicly disseminating our MCCR tutorial, we hope to encourage HRI researchers to explore the design of robots for high-stakes teamwork. Teleoperated robots have become indispensable tools for action teams: highly skilled specialist teams that collaborate in short, high-pressure events, requiring improvisation in unpredictable situations [1]. For example, disaster response teams rely on teleoperated robots and drones to aid search and rescue operations [2], [3]. High-stakes military and SWAT teams use teleoperated ordnance disposal [4] and surveillance robots [5] to keep the teams safe. Surgical teams employ teleoperated robots to perform keyhole surgeries with a level of precision that would be unimaginable without these machines [6], [7]. We built three teleoperated medical crash cart robots (MCCRs). MCCR 1 delivers supplies using a hoverboard circuit. MCCR 2 delivers supplies and recommends them via drawer-opening capabilities, and was deployed at a medical training event that revealed further insights.


MAFE: Multi-Agent Fair Environments for Decision-Making Systems

arXiv.org Artificial Intelligence

Fairness constraints applied to machine learning (ML) models in static contexts have been shown to potentially produce adverse outcomes among demographic groups over time. To address this issue, emerging research focuses on creating fair solutions that persist over time. While many approaches treat this as a single-agent decision-making problem, real-world systems often consist of multiple interacting entities that influence outcomes. Explicitly modeling these entities as agents enables more flexible analysis of their interventions and the effects they have on a system's underlying dynamics. A significant challenge in conducting research on multi-agent systems is the lack of realistic environments that leverage the limited real-world data available for analysis. To address this gap, we introduce the concept of a Multi-Agent Fair Environment (MAFE) and present and analyze three MAFEs that model distinct social systems. Experimental results demonstrate the utility of our MAFEs as testbeds for developing multi-agent fair algorithms.
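The notion of a MAFE, an environment in which several agents' decisions jointly shape group outcomes over time, can be made concrete with a small simulation loop. The lending scenario, agent roles, and demographic-parity metric below are invented for illustration; they are not one of the paper's three environments.

```python
# Toy multi-agent fair environment: a bank sets approval thresholds while a
# regulator subsidises access, and the environment tracks per-group outcomes.
# Scenario and metric are illustrative, not the paper's environments.
import random

GROUPS = ("A", "B")

class LendingMAFE:
    def __init__(self) -> None:
        self.approved = {g: 0 for g in GROUPS}
        self.seen = {g: 0 for g in GROUPS}

    def step(self, bank_threshold: dict, subsidy: dict) -> None:
        group = random.choice(GROUPS)             # an applicant arrives
        score = random.random() + subsidy[group]  # regulator shifts access
        self.seen[group] += 1
        if score >= bank_threshold[group]:        # bank's approval policy
            self.approved[group] += 1

    def parity_gap(self) -> float:
        # Demographic-parity gap: difference in group approval rates.
        rates = [self.approved[g] / max(self.seen[g], 1) for g in GROUPS]
        return abs(rates[0] - rates[1])

env = LendingMAFE()
for _ in range(10_000):
    # Both policies are fixed here; in a MAFE they would be learned agents.
    env.step(bank_threshold={"A": 0.6, "B": 0.6}, subsidy={"A": 0.0, "B": 0.1})
print(round(env.parity_gap(), 3))
```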


Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents

arXiv.org Artificial Intelligence

Conversational agents are increasingly woven into individuals' personal lives, yet users often underestimate the privacy risks involved. The moment users share information with these agents (e.g., LLMs), their private information becomes vulnerable to exposure. In this paper, we characterize the notion of contextual privacy for user interactions with LLMs: it aims to minimize privacy risks by ensuring that users (senders) disclose only information that is both relevant and necessary for achieving their intended goals when interacting with LLMs (untrusted receivers). Through a formative design user study, we observe how even "privacy-conscious" users inadvertently reveal sensitive information through indirect disclosures. Based on insights from this study, we propose a locally deployable framework that operates between users and LLMs and identifies and reformulates out-of-context information in user prompts. Our evaluation using examples from ShareGPT shows that lightweight models can effectively implement this framework, achieving strong gains in contextual privacy while preserving the user's intended interaction goals, using different approaches to classifying information relevant to those goals.
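The proposed framework sits locally between the user and the remote model, flagging and rewriting details that are unnecessary for the stated goal before the prompt leaves the machine. The sketch below illustrates that placement; its regex detector is a crude stand-in for the paper's lightweight classification models, and the patterns and placeholders are assumptions.

```python
# Sketch of a local pre-processing layer between user and remote LLM.
# The regex detector is a stand-in for a lightweight local model that
# would judge relevance against the user's stated goal.
import re

SENSITIVE_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def reformulate(prompt: str, goal: str) -> str:
    """Mask out-of-context identifiers before the prompt is sent.
    A deployed version would condition relevance decisions on `goal`;
    here every match is treated as unnecessary for the goal."""
    cleaned = prompt
    for label, pattern in SENSITIVE_PATTERNS.items():
        cleaned = pattern.sub(f"[{label}]", cleaned)
    return cleaned

prompt = ("Rewrite this email to my landlord, john.doe@example.com, "
          "and mention my number 555-123-4567 only if it helps.")
print(reformulate(prompt, goal="polite rewrite"))
# Identifiers are masked locally; only the cleaned prompt reaches the LLM.
```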


Black lawmakers continue push to assist descendants of slaves in California

Los Angeles Times

The California Legislative Black Caucus on Thursday proposed a package of reparations for the descendants of African Americans who were enslaved in the United States, with proposals that include preferences for public university admissions and financial assistance for first-time home buyers. The package contains 15 bills in what caucus members said will be a multiyear effort to repair the generational harms and discrimination suffered by the descendants of slaves in California. In 2020, Gov. Gavin Newsom and California lawmakers formed a "first in the nation" state task force to study and propose remedies for the legacy of slavery. At the end of the legislative session last year, reform advocates were frustrated that the Legislature, limited by a tight state budget and a high-stakes election year, passed only 10 of the 14 bills prioritized by the Legislative Black Caucus. "We are picking up where we left off last year," said Assemblymember Lori Wilson (D-Suisun City) at a press conference Thursday morning.