

Learning to Use AI for Learning: Teaching Responsible Use of AI Chatbot to K-12 Students Through an AI Literacy Module

Xiao, Ruiwei, Hou, Xinying, Tseng, Ying-Jui, Nieu, Hsuan, Liao, Guanze, Stamper, John, Koedinger, Kenneth R.

arXiv.org Artificial Intelligence

As Artificial Intelligence (AI) becomes increasingly integrated into daily life, there is a growing need to equip the next generation with the ability to apply, interact with, evaluate, and collaborate with AI systems responsibly. Prior research highlights the urgent demand from K-12 educators to teach students the ethical and effective use of AI for learning. To address this need, we designed a Large Language Model (LLM)-based module to teach prompting literacy. This includes scenario-based deliberate practice activities with direct interaction with intelligent LLM agents, aiming to foster secondary school students' responsible engagement with AI chatbots. We conducted two iterations of classroom deployment in 11 authentic secondary education classrooms, and evaluated 1) the AI-based auto-grader's capability; 2) students' prompting performance and confidence changes towards using AI for learning; and 3) the quality of learning and assessment materials. Results indicated that the AI-based auto-grader could grade student-written prompts with satisfactory quality. In addition, the instructional materials supported students in improving their prompting skills through practice and led to positive shifts in their perceptions of using AI for learning. Furthermore, data from Study 1 informed assessment revisions in Study 2. Analyses of item difficulty and discrimination in Study 2 showed that True/False and open-ended questions could measure prompting literacy more effectively than multiple-choice questions for our target learners. These promising outcomes highlight the potential for broader deployment and underscore the need for larger studies to assess learning effectiveness and assessment design.
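The abstract above describes an AI-based auto-grader for student-written prompts. A minimal sketch of what rubric-based prompt grading can look like is shown below; the rubric criteria, the heuristic keyword checks, and the `grade_prompt` function are illustrative assumptions for this digest, not the study's actual grading scheme (which uses an LLM rather than fixed heuristics).

```python
# Sketch of a rubric-based prompt grader. In the study an LLM scores the
# prompt; here simple heuristic checks stand in for each rubric criterion.

def grade_prompt(prompt: str) -> dict:
    """Score a student-written prompt against simple rubric criteria."""
    lowered = prompt.lower()
    criteria = {
        # Does the prompt state a clear task for the chatbot?
        "states_task": any(w in lowered for w in ("explain", "summarize", "list", "help")),
        # Does it give enough context (crude length proxy)?
        "gives_context": len(prompt.split()) >= 10,
        # Does it ask for guidance rather than a finished answer?
        "asks_for_guidance": "step" in lowered or "hint" in lowered,
    }
    score = sum(criteria.values())
    return {"criteria": criteria, "score": score, "max_score": len(criteria)}

result = grade_prompt(
    "Explain step by step how photosynthesis works, "
    "so I can check my biology homework answer."
)
print(result["score"], "/", result["max_score"])  # 3 / 3
```

In a deployed auto-grader, each heuristic would be replaced by an LLM judgment against the same rubric item, keeping the per-criterion structure so students receive itemized feedback.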


Generative AI Practices, Literacy, and Divides: An Empirical Analysis in the Italian Context

Savoldi, Beatrice, Attanasio, Giuseppe, Gorodetskaya, Olga, Manerba, Marta Marchiori, Bassignana, Elisa, Casola, Silvia, Negri, Matteo, Caselli, Tommaso, Bentivogli, Luisa, Ramponi, Alan, Muti, Arianna, Balbo, Nicoletta, Nozza, Debora

arXiv.org Artificial Intelligence

The rise of Artificial Intelligence (AI) language technologies, particularly generative AI (GenAI) chatbots accessible via conversational interfaces, is transforming digital interactions. While these tools hold societal promise, they also risk widening digital divides due to uneven adoption and low awareness of their limitations. This study presents the first comprehensive empirical mapping of GenAI adoption, usage patterns, and literacy in Italy, based on newly collected survey data from 1,906 Italian-speaking adults. Our findings reveal widespread adoption for both work and personal use, including sensitive tasks like emotional support and medical advice. Crucially, GenAI is supplanting other technologies to become a primary information source: this trend persists despite low user digital literacy, posing a risk as users struggle to recognize errors or misinformation. Moreover, we identify a significant gender divide -- particularly pronounced in older generations -- where women are half as likely to adopt GenAI and use it less frequently than men. While we find literacy to be a key predictor of adoption, it only partially explains this disparity, suggesting that other barriers are at play. Overall, our data provide granular insights into the multipurpose usage of GenAI, highlighting the dual need for targeted educational initiatives and further investigation into the underlying barriers to equitable participation that competence alone cannot explain.


Designing Beyond Language: Sociotechnical Barriers in AI Health Technologies for Limited English Proficiency

Huang, Michelle, Rodriguez, Violeta J., Saha, Koustuv, August, Tal

arXiv.org Artificial Intelligence

Limited English proficiency (LEP) patients in the U.S. face systemic barriers to healthcare beyond language and interpreter access, encompassing procedural and institutional constraints. AI advances may support communication and care through on-demand translation and visit preparation, but also risk exacerbating existing inequalities. We conducted storyboard-driven interviews with 14 patient navigators to explore how AI could shape care experiences for Spanish-speaking LEP individuals. We identified tensions around linguistic and cultural misunderstandings, privacy concerns, and opportunities and risks for AI to augment care workflows. Participants highlighted structural factors that can undermine trust in AI systems, including sensitive information disclosure, unstable technology access, and low digital literacy. While AI tools can potentially alleviate social barriers and institutional constraints, there are risks of misinformation and uprooting human camaraderie. Our findings contribute design considerations for AI that support LEP patients and care teams via rapport-building, education, and language support, and minimizing disruptions to existing practices.


AI Literacy Assessment Revisited: A Task-Oriented Approach Aligned with Real-world Occupations

Bogart, Christopher, Warrier, Aparna, Agarwal, Arav, Higashi, Ross, Zhang, Yufan, Flot, Jesse, Savelka, Jaromir, Burte, Heather, Sakr, Majd

arXiv.org Artificial Intelligence

As artificial intelligence (AI) systems become ubiquitous in professional contexts, there is an urgent need to equip workers, often with backgrounds outside of STEM, with the skills to use these tools effectively as well as responsibly, that is, to be AI literate. However, prevailing definitions, and therefore assessments, of AI literacy often emphasize foundational technical knowledge, such as programming, mathematics, and statistics, over practical knowledge such as interpreting model outputs, selecting tools, or identifying ethical concerns. This leaves a noticeable gap in assessing someone's AI literacy for real-world job use. We propose a work-task-oriented assessment model for AI literacy which is grounded in the competencies required for effective use of AI tools in professional settings. We describe the development of a novel AI literacy assessment instrument, and accompanying formative assessments, in the context of a US Navy robotics training program. The program included training in robotics and AI literacy, as well as a competition with practical tasks and a multiple-choice scenario task meant to simulate use of AI in a job setting. We found that, as a measure of applied AI literacy, the competition's scenario task outperformed the tests we adopted from past research or developed ourselves. We argue that when training people for AI-related work, educators should consider evaluating them with instruments that emphasize highly contextualized practical skills rather than abstract technical knowledge, especially when preparing workers without technical backgrounds for AI-integrated roles.


Wayfinding through the AI wilderness: Mapping rhetorics of ChatGPT prompt writing on X (formerly Twitter) to promote critical AI literacies

Gupta, Anuj, Shivers-McNair, Ann

arXiv.org Artificial Intelligence

In this paper, we demonstrate how studying the rhetorics of ChatGPT prompt writing on social media can promote critical AI literacies. Prompt writing is the process of writing instructions for generative AI tools like ChatGPT to elicit desired outputs and there has been an upsurge of conversations about it on social media. To study this rhetorical activity, we build on four overlapping traditions of digital writing research in computers and composition that inform how we frame literacies, how we study social media rhetorics, how we engage iteratively and reflexively with methodologies and technologies, and how we blend computational methods with qualitative methods. Drawing on these four traditions, our paper shows our iterative research process through which we gathered and analyzed a dataset of 32,000 posts (formerly known as tweets) from X (formerly Twitter) about prompt writing posted between November 2022 to May 2023. We present five themes about these emerging AI literacy practices: (1) areas of communication impacted by prompt writing, (2) micro-literacy resources shared for prompt writing, (3) market rhetoric shaping prompt writing, (4) rhetorical characteristics of prompts, and (5) definitions of prompt writing. In discussing these themes and our methodologies, we highlight takeaways for digital writing teachers and researchers who are teaching and analyzing critical AI literacies.


Cost Analysis of Human-corrected Transcription for Predominately Oral Languages

Diarra, Yacouba, Coulibaly, Nouhoum Souleymane, Leventhal, Michael

arXiv.org Artificial Intelligence

Creating speech datasets for low-resource languages is a critical yet poorly understood challenge, particularly regarding the actual cost in human labor. This paper investigates the time and complexity required to produce high-quality annotated speech data for a subset of low-resource languages, low-literacy Predominately Oral Languages, focusing on Bambara, a Manding language of Mali. Through a one-month field study involving ten transcribers with native proficiency, we analyze the correction of ASR-generated transcriptions of 53 hours of Bambara voice data. We report that it takes, on average, 30 hours of human labor to accurately transcribe one hour of speech data under laboratory conditions and 36 hours under field conditions. The study provides a baseline and practical insights for a large class of languages with comparable profiles undertaking the creation of NLP resources.
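The reported rates imply a substantial total labor cost for the corpus in the abstract above. A back-of-the-envelope calculation, using only the figures stated there (53 hours of speech; 30 and 36 person-hours per speech hour):

```python
# Total human labor implied by the study's reported correction rates.

SPEECH_HOURS = 53
LAB_RATE = 30    # person-hours per hour of speech, laboratory conditions
FIELD_RATE = 36  # person-hours per hour of speech, field conditions

lab_total = SPEECH_HOURS * LAB_RATE      # 1590 person-hours
field_total = SPEECH_HOURS * FIELD_RATE  # 1908 person-hours
print(lab_total, field_total)  # 1590 1908
```

At field rates, correcting the full 53-hour corpus amounts to roughly 1,900 person-hours, which puts the per-corpus budgeting challenge for comparable languages in concrete terms.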


Therapeutic AI and the Hidden Risks of Over-Disclosure: An Embedded AI-Literacy Framework for Mental Health Privacy

Anvari, Soraya S., Wehbe, Rina R.

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly deployed in mental health contexts, from structured therapeutic support tools to informal chat-based well-being assistants. While these systems increase accessibility, scalability, and personalization, their integration into mental health care brings privacy and safety challenges that have not been well-examined. Unlike traditional clinical interactions, LLM-mediated therapy often lacks a clear structure for what information is collected, how it is processed, and how it is stored or reused. Users without clinical guidance may over-disclose personal information, which is sometimes irrelevant to their presenting concern, due to misplaced trust, lack of awareness of data risks, or the conversational design of the system. This overexposure raises privacy concerns and also increases the potential for LLM bias, misinterpretation, and long-term data misuse. We propose a framework embedding Artificial Intelligence (AI) literacy interventions directly into mental health conversational systems, and outline a study plan to evaluate their impact on disclosure safety, trust, and user experience.


Why we should be skeptical of the hasty global push to test 15-year-olds' AI literacy in 2029

AIHub

Why we should be skeptical of the hasty global push to test 15-year-olds' AI literacy in 2029 If 2022 was the year OpenAI knocked our world off course with the launch of ChatGPT, 2025 will be remembered for the frenzied embrace of AI as the solution to everything. And, yes, this includes teaching and schoolwork. In today's breakneck AI innovation race, the Organization for Economic Co-operation and Development (OECD), along with the European Commission, has called for the development of unified AI literacy strategies in kindergarten to Grade 12 education. They have done this through an AI Literacy Framework developed with Code.org, and a range of experts in computational thinking, neuroscience, AI, educational technology and innovation -- and with "valuable insights" from the "TeachAI community." The "TeachAI community" refers to a larger umbrella project providing web resources targeting teachers, education leaders and "solution providers".


StyleBench: Evaluating thinking styles in Large Language Models

Guo, Junyu, Gu, Shangding, Jin, Ming, Spanos, Costas, Lavaei, Javad

arXiv.org Artificial Intelligence

The effectiveness of Large Language Models (LLMs) is heavily influenced by the reasoning strategies, or styles of thought, employed in their prompts. However, the interplay between these reasoning styles, model architecture, and task type remains poorly understood. To address this, we introduce StyleBench, a comprehensive benchmark for systematically evaluating reasoning styles across diverse tasks and models. We assess five representative reasoning styles--Chain-of-Thought (CoT), Tree-of-Thought (ToT), Algorithm-of-Thought (AoT), Sketch-of-Thought (SoT), and Chain-of-Draft (CoD)--on five reasoning tasks, using 15 open-source models from major families (LLaMA, Qwen, Mistral, Gemma, GPT-OSS, Phi, and DeepSeek) ranging from 270M to 120B parameters. Our large-scale analysis reveals that no single style is universally optimal. We demonstrate that strategy efficacy is highly contingent on both model scale and task type: search-based methods (AoT, ToT) excel in open-ended problems but require large-scale models, while concise styles (SoT, CoD) achieve radical efficiency gains on well-defined tasks. Furthermore, we identify key behavioral patterns: smaller models frequently fail to follow output instructions and default to guessing, while reasoning robustness emerges as a function of scale. Our findings offer a crucial roadmap for selecting optimal reasoning strategies based on specific constraints. We open-source the benchmark at https://github.com/JamesJunyuGuo/Style_Bench. Large Language Models (LLMs) have demonstrated impressive capabilities across a diverse range of tasks, including mathematical reasoning, code generation, and complex question answering (Imani et al., 2023; Wang & Chen, 2023; Tan et al., 2023). A key insight from prior work is that their performance on challenging problems is not merely a function of scale, but is critically dependent on the methods used to guide reasoning (Huang & Yang, 2025).
This has spurred the development of sophisticated prompting techniques designed to structure the model's internal reasoning process. Notable among these are Chain-of-Thought (CoT) (Wei et al., 2022), which decomposes problems into sequential steps, and more advanced paradigms like Tree-of-Thought (ToT) (Yao et al., 2023), which explores multiple reasoning paths in parallel, and ReasonFlux (Yang et al., 2025b), employing high-level templates to explore potential solutions. Performance remains highly sensitive to prompt phrasing and frequently necessitates iterative feedback to achieve robust results (Sel et al., 2023). In response, recent work has sought to automate reasoning strategy selection.
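The reasoning styles compared above differ mainly in the prompt template wrapped around the task. A minimal sketch contrasting two of them, Chain-of-Thought (verbose stepwise reasoning) and Chain-of-Draft (terse per-step drafts), is below; the exact template wording is an assumption for illustration, and StyleBench's actual templates may differ.

```python
# Illustrative prompt templates for two reasoning styles the benchmark
# compares. Each style wraps the same question in different instructions.

QUESTION = "A train travels 120 km in 2 hours. What is its average speed?"

STYLES = {
    "CoT": "Think step by step and explain each step before giving the answer.\n{q}",
    "CoD": "Write a minimal draft of at most five words per step, then answer.\n{q}",
}

def build_prompt(style: str, question: str) -> str:
    """Render the prompt for one reasoning style."""
    return STYLES[style].format(q=question)

for name in STYLES:
    print(f"--- {name} ---")
    print(build_prompt(name, QUESTION))
```

Because only the wrapper changes, a benchmark like this can hold the task and model fixed and attribute performance differences to the reasoning style alone.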


Free AI training comes to California colleges -- but at what cost?

Los Angeles Times

As artificial intelligence replaces entry-level jobs, California's universities and community colleges are offering a glimmer of hope for students: free AI training that will help them master the new technology. "You're seeing in certain coding spaces significant declines in hiring for obvious reasons," Gov. Gavin Newsom said in early August from the seventh floor of Google's San Francisco office. Flanked by leadership from California's higher education systems, he called attention to the recent layoffs at Microsoft, Google's parent company, Alphabet, and at nearby Salesforce Tower, home to the tech company that is still the city's largest private employer. Now, some of those companies -- including Google and Microsoft -- will offer a suite of AI resources free to California schools and universities. In return, the companies could gain access to millions of new users.