AITopics

The adaptation of teaching slides to instructors' situated teaching needs, including pedagogical styles and their students' context, is a critical yet time-consuming task for educators. Through a series of educator interviews, we first identify and systematically categorize the key friction points that impede this adaptation process. Grounded in these findings, we introduce a novel multi-agent framework designed to automate slide adaptation based on high-level instructor specifications. An evaluation involving 16 modification requests across 8 real-world courses validates our approach. The framework's output consistently achieved high scores in intent alignment, content coherence and factual accuracy, and performed on par with baseline methods regarding visual clarity, while also demonstrating appropriate timeliness and a high operational agreement with human experts, achieving an F1 score of 0.89. This work heralds a new paradigm where AI agents handle the logistical burdens of instructional design, liberating educators to focus on the creative and strategic aspects of teaching.

artificial intelligence, deep learning, machine learning, (17 more...)

2511.1884

Country: Asia (0.46)

Genre:

Instructional Material > Course Syllabus & Notes (0.93)
Research Report > New Finding (0.68)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Trusov, Michael, Hwang, Minha, Jamal, Zainab, Chandra, Swarup

Strategic Decision Framework for Enterprise LLM Adoption

Organizations are rapidly adopting Large Language Models (LLMs) to transform their operations, yet they lack clear guidance on key decisions for adoption and implementation. While LLMs offer powerful capabilities in content generation, assisted coding, and process automation, businesses face critical challenges in data security, LLM solution development approach, infrastructure requirements, and deployment strategies. Healthcare providers must protect patient data while leveraging LLMs for medical analysis, financial institutions need to balance automated customer service with regulatory compliance, and software companies seek to enhance development productivity while maintaining code security. This article presents a systematic six-step decision framework for LLM adoption, helping organizations navigate from initial application selection to final deployment. Based on extensive interviews and analysis of successful and failed implementations, our framework provides practical guidance for business leaders to align technological capabilities with business objectives. Through key decision points and real-world examples from both B2B and B2C contexts, organizations can make informed decisions about LLM adoption while ensuring secure and efficient integration across various use cases, from customer service automation to content creation and advanced analytics.

large language model, machine learning, natural language, (20 more...)

2511.18589

Genre:

Research Report (0.50)
Instructional Material (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Natural Emergent Misalignment from Reward Hacking in Production RL

MacDiarmid, Monte, Wright, Benjamin, Uesato, Jonathan, Benton, Joe, Kutasov, Jon, Price, Sara, Bouscal, Naia, Bowman, Sam, Bricken, Trenton, Cloud, Alex, Denison, Carson, Gasteiger, Johannes, Greenblatt, Ryan, Leike, Jan, Lindsey, Jack, Mikulik, Vlad, Perez, Ethan, Rodrigues, Alex, Thomas, Drake, Webson, Albert, Ziegler, Daniel, Hubinger, Evan

We show that when large language models learn to reward hack on production RL environments, this can result in egregious emergent misalignment. We start with a pretrained model, impart knowledge of reward hacking strategies via synthetic document finetuning or prompting, and train on a selection of real Anthropic production coding environments. Unsurprisingly, the model learns to reward hack. Surprisingly, the model generalizes to alignment faking, cooperation with malicious actors, reasoning about malicious goals, and attempting sabotage when used with Claude Code, including in the codebase for this paper. Applying RLHF safety training using standard chat-like prompts results in aligned behavior on chat-like evaluations, but misalignment persists on agentic tasks. Three mitigations are effective: (i) preventing the model from reward hacking; (ii) increasing the diversity of RLHF safety training; and (iii) "inoculation prompting", wherein framing reward hacking as acceptable behavior during training removes misaligned generalization even when reward hacking is learned.

large language model, machine learning, natural language, (18 more...)

2511.18397

Country: Asia > Middle East (0.27)

Genre:

Research Report > New Finding (1.00)
Instructional Material (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.92)
Law (0.67)
Education > Educational Technology > Educational Software (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Developing an AI Course for Synthetic Chemistry Students

Zheng, Zhiling

Artificial intelligence (AI) and data science are transforming chemical research, yet few formal courses are tailored to synthetic and experimental chemists, who often face steep entry barriers due to limited coding experience and lack of chemistry-specific examples. We present the design and implementation of AI4CHEM, an introductory data-driven chemistry course created for students on the synthetic chemistry track with no prior programming background. The curriculum emphasizes chemical context over abstract algorithms, using an accessible web-based platform to ensure zero-install machine learning (ML) workflow development practice and in-class active learning. Assessment combines code-guided homework, literature-based mini-reviews, and collaborative projects in which students build AI-assisted workflows for real experimental problems. Learning gains include increased confidence with Python, molecular property prediction, reaction optimization, and data mining, and improved skills in evaluating AI tools in chemistry. All course materials are openly available, offering a discipline-specific, beginner-accessible framework for integrating AI into synthetic chemistry training.

data mining, large language model, machine learning, (19 more...)

2511.18244

Country: North America > United States (0.28)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Materials > Chemicals (1.00)
Education > Curriculum > Subject-Specific Education (0.83)
Education > Educational Setting > Higher Education (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(4 more...)

Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning

Qin, Ruoyu, He, Weiran, Huang, Weixiao, Zhang, Yangkun, Zhao, Yikai, Pang, Bo, Xu, Xinran, Shan, Yingdi, Wu, Yongwei, Zhang, Mingxing

Reinforcement Learning (RL) has become critical for advancing modern Large Language Models (LLMs), yet existing synchronous RL systems face severe performance bottlenecks. The rollout phase, which dominates end-to-end iteration time, suffers from substantial long-tail latency and poor resource utilization due to inherent workload imbalance. We present Seer, a novel online context learning system that addresses these challenges by exploiting previously overlooked similarities in output lengths and generation patterns among requests sharing the same prompt. Seer introduces three key techniques: divided rollout for dynamic load balancing, context-aware scheduling, and adaptive grouped speculative decoding. Together, these mechanisms substantially reduce long-tail latency and improve resource efficiency during rollout. Evaluations on production-grade RL workloads demonstrate that Seer improves end-to-end rollout throughput by 74% to 97% and reduces long-tail latency by 75% to 93% compared to state-of-the-art synchronous RL systems, significantly accelerating RL training iterations.

arxiv preprint arxiv, large language model, machine learning, (18 more...)

2511.14617

Genre:

Research Report (1.00)
Instructional Material > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Baron, Ethan, Amin, Alan N., Weitzman, Ruben, Marks, Debora, Wilson, Andrew Gordon

A Diffusion Model to Shrink Proteins While Maintaining Their Function

Many proteins useful in modern medicine or bioengineering are challenging to make in the lab, fuse with other proteins in cells, or deliver to tissues in the body, because their sequences are too long. Shortening these sequences typically involves costly, time-consuming experimental campaigns. Ideally, we could instead use modern models of massive databases of sequences from nature to learn how to propose shrunken proteins that resemble sequences found in nature. Unfortunately, these models struggle to efficiently search the combinatorial space of all deletions, and are not trained with inductive biases to learn how to delete. To address this gap, we propose SCISOR, a novel discrete diffusion model that deletes letters from sequences to generate protein samples that resemble those found in nature. To do so, SCISOR trains a de-noiser to reverse a forward noising process that adds random insertions to natural sequences. As a generative model, SCISOR fits evolutionary sequence data competitively with previous large models. In evaluation, SCISOR achieves state-of-the-art predictions of the functional effects of deletions on ProteinGym. Finally, we use the SCISOR de-noiser to shrink long protein sequences, and show that its suggested deletions result in significantly more realistic proteins and more often preserve functional motifs than previous models of evolutionary sequences.

artificial intelligence, machine learning, sequence, (19 more...)

2511.0739

Genre:

Research Report (0.82)
Instructional Material (0.54)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Weihs, Adrien, Bertozzi, Andrea L., Thorpe, Matthew

Analysis of Semi-Supervised Learning on Hypergraphs

Hypergraphs provide a natural framework for modeling higher-order interactions, yet their theoretical underpinnings in semi-supervised learning remain limited. We provide an asymptotic consistency analysis of variational learning on random geometric hypergraphs, precisely characterizing the conditions ensuring the well-posedness of hypergraph learning as well as showing convergence to a weighted $p$-Laplacian equation. Motivated by this, we propose Higher-Order Hypergraph Learning (HOHL), which regularizes via powers of Laplacians from skeleton graphs for multiscale smoothness. HOHL converges to a higher-order Sobolev seminorm. Empirically, it performs strongly on standard baselines.

artificial intelligence, hypergraph, machine learning, (17 more...)

2510.25354

Country:

Europe (0.67)
North America > United States > California > Los Angeles County > Los Angeles (0.27)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.45)

Industry: Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)

Fouda, Aya E., Hassan, Abdelrahamn A., Hanafy, Radwa J., Fouda, Mohammed E.

PsychiatryBench: A Multi-Task Benchmark for LLMs in Psychiatry

Large language models (LLMs) offer significant potential in enhancing psychiatric practice, from improving diagnostic accuracy to streamlining clinical documentation and therapeutic support. However, existing evaluation resources heavily rely on small clinical interview corpora, social media posts, or synthetic dialogues, which limits their clinical validity and fails to capture the full complexity of diagnostic reasoning. In this work, we introduce PsychiatryBench, a rigorously curated benchmark grounded exclusively in authoritative, expert-validated psychiatric textbooks and casebooks. PsychiatryBench comprises eleven distinct question-answering tasks ranging from diagnostic reasoning and treatment planning to longitudinal follow-up, management planning, clinical approach, sequential case analysis, and multiple-choice/extended matching formats totaling 5,188 expert-annotated items. We evaluate a diverse set of frontier LLMs (including Google Gemini, DeepSeek, Sonnet 4.5, and GPT 5) alongside leading open-source medical models such as MedGemma using both conventional metrics and an "LLM-as-judge" similarity scoring framework. Our results reveal substantial gaps in clinical consistency and safety, particularly in multi-turn follow-up and management tasks, underscoring the need for specialized model tuning and more robust evaluation paradigms. PsychiatryBench offers a modular, extensible platform for benchmarking and improving LLM performance in mental health applications.

disorder, large language model, machine learning, (19 more...)

2509.09711

Country:

North America > United States (1.00)
Africa > Middle East > Egypt (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Instructional Material (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

BemaGANv2: A Tutorial and Comparative Survey of GAN-based Vocoders for Long-Term Audio Generation

Park, Taesoo, Jeong, Mungwi, Park, Mingyu, Kim, Narae, Kim, Junyoung, Kim, Mujung, Yoo, Jisang, Lee, Hoyun, Kim, Sanghoon, Kwon, Soonchul

This paper presents a tutorial-style survey and implementation guide of BemaGANv2, an advanced GAN-based vocoder designed for high-fidelity and long-term audio generation. Long-term audio generation is critical for applications in Text-to-Music (TTM) and Text-to-Audio (TTA) systems, where maintaining temporal coherence, prosodic consistency, and harmonic structure over extended durations remains a significant challenge. Built upon the original BemaGAN architecture, BemaGANv2 incorporates major architectural innovations by replacing traditional ResBlocks in the generator with the Anti-aliased Multi-Periodicity composition (AMP) module, which internally applies the Snake activation function to better model periodic structures. In the discriminator framework, we integrate the Multi-Envelope Discriminator (MED), a novel architecture we proposed, to extract rich temporal envelope features crucial for periodicity detection. Coupled with the Multi-Resolution Discriminator (MRD), this combination enables more accurate modeling of long-range dependencies in audio. We systematically evaluate various discriminator configurations, including Multi-Scale Discriminator (MSD) + MED, MSD + MRD, and Multi-Period Discriminator (MPD) + MED + MRD, using objective metrics (Fréchet Audio Distance (FAD), Structural Similarity Index (SSIM), Pearson Correlation Coefficient (PCC), Mel-Cepstral Distortion (MCD)) and subjective evaluations (MOS, SMOS). This paper also provides a comprehensive tutorial on the model architecture, training methodology, and implementation to promote reproducibility. The code and pre-trained models are available at: https://github.com/dinhoitt/BemaGANv2.

artificial intelligence, discriminator, machine learning, (18 more...)

2506.09487

Country: Asia > South Korea (0.28)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (0.90)
Research Report > Experimental Study (0.68)

Industry:

Information Technology (0.68)
Health & Medicine (0.68)
Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Siegert, Ingo, Nehring, Jan, Ampudia, Aranxa Márquez, Busch, Matthias, Hillmann, Stefan

Chatbots to strengthen democracy: An interdisciplinary seminar to train identifying argumentation techniques of science denial

In recent times, discussions on social media platforms have increasingly come under scrutiny due to the proliferation of science denial and fake news. Traditional solutions, such as regulatory actions, have been implemented to mitigate the spread of misinformation; however, these measures alone are not sufficient. To complement these efforts, educational approaches are becoming essential in empowering users to critically engage with misinformation. Conversation training, through serious games or personalized methods, has emerged as a promising strategy to help users handle science denial and toxic conversation tactics. This paper suggests an interdisciplinary seminar to explore the suitability of Large Language Models (LLMs) acting as a persona of a science denier to support people in identifying misinformation and improving resilience against toxic interactions. In the seminar, groups of four to five students will develop an AI-based chatbot that enables realistic interactions with science-denial argumentation structures. The task involves planning the setting, integrating a Large Language Model to facilitate natural dialogues, implementing the chatbot using the RASA framework, and evaluating the outcomes in a user study. It is crucial that users understand what they need to do during the interaction, how to conclude it, and how the relevant information is conveyed. The seminar does not aim to develop chatbots for practicing debunking but serves to teach AI technologies and test the feasibility of this idea for future applications. The chatbot seminar is conducted as a hybrid, parallel master's module at the participating educational institutions.

artificial intelligence, large language model, natural language, (16 more...)

2511.17678

Country: Europe > United Kingdom > England (0.28)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Media > News (1.00)
Education (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.47)
Health & Medicine > Therapeutic Area > Vaccines (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)