AITopics

The Guardian, Dec-6-2025, 15:00:55 GMT

Artificial intelligence research has a slop problem, academics say: 'It's a mess'

The author, Kevin Zhu, now runs Algoverse, an AI research and mentoring company for high schoolers. AI research in question as author claims to have written over 100 papers on AI that one expert calls a 'disaster'. A single person claims to have authored 113 academic papers on artificial intelligence this year, 89 of which will be presented this week at one of the world's leading conferences on AI and machine learning, which has raised questions among computer scientists about the state of AI research. Zhu himself graduated from high school in 2018. Papers he has put out in the past two years cover subjects like using AI to locate nomadic pastoralists in sub-Saharan Africa, to evaluate skin lesions, and to translate Indonesian dialects.

artificial intelligence, farid, social media, (16 more...)

The Guardian

Country:

Africa > Sub-Saharan Africa (0.25)
Oceania > Australia (0.05)
North America > United States > Virginia (0.05)
(3 more...)

Industry:

Leisure & Entertainment > Sports (0.70)
Education > Educational Setting > K-12 Education > Secondary School (0.35)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.98)

AIHub, Dec-5-2025, 13:34:20 GMT

We asked teachers about their experiences with AI in the classroom -- here's what they said

Since ChatGPT and other large language models burst into public consciousness, school boards are drafting policies, universities are hosting symposiums and tech companies are relentlessly promoting their latest AI-powered learning tools. In the race to modernize education, artificial intelligence (AI) has become the new darling of policy innovation. While AI promises efficiency and personalization, it also introduces complexity, ethical dilemmas and new demands. Teachers, who are at the heart of learning along with students, are watching this transformation with growing unease. For example, according to the Alberta Teachers' Association, 80 to 90 per cent of educators surveyed expressed concern about AI's potential negative effects on education.

classroom, machine learning, natural language, (19 more...)

AIHub

Country:

North America > Canada > Ontario (0.15)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.05)
North America > Canada > Saskatchewan (0.05)
(2 more...)

Industry: Education > Educational Setting (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

arXiv.org Machine Learning, Dec-5-2025

Learning Causality for Longitudinal Data

Bouchattaoui, Mouad EL

This thesis develops methods for causal inference and causal representation learning (CRL) in high-dimensional, time-varying data. The first contribution introduces the Causal Dynamic Variational Autoencoder (CDVAE), a model for estimating Individual Treatment Effects (ITEs) by capturing unobserved heterogeneity in treatment response driven by latent risk factors that affect only outcomes. CDVAE comes with theoretical guarantees on valid latent adjustment and generalization bounds for ITE error. Experiments on synthetic and real datasets show that CDVAE outperforms baselines, and that state-of-the-art models greatly improve when augmented with its latent substitutes, approaching oracle performance without access to true adjustment variables. The second contribution proposes an efficient framework for long-term counterfactual regression based on RNNs enhanced with Contrastive Predictive Coding (CPC) and InfoMax. It captures long-range dependencies under time-varying confounding while avoiding the computational cost of transformers, achieving state-of-the-art results and introducing CPC into causal inference. The third contribution advances CRL by addressing how latent causes manifest in observed variables. We introduce a model-agnostic interpretability layer based on the geometry of the decoder Jacobian. A sparse self-expression prior induces modular, possibly overlapping groups of observed features aligned with shared latent influences. We provide recovery guarantees in both disjoint and overlapping settings and show that meaningful latent-to-observed structure can be recovered without anchor features or single-parent assumptions. Scalable Jacobian-based regularization techniques are also developed.

artificial intelligence organization, lower-dimensional representation, unit treatment value assumption, (15 more...)

arXiv.org Machine Learning

2512.0498

Country:

North America > United States > California > San Francisco County > San Francisco (0.13)
Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
(2 more...)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Epidemiology (1.00)
(3 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
(8 more...)

Mitra, Purbesh, Ulukus, Sennur

Semantic Soft Bootstrapping: Long Context Reasoning in LLMs without Reinforcement Learning

Long context reasoning in large language models (LLMs) has demonstrated enhancement of their cognitive capabilities via chain-of-thought (CoT) inference. Training such models is usually done via reinforcement learning with verifiable rewards (RLVR) on reasoning-based problems, like math and programming. However, RLVR is limited by several bottlenecks, such as the lack of dense rewards and inadequate sample efficiency. As a result, it requires significant compute resources in the post-training phase. To overcome these limitations, in this work, we propose Semantic Soft Bootstrapping (SSB), a self-distillation technique in which the same base language model plays the role of both teacher and student, but receives different semantic contexts about the correctness of its outcome at training time. The model is first prompted with a math problem and several rollouts are generated. From them, the correct and the most common incorrect responses are filtered, and then provided to the model in context to produce a more robust, step-by-step explanation with a verified final answer. This pipeline automatically curates a paired teacher-student training set from raw problem-answer data, without any human intervention. This generation process also produces a sequence of logits, which is what the student model tries to match in the training phase from the bare question alone. In our experiment, we trained Qwen2.5-3B-Instruct on the GSM8K dataset via parameter-efficient fine-tuning. We then tested its accuracy on the MATH500 and AIME2024 benchmarks. Our experiments show accuracy improvements of 10.6% and 10%, respectively, over group relative policy optimization (GRPO), which is a commonly used RLVR algorithm. Our code is available at https://github.com/purbeshmitra/semantic-soft-bootstrapping, and the model and curated dataset are available at https://huggingface.co/purbeshmitra/semantic-soft-bootstrapping.
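
The rollout filtering step in the abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function name `select_pair`, the `(final_answer, response_text)` tuple format, and exact-match answer checking are all assumptions made for the example.

```python
from collections import Counter


def select_pair(rollouts, gold_answer):
    """Pick one correct rollout and the most common incorrect one.

    rollouts: list of (final_answer, response_text) tuples (hypothetical format).
    Returns (correct_text, incorrect_text), or None if either kind is missing.
    """
    correct = [text for ans, text in rollouts if ans == gold_answer]
    wrong = [(ans, text) for ans, text in rollouts if ans != gold_answer]
    if not correct or not wrong:
        return None  # nothing to contrast, so skip this problem
    # Find the wrong final answer that occurs most often among the rollouts.
    top_wrong_answer, _ = Counter(ans for ans, _ in wrong).most_common(1)[0]
    # Take the first rollout that ended in that wrong answer.
    incorrect_text = next(text for ans, text in wrong if ans == top_wrong_answer)
    return correct[0], incorrect_text
```

In such a sketch, the selected pair would then be placed in the teacher's context, while the student sees only the bare question; the logit-matching training step itself is not shown here.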

arxiv preprint arxiv, large language model, machine learning, (16 more...)

2512.05105

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Technology > Educational Software (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Aru, Jaan, Laak, Kristjan-Julius

Developing a General Personal Tutor for Education

The vision of a universal AI tutor has remained elusive, despite decades of effort. Could LLMs be the game-changer? We overview novel issues arising from developing a nationwide AI tutor. We highlight the practical questions that point to specific gaps in our scientific understanding of the learning process.

large language model, machine learning, natural language, (15 more...)

doi: 10.1016/j.tics.2025.09.010

2512.04869

Country: Europe > Estonia (0.15)

Genre: Instructional Material (0.94)

Industry: Education > Educational Setting (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Yamaguchi, Atsuki, Morishita, Terufumi, Villavicencio, Aline, Aletras, Nikolaos

Mitigating Catastrophic Forgetting in Target Language Adaptation of LLMs via Source-Shielded Updates

Expanding the linguistic diversity of instruct large language models (LLMs) is crucial for global accessibility but is often hindered by the reliance on costly specialized target language labeled data and catastrophic forgetting during adaptation. We tackle this challenge under a realistic, low-resource constraint: adapting instruct LLMs using only unlabeled target language data. We introduce Source-Shielded Updates (SSU), a selective parameter update strategy that proactively preserves source knowledge. Using a small set of source data and a parameter importance scoring method, SSU identifies parameters critical to maintaining source abilities. It then applies a column-wise freezing strategy to protect these parameters before adaptation. Experiments across five typologically diverse languages and 7B and 13B models demonstrate that SSU successfully mitigates catastrophic forgetting. It reduces performance degradation on monolingual source tasks to just 3.4% (7B) and 2.8% (13B) on average, a stark contrast to the 20.3% and 22.3% from full fine-tuning. SSU also achieves target-language performance highly competitive with full fine-tuning, outperforming it on all benchmarks for 7B models and the majority for 13B models.
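
To make the column-wise freezing idea concrete, here is a minimal sketch under stated assumptions. The abstract does not specify the importance scoring method or the optimizer, so this example assumes precomputed element-wise importance scores, sums them per column, and uses a plain gradient step; `shielded_update` and `freeze_ratio` are hypothetical names, not part of the paper.

```python
def shielded_update(weights, grads, importance, freeze_ratio, lr):
    """Apply one gradient step while leaving the most important columns untouched.

    weights, grads, importance: lists of rows with the same shape.
    freeze_ratio: fraction of columns to freeze (hypothetical hyperparameter).
    Returns (updated_weights, frozen_column_indices).
    """
    n_cols = len(weights[0])
    # Aggregate element-wise importance into one score per column (an assumption).
    col_scores = [sum(row[j] for row in importance) for j in range(n_cols)]
    n_frozen = int(n_cols * freeze_ratio)
    ranked = sorted(range(n_cols), key=lambda j: col_scores[j], reverse=True)
    frozen = set(ranked[:n_frozen])
    # Frozen columns keep their source-model values; the rest take the step.
    updated = [
        [w if j in frozen else w - lr * g
         for j, (w, g) in enumerate(zip(w_row, g_row))]
        for w_row, g_row in zip(weights, grads)
    ]
    return updated, frozen
```

The frozen set is decided once from source data before adaptation, which matches the abstract's description of protecting parameters "before adaptation" rather than re-ranking them at every step.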

computational linguistic, large language model, machine learning, (18 more...)

2512.04844

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.93)

Genre: Research Report > New Finding (0.92)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Singh, Pooja, Kumar, Sandeep

AdiBhashaa: A Community-Curated Benchmark for Machine Translation into Indian Tribal Languages

Large language models and multilingual machine translation (MT) systems increasingly drive access to information, yet many languages of the tribal communities remain effectively invisible in these technologies. This invisibility exacerbates existing structural inequities in education, governance, and digital participation. We present AdiBhashaa, a community-driven initiative that constructs the first open parallel corpora and baseline MT systems for four major Indian tribal languages: Bhili, Mundari, Gondi, and Santali. This work combines participatory data creation with native speakers, human-in-the-loop validation, and systematic evaluation of both encoder-decoder MT models and large language models. In addition to reporting technical findings, we articulate how AdiBhashaa illustrates a possible model for more equitable AI research: it centers local expertise, builds capacity among early-career researchers from marginalized communities, and foregrounds human validation in the development of language technologies.

artificial intelligence, machine translation, natural language, (13 more...)

2512.04765

Country: Asia > India > NCT (0.15)

Genre: Research Report (0.40)

Industry: Education (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Pesenti, Dario, Bogani, Alessandro, Tentori, Katya, Teso, Stefano

Human Cognitive Biases in Explanation-Based Interaction: The Case of Within and Between Session Order Effect

Explanatory Interactive Learning (XIL) is a powerful interactive learning framework designed to enable users to customize and correct AI models by interacting with their explanations. In a nutshell, XIL algorithms select a number of items on which an AI model made a decision (e.g. images and their tags) and present them to users, together with corresponding explanations (e.g. image regions that drive the model's decision). Then, users supply corrective feedback for the explanations, which the algorithm uses to improve the model. Despite showing promise in debugging tasks, recent studies have raised concerns that explanatory interaction may trigger order effects, a well-known cognitive bias in which the sequence of presented items influences users' trust and, critically, the quality of their feedback. We argue that these studies are not entirely conclusive, as the experimental designs and tasks employed differ substantially from common XIL use cases, complicating interpretation. To clarify the interplay between order effects and explanatory interaction, we ran two larger-scale user studies (n = 713 total) designed to mimic common XIL tasks. Specifically, we assessed order effects both within and between debugging sessions by manipulating the order in which correct and wrong explanations are presented to participants. Order effects had a limited, though significant, impact on users' agreement with the model (i.e., a behavioral measure of their trust), and only when examined within debugging sessions, not between them. The quality of users' feedback was generally satisfactory, with order effects exerting only a small and inconsistent influence in both experiments. Overall, our findings suggest that order effects do not pose a significant issue for the successful employment of XIL approaches. More broadly, our work contributes to the ongoing efforts for understanding human factors in AI.

machine learning, natural language, simulation of human behavior, (20 more...)

2512.04764

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)

Industry: Education > Educational Setting > Online (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.70)

MemLoRA: Distilling Expert Adapters for On-Device Memory Systems

Bini, Massimo, Bohdal, Ondrej, Michieli, Umberto, Akata, Zeynep, Ozay, Mete, Ceritli, Taha

Memory-augmented Large Language Models (LLMs) have demonstrated remarkable consistency during prolonged dialogues by storing relevant memories and incorporating them as context. Such memory-based personalization is also key in on-device settings that allow users to keep their conversations and data private. However, memory-augmented systems typically rely on LLMs that are too costly for local on-device deployment. Even though Small Language Models (SLMs) are more suitable for on-device inference than LLMs, they cannot achieve sufficient performance. Additionally, these LLM-based systems lack native visual capabilities, limiting their applicability in multimodal contexts. In this paper, we introduce (i) MemLoRA, a novel memory system that enables local deployment by equipping SLMs with specialized memory adapters, and (ii) its vision extension MemLoRA-V, which integrates small Vision-Language Models (SVLMs) into memory systems, enabling native visual understanding. Following knowledge distillation principles, each adapter is trained separately for specific memory operations: knowledge extraction, memory update, and memory-augmented generation. Equipped with memory adapters, small models enable accurate on-device memory operations without cloud dependency. On text-only operations, MemLoRA outperforms 10× larger baseline models (e.g., Gemma2-27B) and achieves performance comparable to 60× larger models (e.g., GPT-OSS-120B) on the LoCoMo benchmark. To evaluate visual understanding operations instead, we extend LoCoMo with challenging Visual Question Answering tasks that require direct visual reasoning. On this, our VLM-integrated MemLoRA-V shows massive improvements over caption-based approaches (81.3 vs. 23.7 accuracy) while keeping strong performance in text-based tasks, demonstrating the efficacy of our method in multimodal contexts.

large language model, machine learning, natural language, (20 more...)

2512.04763

Country:

Europe (0.46)
Asia (0.46)
North America > United States (0.28)

Genre:

Research Report (1.00)
Overview (0.67)

Industry:

Education (0.69)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)