AITopics

2512.05179

Country: Europe > Ireland (0.14)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Education > Curriculum (0.50)
Education > Educational Setting > Higher Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

arXiv.org Artificial Intelligence Dec-8-2025

Bridging Traditional Machine Learning and Large Language Models: A Two-Part Course Design for Modern AI Education

Li, Fang

This paper presents an innovative pedagogical approach for teaching artificial intelligence and data science that systematically bridges traditional machine learning techniques with modern Large Language Models (LLMs). We describe a course structured in two sequential and complementary parts: foundational machine learning concepts and contemporary LLM applications. This design enables students to develop a comprehensive understanding of AI evolution while building practical skills with both established and cutting-edge technologies. We detail the course architecture, implementation strategies, assessment methods, and learning outcomes from our summer course delivery spanning two seven-week terms. Our findings demonstrate that this integrated approach enhances student comprehension of the AI landscape and better prepares them for industry demands in the rapidly evolving field of artificial intelligence.

large language model, machine learning, natural language, (16 more...)

2512.05167

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kim, Joochan, Jung, Minjoon, Zhang, Byoung-Tak

Exploring Ordinal Bias in Action Recognition for Instructional Videos

arXiv.org Artificial Intelligence Dec-8-2025

Action recognition models have achieved promising results in understanding instructional videos. However, they often rely on dominant, dataset-specific action sequences rather than true video comprehension, a problem that we define as ordinal bias. To address this issue, we propose two effective video manipulation methods: Action Masking, which masks frames of frequently co-occurring actions, and Sequence Shuffling, which randomizes the order of action segments. Through comprehensive experiments, we demonstrate that current models exhibit significant performance drops when confronted with nonstandard action sequences, underscoring their vulnerability to ordinal bias. Our findings emphasize the importance of rethinking evaluation strategies and developing models capable of generalizing beyond fixed action patterns in diverse instructional videos. Due to the dominant action pair 'Take-Background', the model fails to predict the action 'Open'. Action recognition in instructional videos has witnessed remarkable progress, primarily driven by models that excel in curated benchmark datasets (Farha & Gall, 2019; Ishikawa et al., 2021; Li et al., 2020; Yi et al., 2021).
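The two manipulations named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so the segment representation, the function names `shuffle_segments` and `mask_action`, and the choice of `None` as the mask value are all assumptions made for illustration. Frames are stood in for by plain list items.

```python
import random
from typing import Any, List, Sequence, Tuple

# Hypothetical annotation format: (action label, start frame, end frame exclusive).
Segment = Tuple[str, int, int]


def shuffle_segments(
    frames: Sequence[Any], segments: Sequence[Segment], seed: int = 0
) -> Tuple[List[Any], List[Segment]]:
    """Reorder whole action segments at random, keeping each segment's frames intact."""
    order = list(range(len(segments)))
    random.Random(seed).shuffle(order)
    new_frames: List[Any] = []
    new_segments: List[Segment] = []
    for i in order:
        label, start, end = segments[i]
        new_start = len(new_frames)
        new_frames.extend(frames[start:end])
        # Record where the segment now sits in the rebuilt video.
        new_segments.append((label, new_start, len(new_frames)))
    return new_frames, new_segments


def mask_action(
    frames: Sequence[Any], segments: Sequence[Segment], target: str, mask_value: Any = None
) -> List[Any]:
    """Replace every frame belonging to the target action with a mask value."""
    masked = list(frames)
    for label, start, end in segments:
        if label == target:
            for t in range(start, end):
                masked[t] = mask_value
    return masked


frames = list(range(10))  # ten placeholder frames
segments = [("take", 0, 3), ("open", 3, 7), ("pour", 7, 10)]
shuffled_frames, shuffled_segments = shuffle_segments(frames, segments, seed=1)
masked_frames = mask_action(frames, segments, target="take")
```

A model that has learned the dataset's typical action order, rather than the visual content, would be expected to degrade on `shuffled_frames`, which is the effect the abstract reports.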

artificial intelligence, dataset, machine learning, (18 more...)

2504.0658

Genre:

Research Report (0.84)
Instructional Material > Course Syllabus & Notes (0.66)

Industry:

Education > Educational Technology > Media (1.00)
Education > Educational Technology > Audio & Video (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.35)

Aru, Jaan, Laak, Kristjan-Julius

Developing a General Personal Tutor for Education

The vision of a universal AI tutor has remained elusive, despite decades of effort. Could LLMs be the game-changer? We overview novel issues arising from developing a nationwide AI tutor. We highlight the practical questions that point to specific gaps in our scientific understanding of the learning process.

large language model, machine learning, natural language, (15 more...)

doi: 10.1016/j.tics.2025.09.010

2512.04869

Country: Europe > Estonia (0.15)

Genre: Instructional Material (0.94)

Industry: Education > Educational Setting (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Paccagnan, Dario, Marks, Daniel, Campi, Marco C., Garatti, Simone

Pick-to-Learn for Systems and Control: Data-driven Synthesis with State-of-the-art Safety Guarantees

Data-driven methods have become paramount in modern systems and control problems characterized by growing levels of complexity. In safety-critical environments, deploying these methods requires rigorous guarantees, a need that has motivated much recent work at the interface of statistical learning and control. However, many existing approaches achieve this goal at the cost of sacrificing valuable data for testing and calibration, or by constraining the choice of learning algorithm, thus leading to suboptimal performance. In this paper, we describe Pick-to-Learn (P2L) for Systems and Control, a framework that allows any data-driven control method to be equipped with state-of-the-art safety and performance guarantees. P2L enables the use of all available data to jointly synthesize and certify the design, eliminating the need to set aside data for calibration or validation purposes. In presenting a comprehensive version of P2L for systems and control, this paper demonstrates its effectiveness across a range of core problems, including optimal control, reachability analysis, safe synthesis, and robust control. In many of these applications, P2L delivers designs and certificates that outperform commonly employed methods, and shows strong potential for broad applicability in diverse practical settings.

artificial intelligence, machine learning, probability, (18 more...)

2512.04781

Country:

Europe > United Kingdom > England (0.46)
North America > United States > California (0.28)

Genre:

Research Report (1.00)
Personal (0.67)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Energy (0.93)
Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Wang, Jingyuan, Ji, Jiahao

A Tutorial on Regression Analysis: From Linear Models to Deep Learning -- Lecture Notes on Artificial Intelligence

This article serves as the regression analysis lecture notes in the Intelligent Computing course cluster (including the courses of Artificial Intelligence, Data Mining, Machine Learning, and Pattern Recognition). It aims to provide students -- who are assumed to possess only basic university-level mathematics (i.e., with prerequisite courses in calculus, linear algebra, and probability theory) -- with a comprehensive and self-contained understanding of regression analysis without requiring any additional references. The lecture notes systematically introduce the fundamental concepts, modeling components, and theoretical foundations of regression analysis, covering linear regression, logistic regression, multinomial logistic regression, polynomial regression, basis-function models, kernel-based methods, and neural-network-based nonlinear regression. Core methodological topics include loss-function design, parameter-estimation principles, ordinary least squares, gradient-based optimization algorithms and their variants, as well as regularization techniques such as Ridge and LASSO regression. Through detailed mathematical derivations, illustrative examples, and intuitive visual explanations, the materials help students understand not only how regression models are constructed and optimized, but also how they reveal the underlying relationships between features and response variables. By bridging classical statistical modeling and modern machine-learning practice, these lecture notes aim to equip students with a solid conceptual and technical foundation for further study in advanced artificial intelligence models.
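As a small illustration of two topics the abstract above lists, ordinary least squares and Ridge regularization, here is a sketch for the single-feature case. It is not taken from the lecture notes; the function name `fit_line` and the choice to leave the intercept unpenalized are assumptions. With centered data, minimizing the squared error plus `lam * slope**2` gives slope = S_xy / (S_xx + lam), which reduces to ordinary least squares when `lam` is 0.

```python
from typing import Sequence, Tuple


def fit_line(
    x: Sequence[float], y: Sequence[float], lam: float = 0.0
) -> Tuple[float, float]:
    """Fit y ~ intercept + slope * x with an optional Ridge penalty on the slope.

    lam = 0 gives ordinary least squares; lam > 0 shrinks the slope toward zero.
    Raises ZeroDivisionError if x is constant and lam is 0.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Centered sums of squares and cross-products.
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = s_xy / (s_xx + lam)
    intercept = y_mean - slope * x_mean  # the intercept is not penalized
    return intercept, slope


x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]  # exactly y = 1 + 2x
print(fit_line(x, y))           # OLS recovers intercept 1.0, slope 2.0
print(fit_line(x, y, lam=5.0))  # Ridge shrinks the slope to 1.0, intercept 2.5
```

The second call shows the shrinkage effect: here S_xx = 5 and S_xy = 10, so a penalty of 5 halves the fitted slope.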

artificial intelligence, machine learning, regression, (14 more...)

2512.04747

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Agarwal, Shyam, Moghimi, Ali, Haudek, Kevin C.

AI-Enabled grading with near-domain data for scaling feedback with human-level accuracy

Constructed-response questions are crucial to encourage generative processing and test a learner's understanding of core concepts. However, the limited availability of instructor time, large class sizes, and other resource constraints pose significant challenges in providing timely and detailed evaluation, which is crucial for a holistic educational experience. In addition, providing timely and frequent assessments is challenging since manual grading is labor intensive, and automated grading is complex to generalize to every possible response scenario. This paper proposes a novel and practical approach to grade short-answer constructed-response questions. We discuss why this problem is challenging, define the nature of questions on which our method works, and finally propose a framework that instructors can use to evaluate their students' open responses, utilizing near-domain data such as data from similar questions administered in previous years. The proposed method outperforms state-of-the-art machine learning models as well as non-fine-tuned large language models like GPT 3.5, GPT 4, and GPT 4o by a considerable margin of over 10-20% in some cases, even after providing the LLMs with reference/model answers. Our framework does not require pre-written grading rubrics and is designed explicitly with practical classroom settings in mind. Our results also reveal exciting insights about learning from near-domain data, including what we term accuracy and data advantages using human-labeled data, and we believe this is the first work to formalize the problem of automated short answer grading based on near-domain data.

large language model, machine learning, natural language, (22 more...)

2512.04113

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education > Curriculum (1.00)
Education > Educational Setting > Online (0.93)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning Dec-4-2025

When does Gaussian equivalence fail and how to fix it: Non-universal behavior of random features with quadratic scaling

Wen, Garrett G., Hu, Hong, Lu, Yue M., Fan, Zhou, Misiakiewicz, Theodor

A major effort in modern high-dimensional statistics has been devoted to the analysis of linear predictors trained on nonlinear feature embeddings via empirical risk minimization (ERM). Gaussian equivalence theory (GET) has emerged as a powerful universality principle in this context: it states that the behavior of high-dimensional, complex features can be captured by Gaussian surrogates, which are more amenable to analysis. Despite its remarkable successes, numerical experiments show that this equivalence can fail even for simple embeddings -- such as polynomial maps -- under general scaling regimes. We investigate this breakdown in the setting of random feature (RF) models in the quadratic scaling regime, where both the number of features and the sample size grow quadratically with the data dimension. We show that when the target function depends on a low-dimensional projection of the data, such as generalized linear models, GET yields incorrect predictions. To capture the correct asymptotics, we introduce a Conditional Gaussian Equivalent (CGE) model, which can be viewed as appending a low-dimensional non-Gaussian component to an otherwise high-dimensional Gaussian model. This hybrid model retains the tractability of the Gaussian framework and accurately describes RF models in the quadratic scaling regime. We derive sharp asymptotics for the training and test errors in this setting, which continue to agree with numerical simulations even when GET fails. Our analysis combines general results on CLT for Wiener chaos expansions and a careful two-phase Lindeberg swapping argument. Beyond RF models and quadratic scaling, our work hints at a rich landscape of universality phenomena in high-dimensional ERM.

assumption 3, probability, (17 more...)

arXiv.org Machine Learning

2512.03325

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
North America > United States > Montana (0.04)
Asia > Middle East > Israel (0.04)

Genre:

Research Report (0.64)
Instructional Material (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

arXiv.org Artificial Intelligence Dec-4-2025

Domain Feature Collapse: Implications for Out-of-Distribution Detection and Solutions

Yang, Hong, Kar, Devroop, Yu, Qi, Ororbia, Alex, Desell, Travis

Why do state-of-the-art OOD detection methods exhibit catastrophic failure when models are trained on single-domain datasets? We provide the first theoretical explanation for this phenomenon through the lens of information theory. We prove that supervised learning on single-domain data inevitably produces domain feature collapse -- representations where I(x_d; z) = 0, meaning domain-specific information is completely discarded. This is a fundamental consequence of information bottleneck optimization: models trained on single domains (e.g., medical images) learn to rely solely on class-specific features while discarding domain features, leading to catastrophic failure when detecting out-of-domain samples (e.g., achieving only 53% FPR@95 on MNIST). We extend our analysis using Fano's inequality to quantify partial collapse in practical scenarios. To validate our theory, we introduce Domain Bench, a benchmark of single-domain datasets, and demonstrate that preserving I(x_d; z) > 0 through domain filtering (using pretrained representations) resolves the failure mode. While domain filtering itself is conceptually straightforward, its effectiveness provides strong empirical evidence for our information-theoretic framework. Our work explains a puzzling empirical phenomenon, reveals fundamental limitations of supervised learning in narrow domains, and has broader implications for transfer learning and when to fine-tune versus freeze pretrained models.

artificial intelligence, deep learning, machine learning, (15 more...)

2512.04034

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence Dec-4-2025

Deep Unfolding: Recent Developments, Theory, and Design Guidelines

Shlezinger, Nir, Segarra, Santiago, Zhang, Yi, Avrahami, Dvir, Davidov, Zohar, Routtenberg, Tirza, Eldar, Yonina C.

Optimization methods play a central role in signal processing, serving as the mathematical foundation for inference, estimation, and control. While classical iterative optimization algorithms provide interpretability and theoretical guarantees, they often rely on surrogate objectives, require careful hyperparameter tuning, and exhibit substantial computational latency. Conversely, machine learning (ML) offers powerful data-driven modeling capabilities but lacks the structure, transparency, and efficiency needed for optimization-driven inference. Deep unfolding has recently emerged as a compelling framework that bridges these two paradigms by systematically transforming iterative optimization algorithms into structured, trainable ML architectures. This article provides a tutorial-style overview of deep unfolding, presenting a unified perspective of methodologies for converting optimization solvers into ML models and highlighting their conceptual, theoretical, and practical implications. We review the foundations of optimization for inference and for learning, introduce four representative design paradigms for deep unfolding, and discuss the distinctive training schemes that arise from their iterative nature. Furthermore, we survey recent theoretical advances that establish convergence and generalization guarantees for unfolded optimizers, and provide comparative qualitative and empirical studies illustrating their relative trade-offs in complexity, interpretability, and robustness.
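To make the idea of turning an iterative solver into a fixed-depth architecture concrete, the sketch below unrolls the well-known ISTA iteration for sparse recovery into a small number of layers, each with its own step size and threshold. This is an illustrative example, not one of the four design paradigms from the article, which the abstract does not spell out. The function names and the per-layer parameterization are assumptions, and only the forward pass is shown: in a trained unfolded network the per-layer values would be learned from data rather than set by hand.

```python
from typing import List, Sequence


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def unfolded_ista(
    A: Sequence[Sequence[float]],
    y: Sequence[float],
    steps: Sequence[float],
    thresholds: Sequence[float],
) -> List[float]:
    """Run one ISTA-style layer per (step, threshold) pair, starting from x = 0.

    Each layer takes a gradient step on 0.5 * ||A x - y||^2 and then applies
    soft thresholding. Classical ISTA ties the threshold to step * lambda;
    here the two are independent per-layer parameters (an assumption).
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for step, thr in zip(steps, thresholds):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        x = [soft_threshold(x[j] - step * grad[j], thr) for j in range(n)]
    return x


A = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, -0.5]
# Three layers with hand-set parameters; the small entry of y is zeroed out.
print(unfolded_ista(A, y, steps=[1.0] * 3, thresholds=[1.0] * 3))
```

The number of layers is fixed by the length of `steps`, which is what bounds the computational latency compared with running an iterative solver to convergence.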

artificial intelligence, iteration, machine learning, (18 more...)

2512.03768

Genre:

Research Report (1.00)
Instructional Material (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)