AITopics

2507.05629

Country: Asia > Japan (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Ayady, Anass El, Devanne, Maxime, Forestier, Germain, Mawas, Nour El

Failure Risk Prediction in a MOOC: A Multivariate Time Series Analysis Approach

arXiv.org Artificial Intelligence, Jul-30-2025

MOOCs offer free and open access to a wide audience, but completion rates remain low, often due to a lack of personalized content. To address this issue, it is essential to predict learner performance in order to provide tailored feedback. Behavioral traces, such as clicks and events, can be analyzed as time series to anticipate learners' outcomes. This work compares multivariate time series classification methods to identify at-risk learners at different stages of the course (after 5, 10 weeks, etc.). The experimental evaluation, conducted on the Open University Learning Analytics Dataset (OULAD), focuses on three courses: two in STEM and one in social sciences and humanities (SHS). Preliminary results show that the evaluated approaches are promising for predicting learner failure in MOOCs. The analysis also suggests that prediction accuracy is influenced by the amount of recorded interactions, highlighting the importance of rich and diverse behavioral data.
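The idea of predicting "after 5, 10 weeks" can be illustrated with a minimal sketch that cuts each learner's multivariate activity series at a chosen week before it is passed to a classifier. The function names, the daily granularity, and the two example channels are assumptions made for illustration; they are not taken from the paper or from OULAD's schema.

```python
def truncate_to_weeks(series, n_weeks, days_per_week=7):
    """Keep the first n_weeks of every channel in a multivariate daily series."""
    cutoff = n_weeks * days_per_week
    return [channel[:cutoff] for channel in series]


def weekly_totals(series, days_per_week=7):
    """Sum daily counts into weekly counts, dropping an incomplete last week."""
    totals = []
    for channel in series:
        n_full = len(channel) // days_per_week
        totals.append([
            sum(channel[w * days_per_week:(w + 1) * days_per_week])
            for w in range(n_full)
        ])
    return totals


# Hypothetical learner: two activity channels (say, forum and quiz clicks)
# recorded daily for 70 days, one click per day in each channel.
learner = [[1] * 70, [1] * 70]
early = truncate_to_weeks(learner, n_weeks=5)
print(len(early[0]))          # 35 days kept per channel
print(weekly_totals(early))   # five weekly totals per channel
```

Any time series classifier would then be trained on the truncated series only, so that its accuracy reflects what is known at that stage of the course.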

artificial intelligence, deep learning, machine learning, (17 more...)

2507.21118

Genre:

Research Report (1.00)
Instructional Material > Online (0.73)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Time Series Analysis (0.40)

What Can Grokking Teach Us About Learning Under Nonstationarity?

Lyle, Clare, Sokar, Ghada, Pascanu, Razvan, Gyorgy, Andras

In continual learning problems, it is often necessary to overwrite components of a neural network's learned representation in response to changes in the data stream; however, neural networks often exhibit primacy bias, whereby early training data hinders the network's ability to generalize on later tasks. While feature-learning dynamics of nonstationary learning problems are not well studied, the emergence of feature-learning dynamics is known to drive the phenomenon of grokking, wherein neural networks initially memorize their training data and only later exhibit perfect generalization. This work conjectures that the same feature-learning dynamics which facilitate generalization in grokking also underlie the ability to overwrite previously learned features, and that methods which accelerate grokking by facilitating feature-learning dynamics are promising candidates for addressing primacy bias in non-stationary learning problems. We then propose a straightforward method to induce feature-learning dynamics as needed throughout training by increasing the effective learning rate, i.e., the ratio between parameter and update norms. We show that this approach both facilitates feature-learning and improves generalization in a variety of settings, including grokking, warm-starting neural network training, and reinforcement learning tasks.

Non-stationarity is ubiquitous in real-world applications of AI systems: datasets may grow over time, correlations may appear and then disappear as trends evolve, and AI systems themselves may take an active role in the generation of their own training data. In this paper, we will propose a framework for understanding and mitigating this degradation in generalization performance which connects three previously disparate phenomena: primacy bias, grokking, and feature-learning dynamics.

Primacy bias: A neural network initially trained on one task is trained on a different data distribution and/or objective, and achieves worse performance than a randomly initialized network on the new task (Achille et al., 2017; Ash & Adams, 2020; Nikishin et al., 2022).
Grokking: A model suddenly closes the generalization gap as a result of (possibly prolonged) further training after it has initially achieved perfect training accuracy (memorization) with poor test-time performance (Power et al., 2022).
Feature learning: A network's ability to make nontrivial changes to its learned representation (a.k.a.
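The abstract's "effective learning rate" relates the size of an update to the size of the parameters. The sketch below takes one reading of that phrase, the norm of the update divided by the norm of the parameters; this direction of the ratio, the helper names, and the toy numbers are all assumptions, and the code is not the authors' method.

```python
import math


def l2_norm(tensors):
    """Global L2 norm over a list of flat parameter (or update) vectors."""
    return math.sqrt(sum(x * x for vec in tensors for x in vec))


def effective_learning_rate(params, updates):
    """Assumed definition: ||update|| / ||params||."""
    param_norm = l2_norm(params)
    if param_norm == 0.0:
        raise ValueError("parameter norm is zero")
    return l2_norm(updates) / param_norm


# Toy example: the same update applied to parameters of different scale.
update = [[0.3, 0.4]]       # norm 0.5
large = [[6.0, 8.0]]        # norm 10
small = [[3.0, 4.0]]        # norm 5
print(effective_learning_rate(large, update))
print(effective_learning_rate(small, update))
```

Under this reading, a fixed update moves a small-norm network relatively further than a large-norm one, which is why parameter-norm growth during training can lower the effective learning rate even when the nominal step size is unchanged.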

artificial intelligence, learning rate, machine learning, (16 more...)

2507.20057

Country: North America > United States (0.28)

Genre:

Instructional Material (0.65)
Research Report (0.64)

Industry: Education > Focused Education > Special Education (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Debunking Optimization Myths in Federated Learning for Medical Image Classification

Lee, Youngjoon, Lee, Hyukjoon, Gong, Jinu, Cao, Yang, Kang, Joonhyuk

Federated Learning (FL) is a collaborative learning method that enables decentralized model training while preserving data privacy. Despite its promise in medical imaging, recent FL methods are often sensitive to local factors such as optimizers and learning rates, limiting their robustness in practical deployments. In this work, we revisit vanilla FL to clarify the impact of edge device configurations, benchmarking recent FL methods on colorectal pathology and blood cell classification tasks. We numerically show that the choice of local optimizer and learning rate has a greater effect on performance than the specific FL method. Moreover, we find that increasing local training epochs can either enhance or impair convergence, depending on the FL method. These findings indicate that appropriate edge-specific configuration is more crucial than algorithmic complexity for achieving effective FL.
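For readers unfamiliar with the baseline, the server step of vanilla FL can be sketched as follows, assuming that "vanilla FL" refers to FedAvg-style averaging weighted by local dataset size. The function name and the flat-list weight representation are illustrative choices, not details from the paper.

```python
def fedavg(client_weights, client_sizes):
    """Average client weight vectors, weighted by each client's sample count."""
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients: the second holds three times as much data.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)  # [2.5, 3.5]
```

The local optimizer, learning rate, and number of local epochs that the abstract highlights all act before this step, when each client produces its weight vector.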

artificial intelligence, fl method, machine learning, (14 more...)

2507.19822

Country:

North America > United States (0.48)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre:

Research Report (1.00)
Instructional Material > Online (0.41)
Instructional Material > Course Syllabus & Notes (0.41)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.72)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.42)

DOA: A Degeneracy Optimization Agent with Adaptive Pose Compensation Capability based on Deep Reinforcement Learning

Li, Yanbin, Xiao, Canran, He, Hongyang, Yuan, Shenghai, Ke, Zong, Yu, Jiajie, Qin, Zixiong, Zhang, Zhiguo, Chi, Wenzheng, Zhang, Wei

Particle filter-based 2D-SLAM is widely used in indoor localization tasks due to its efficiency. However, indoor environments such as long straight corridors can cause severe degeneracy problems in SLAM. In this paper, we use Proximal Policy Optimization (PPO) to train an adaptive degeneracy optimization agent (DOA) to address the degeneracy problem. We propose a systematic methodology to address three critical challenges in traditional supervised learning frameworks: (1) data acquisition bottlenecks in degenerate datasets, (2) inherent quality deterioration of training samples, and (3) ambiguity in annotation protocol design. We design a specialized reward function to guide the agent in developing perception capabilities for degenerate environments. Using the output degeneracy factor as a reference weight, the agent can dynamically adjust the contribution of different sensors to pose optimization. Specifically, the observation distribution is shifted towards the motion model distribution, with the step size determined by a linear interpolation formula related to the degeneracy factor. In addition, we employ a transfer learning module to endow the agent with generalization capabilities across different environments and address the inefficiency of training in degenerate environments. Finally, we conduct ablation studies to demonstrate the rationality of our model design and the role of transfer learning. We also compare the proposed DOA with SOTA methods to prove its superior degeneracy detection and optimization capabilities across various environments.
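The interpolation step can be pictured with a small sketch. It assumes the degeneracy factor lies in [0, 1] and that only the means of the two distributions are blended; the abstract does not give the exact formula, so the function below is a hypothetical stand-in rather than the authors' implementation.

```python
def shift_toward_motion(obs_pose, motion_pose, degeneracy):
    """Linearly interpolate from the observation pose toward the motion-model pose.

    degeneracy = 0 keeps the observation estimate, degeneracy = 1 trusts the
    motion model entirely. Angle wrap-around is ignored in this sketch.
    """
    if not 0.0 <= degeneracy <= 1.0:
        raise ValueError("degeneracy factor must lie in [0, 1]")
    return tuple(o + degeneracy * (m - o) for o, m in zip(obs_pose, motion_pose))


# Hypothetical (x, y, heading) estimates in a corridor-like scene.
scan_estimate = (0.0, 0.0, 0.0)
odometry_estimate = (2.0, 4.0, 1.0)
print(shift_toward_motion(scan_estimate, odometry_estimate, 0.5))  # (1.0, 2.0, 0.5)
```

In a long corridor, where scan matching is poorly constrained along the corridor axis, a high degeneracy factor would push the estimate toward the motion model.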

artificial intelligence, machine learning, reinforcement learning, (16 more...)

2507.19742

Country: Asia > China (0.28)

Genre:

Research Report (0.40)
Instructional Material (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)

Machine-Learning-Assisted Photonic Device Development: A Multiscale Approach from Theory to Characterization

Chen, Yuheng, McNeil, Alexander Montes, Park, Taehyuk, Wilson, Blake A., Iyer, Vaishnavi, Bezick, Michael, Choi, Jae-Ik, Ojha, Rohan, Mahendran, Pravin, Singh, Daksh Kumar, Chitturi, Geetika, Chen, Peigang, Do, Trang, Kildishev, Alexander V., Shalaev, Vladimir M., Moebius, Michael, Cai, Wenshan, Liu, Yongmin, Boltasseva, Alexandra

Photonic device development (PDD) has achieved remarkable success in designing and implementing new devices for controlling light across various wavelengths, scales, and applications, including telecommunications, imaging, sensing, and quantum information processing. PDD is an iterative, five-step process that consists of: i) deriving device behavior from design parameters, ii) simulating device performance, iii) finding the optimal candidate designs from simulations, iv) fabricating the optimal device, and v) measuring device performance. Classically, all these steps involve Bayesian optimization, material science, control theory, and direct physics-driven numerical methods. However, many of these techniques are computationally intractable, monetarily costly, or difficult to implement at scale. In addition, PDD suffers from large optimization landscapes, uncertainties in structural or optical characterization, and difficulties in implementing robust fabrication processes. The advent of machine learning over the past decade has provided novel, data-driven strategies for tackling these challenges, including surrogate estimators for speeding up computations, generative modeling for noisy measurement modeling and data augmentation, reinforcement learning for fabrication, and active learning for experimental physical discovery. In this review, we present a comprehensive perspective on these methods to enable machine-learning-assisted PDD (ML-PDD) for efficient design optimization with powerful generative models, fast simulation and characterization modeling under noisy measurements, and reinforcement learning for fabrication. This review will provide researchers from diverse backgrounds with valuable insights into this emerging topic, fostering interdisciplinary efforts to accelerate the development of complex photonic devices and systems.

inverse design, machine learning, reinforcement learning, (19 more...)

doi: 10.1515/nanoph-2025-0049

2506.20056

Country:

North America > United States > Massachusetts (0.46)
North America > United States > California (0.27)

Genre:

Workflow (1.00)
Research Report (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Health & Medicine (1.00)
Education (1.00)
Government > Regional Government > North America Government > United States Government (0.92)
Energy > Power Industry (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(3 more...)

Nóvoa, Andrea, Magri, Luca

Online model learning with data-assimilated reservoir computers

We propose an online learning framework for forecasting nonlinear spatio-temporal signals (fields). The method integrates (i) dimensionality reduction, here, a simple proper orthogonal decomposition (POD) projection; (ii) a generalized autoregressive model to forecast reduced dynamics, here, a reservoir computer; (iii) online adaptation to update the reservoir computer (the model), here, ensemble sequential data assimilation. We demonstrate the framework on a wake past a cylinder governed by the Navier-Stokes equations, exploring the assimilation of full flow fields (projected onto POD modes) and sparse sensors. Three scenarios are examined: a naïve physical state estimation; a two-fold estimation of physical and reservoir states; and a three-fold estimation that also adjusts the model parameters. The two-fold strategy significantly improves ensemble convergence and reduces reconstruction error compared to the naïve approach. The three-fold approach enables robust online training of partially-trained reservoir computers, overcoming limitations of a priori training. By unifying data-driven reduced order modelling with Bayesian data assimilation, this work opens new opportunities for scalable online model learning for nonlinear time series forecasting.

artificial intelligence, estimation, machine learning, (15 more...)

doi: 10.1007/978-3-031-97567-7_5

2504.16767

Country:

Europe > United Kingdom (0.15)
Europe > Italy (0.14)

Genre:

Instructional Material > Online (0.62)
Research Report (0.40)

Industry: Education > Educational Setting > Online (0.91)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Understanding Learner-LLM Chatbot Interactions and the Impact of Prompting Guidelines

Koyuturk, Cansu, Theophilou, Emily, Patania, Sabrina, Donabauer, Gregor, Martinenghi, Andrea, Antico, Chiara, Telari, Alessia, Testa, Alessia, Bursic, Sathya, Garzotto, Franca, Hernandez-Leo, Davinia, Kruschwitz, Udo, Taibi, Davide, Amenta, Simona, Ruskov, Martin, Ognibene, Dimitri

Large Language Models (LLMs) have transformed human-computer interaction by enabling natural language-based communication with AI-powered chatbots. These models are designed to be intuitive and user-friendly, allowing users to articulate requests with minimal effort. However, despite their accessibility, studies reveal that users often struggle with effective prompting, resulting in inefficient responses. Existing research has highlighted both the limitations of LLMs in interpreting vague or poorly structured prompts and the difficulties users face in crafting precise queries. This study investigates learner-AI interactions through an educational experiment in which participants receive structured guidance on effective prompting. We introduce and compare three types of prompting guidelines: a task-specific framework developed through a structured methodology and two baseline approaches. To assess user behavior and prompting efficacy, we analyze a dataset of 642 interactions from 107 users. Using Von NeuMidas, an extended pragmatic annotation schema for LLM interaction analysis, we categorize common prompting errors and identify recurring behavioral patterns. We then evaluate the impact of different guidelines by examining changes in user behavior, adherence to prompting strategies, and the overall quality of AI-generated responses. Our findings provide a deeper understanding of how users engage with LLMs and the role of structured prompting guidance in enhancing AI-assisted communication. By comparing different instructional frameworks, we offer insights into more effective approaches for improving user competency in AI interactions, with implications for AI literacy, chatbot usability, and the design of more responsive AI systems.

large language model, machine learning, natural language, (20 more...)

doi: 10.1007/978-3-031-98417-4_26

2504.0784

Country: Europe > Italy (0.69)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material (1.00)

Industry: Education > Educational Setting > K-12 Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

arXiv.org Artificial Intelligence, Jul-28-2025

Linearly Convergent Algorithms for Nonsmooth Problems with Unknown Smooth Pieces

Zhang, Zhe, Sra, Suvrit

We develop efficient algorithms for optimizing piecewise smooth (PWS) functions where the underlying partition of the domain into smooth pieces is unknown. For PWS functions satisfying a quadratic growth (QG) condition, we propose a bundle-level (BL) type method that achieves global linear convergence; to our knowledge, this is the first such result for any algorithm for this problem class. We extend this method to handle approximately PWS functions and to solve weakly-convex PWS problems, improving the state-of-the-art complexity to match the benchmark for smooth non-convex optimization. Furthermore, we introduce the first verifiable and accurate termination criterion for PWS optimization. Similar to the gradient norm in smooth optimization, this certificate tightly characterizes the optimality gap under the QG condition, and can moreover be evaluated without knowledge of any problem parameters. We develop a search subroutine for this certificate and embed it within a guess-and-check framework, resulting in an almost parameter-free algorithm for both the convex QG and weakly-convex settings.
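For reference, the quadratic growth condition mentioned above is usually stated as below. The symbols $f^\ast$ (optimal value), $X^\ast$ (set of minimizers), and $\mu$ (growth constant) are notation assumed here, and the paper may use a different constant convention (for example $\mu$ in place of $\mu/2$).

```latex
f(x) - f^\ast \;\ge\; \frac{\mu}{2}\,\operatorname{dist}(x, X^\ast)^2
\quad \text{for all feasible } x, \qquad \mu > 0,
\qquad \text{where } \operatorname{dist}(x, X^\ast) = \min_{y \in X^\ast} \lVert x - y \rVert .
```

The condition is weaker than strong convexity: it bounds the objective gap from below by the squared distance to the solution set, without requiring the minimizer to be unique.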

artificial intelligence, certificate, machine learning, (17 more...)

2507.19465

Country: Europe (0.45)

Genre:

Research Report (0.64)
Instructional Material (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Abouelazm, Ahmed, Ratz, Johannes, Schörner, Philip, Zöllner, J. Marius

Diverse and Adaptive Behavior Curriculum for Autonomous Driving: A Student-Teacher Framework with Multi-Agent RL

arXiv.org Artificial Intelligence, Jul-28-2025

Autonomous driving faces challenges in navigating complex real-world traffic, requiring safe handling of both common and critical scenarios. Reinforcement learning (RL), a prominent method in end-to-end driving, enables agents to learn through trial and error in simulation. However, RL training often relies on rule-based traffic scenarios, limiting generalization. Additionally, current scenario generation methods focus heavily on critical scenarios, neglecting a balance with routine driving behaviors. Curriculum learning, which progressively trains agents on increasingly complex tasks, is a promising approach to improving the robustness and coverage of RL driving policies. However, existing research mainly emphasizes manually designed curricula, focusing on scenery and actor placement rather than traffic behavior dynamics. This work introduces a novel student-teacher framework for automatic curriculum learning. The teacher, a graph-based multi-agent RL component, adaptively generates traffic behaviors across diverse difficulty levels. An adaptive mechanism adjusts task difficulty based on student performance, ensuring exposure to behaviors ranging from common to critical. The student, though exchangeable, is realized as a deep RL agent with partial observability, reflecting real-world perception constraints. Results demonstrate the teacher's ability to generate diverse traffic behaviors. The student, trained with automatic curricula, outperformed agents trained on rule-based traffic, achieving higher rewards and exhibiting balanced, assertive driving.
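The adaptive mechanism can be pictured with a deliberately simple rule-based sketch: raise the difficulty when the student succeeds often and lower it otherwise. In the paper the teacher is a graph-based multi-agent RL component, so the threshold rule, the [0, 1] difficulty scale, and all parameter names below are illustrative assumptions only.

```python
def adjust_difficulty(difficulty, success_rate, target=0.7, step=0.1):
    """Move difficulty up or down by `step`, clamped to the range [0, 1]."""
    if success_rate > target:
        difficulty += step
    elif success_rate < target:
        difficulty -= step
    return min(1.0, max(0.0, difficulty))


# Hypothetical training loop: the success rates would come from student rollouts.
difficulty = 0.5
for success_rate in (0.9, 0.9, 0.4):
    difficulty = adjust_difficulty(difficulty, success_rate)
print(round(difficulty, 2))  # 0.6
```

Keeping the student near a target success rate is one common way to expose it to both routine and critical behaviors without the task becoming trivially easy or unlearnably hard.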

artificial intelligence, machine learning, student, (16 more...)

2507.19146

Country: Europe > Germany (0.28)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.40)

Industry:

Education (1.00)
Transportation > Ground > Road (0.62)
Information Technology > Robotics & Automation (0.62)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)