AITopics

Country:

North America > United States (0.04)
Asia > Middle East > Oman (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing Systems, Feb-7-2026, 23:53:06 GMT

Appendix A Mathematical Preliminaries

This is denoted as x ∼ subG(σ²).
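The definition this notation refers to is not shown in the excerpt. For orientation only, the common textbook form of the sub-Gaussian condition with variance proxy σ² is given below; the appendix itself may state an equivalent variant.

```latex
\mathbb{E}\!\left[\exp\big(\lambda\,(x - \mathbb{E}[x])\big)\right]
\;\le\; \exp\!\left(\frac{\lambda^{2}\sigma^{2}}{2}\right)
\quad \text{for all } \lambda \in \mathbb{R},
\qquad \text{written } x \sim \mathrm{subG}(\sigma^{2}).
```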

artificial intelligence, distribution, gaussian distribution, (16 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

Neural Information Processing Systems, Feb-7-2026, 23:52:59 GMT

Higher-Order Certification for Randomized Smoothing

In this work, we propose a framework to improve the certified safety region for these smoothed classifiers without changing the underlying smoothing scheme.

artificial intelligence, machine learning, safety region, (17 more...)

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

arXiv.org Artificial Intelligence, Nov-18-2025

Variable Impedance Control for Floating-Base Supernumerary Robotic Leg in Walking Assistance

Huo, Jun, Xu, Kehan, Li, Chengyao, Cao, Yu, Zuo, Jie, Chen, Xinxing, Huang, Jian

Abstract: In human-robot systems, ensuring safety during force control in the presence of both internal and external disturbances is crucial. As a typical loosely coupled floating-base robot system, the supernumerary robotic leg (SRL) system is particularly susceptible to strong internal disturbances. To address the challenge posed by the floating base, we investigated the dynamics model of the loosely coupled SRL and designed a hybrid position/force impedance controller to fit dynamic torque input. An efficient variable impedance control (VIC) method is developed to enhance human-robot interaction, particularly in scenarios involving external force disturbances. By dynamically adjusting impedance parameters, VIC improves the dynamic switching between rigidity and flexibility, so that it can adapt to unknown environmental disturbances in different states. An efficient, real-time, stability-guaranteed network for generating impedance parameters is specifically designed for the proposed SRL to achieve shock mitigation and high-rigidity support. Simulations and experiments validate the system's effectiveness, demonstrating its ability to maintain smooth signal transitions in flexible states while providing strong support forces in rigid states. This approach provides a practical solution for accommodating individual gait variations in interaction, and significantly advances the safety and adaptability of human-robot systems.

artificial intelligence, impedance, impedance control, (16 more...)

doi: 10.1109/LRA.2025.3588400

2511.12184

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.93)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.75)

Nieto-Cardenas, Juliana, Kramer, Erin Joy, Kurto, Peter, Dickey, Ethan, Bejarano, Andres

Owlgorithm: Supporting Self-Regulated Learning in Competitive Programming through LLM-Driven Reflection

arXiv.org Artificial Intelligence, Nov-14-2025

We present Owlgorithm, an educational platform that supports Self-Regulated Learning (SRL) in competitive programming (CP) through AI-generated reflective questions. Leveraging GPT-4o, Owlgorithm produces context-aware, metacognitive prompts tailored to individual student submissions. Integrated into a second- and third-year CP course, the system provided reflective prompts adapted to student outcomes: guiding deeper conceptual insight for correct solutions and structured debugging for partial or failed ones. Our exploratory assessment of student ratings and TA feedback revealed both promising benefits and notable limitations. While many found the generated questions useful for reflection and debugging, concerns were raised about feedback accuracy and classroom usability. These results suggest advantages of LLM-supported reflection for novice programmers, though refinements are needed to ensure reliability and pedagogical value for advanced learners. From our experience, several key insights emerged: GenAI can effectively support structured reflection, but careful prompt design, dynamic adaptation, and usability improvements are critical to realizing its potential in education. We offer specific recommendations for educators using similar tools and outline next steps to enhance Owlgorithm's educational impact. The underlying framework may also generalize to other reflective learning contexts.

large language model, machine learning, natural language, (18 more...)

doi: 10.1145/3770762.3772662

2511.09969

Country:

North America > United States > Missouri (0.47)
North America > United States > Indiana > Tippecanoe County (0.15)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.88)

Industry: Education > Educational Setting (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

arXiv.org Artificial Intelligence, Oct-31-2025

Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

Deng, Yihe, Hsu, I-Hung, Yan, Jun, Wang, Zifeng, Han, Rujun, Zhang, Gufeng, Chen, Yanfei, Wang, Wei, Pfister, Tomas, Lee, Chen-Yu

Large Language Models (LLMs) often struggle with problems that require multi-step reasoning. For small-scale open-source models, Reinforcement Learning with Verifiable Rewards (RLVR) fails when correct solutions are rarely sampled even after many attempts, while Supervised Fine-Tuning (SFT) tends to overfit long demonstrations through rigid token-by-token imitation. To address this gap, we propose Supervised Reinforcement Learning (SRL), a framework that reformulates problem solving as generating a sequence of logical "actions". SRL trains the model to generate an internal reasoning monologue before committing to each action. It provides smoother rewards based on the similarity between the model's actions and expert actions extracted from the SFT dataset in a step-wise manner. This supervision offers richer learning signals even when all rollouts are incorrect, while encouraging flexible reasoning guided by expert demonstrations. As a result, SRL enables small models to learn challenging problems previously unlearnable by SFT or RLVR. Moreover, initializing training with SRL before refining with RLVR yields the strongest overall performance. Beyond reasoning benchmarks, SRL generalizes effectively to agentic software engineering tasks, establishing it as a robust and versatile training framework for reasoning-oriented LLMs.
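The step-wise similarity reward described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure; the sketch below assumes a character-level sequence-matching ratio from Python's standard `difflib`, and the function names `step_reward` and `stepwise_rewards` are hypothetical, chosen for illustration only.

```python
from difflib import SequenceMatcher


def step_reward(model_action: str, expert_action: str) -> float:
    """Similarity in [0, 1] between one generated action and the expert's action.

    Assumption: difflib's matching ratio stands in for the similarity measure,
    which the abstract does not name.
    """
    return SequenceMatcher(None, model_action, expert_action).ratio()


def stepwise_rewards(model_actions, expert_actions):
    """One dense reward per step, pairing actions at the same step index."""
    return [step_reward(m, e) for m, e in zip(model_actions, expert_actions)]


if __name__ == "__main__":
    expert = ["let y = x + 1", "solve y = 3", "x = 2"]
    rollout = ["let y = x + 1", "solve y = 4", "x = 3"]
    # Even though the final answer is wrong, earlier steps still earn reward.
    print(stepwise_rewards(rollout, expert))
```

Unlike a single pass/fail signal on the final answer, each step receives partial credit, which matches the abstract's point that the signal stays informative even when every rollout is incorrect.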

large language model, machine learning, reinforcement learning, (19 more...)

2510.25992

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Arora, Devansh, Kumar, Nitin, Gupta, Sukrit

Does the Skeleton-Recall Loss Really Work?

arXiv.org Artificial Intelligence, Aug-18-2025

Image segmentation is an important and widely performed task in computer vision. Accomplishing effective image segmentation in diverse settings often requires custom model architectures and loss functions. One family of methods that specializes in segmenting thin tubular structures relies on topology-preservation-based loss functions. These methods often use a pixel skeletonization process that is claimed to generate more precise segmentation masks of thin tubes and to better capture structures that other models often miss. One such method, Skeleton Recall Loss (SRL), proposed by Kirchhoff et al., was stated to produce state-of-the-art results on benchmark tubular datasets. In this work, we performed a theoretical analysis of the gradients of the SRL loss. Upon comparing the performance of the proposed method on some of the tubular datasets (those used in the original work, along with some additional datasets), we found that the performance of SRL-based segmentation models did not exceed that of traditional baseline models. By providing both a theoretical explanation and empirical evidence, this work critically evaluates the limitations of topology-based loss functions, offering valuable insights for researchers aiming to develop more effective segmentation models for complex tubular structures.

artificial intelligence, dataset, machine learning, (15 more...)

2508.11374

Country: North America > Canada (0.28)

Genre: Research Report > Experimental Study (0.72)

Industry: Health & Medicine > Diagnostic Medicine (0.69)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Neural Information Processing Systems, Aug-14-2025, 05:28:44 GMT

Grounded Video Situation Recognition

This task poses several challenges in identifying, disambiguating, and co-referencing entities across multiple verb-role pairs, but also faces some challenges of evaluation.

computer vision, prediction, recognition, (13 more...)

Country:

North America > United States (0.04)
Asia > Middle East > Oman (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Jun-27-2025

Steering Your Diffusion Policy with Latent Space Reinforcement Learning

Wagenmaker, Andrew, Nakamoto, Mitsuhiko, Zhang, Yunchu, Park, Seohong, Yagoub, Waleed, Nagabandi, Anusha, Gupta, Abhishek, Levine, Sergey

Robotic control policies learned from human demonstrations have achieved impressive results in many real-world applications. However, in scenarios where initial performance is not satisfactory, as is often the case in novel open-world settings, such behavioral cloning (BC)-learned policies typically require collecting additional human demonstrations to further improve their behavior, an expensive and time-consuming process. In contrast, reinforcement learning (RL) holds the promise of enabling autonomous online policy improvement, but often falls short of achieving this due to the large number of samples it typically requires. In this work we take steps towards enabling fast autonomous adaptation of BC-trained policies via efficient real-world RL. Focusing in particular on diffusion policies, a state-of-the-art BC methodology, we propose diffusion steering via reinforcement learning (DSRL): adapting the BC policy by running RL over its latent-noise space. We show that DSRL is highly sample efficient, requires only black-box access to the BC policy, and enables effective real-world autonomous policy improvement. Furthermore, DSRL avoids many of the challenges associated with finetuning diffusion policies, obviating the need to modify the weights of the base policy at all. We demonstrate DSRL on simulated benchmarks, real-world robotic tasks, and for adapting pretrained generalist policies, illustrating its sample efficiency and effective performance at real-world policy improvement.
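The idea of steering a frozen policy through its latent noise, using only black-box access, can be shown with a toy sketch. This is not the DSRL algorithm: the RL component is swapped for plain random-search hill climbing to keep the example short, and `frozen_policy`, `reward`, and `steer_noise` are hypothetical stand-ins rather than anything from the paper.

```python
import math
import random


def frozen_policy(state: float, noise: float) -> float:
    """Toy stand-in for a pretrained policy mapping (state, latent noise) to an action.

    It is only ever called, never modified, mirroring black-box access.
    """
    return math.tanh(0.5 * state + noise)


def reward(action: float, target: float = 0.8) -> float:
    """Toy task reward: negative squared distance to a desired action."""
    return -((action - target) ** 2)


def steer_noise(state: float, steps: int = 200, sigma: float = 0.3, seed: int = 0):
    """Search over the latent noise fed to the frozen policy.

    Assumption: hill climbing replaces the RL algorithm used in the paper.
    """
    rng = random.Random(seed)
    best_noise = rng.gauss(0.0, 1.0)  # what an unsteered policy would sample
    best_reward = reward(frozen_policy(state, best_noise))
    for _ in range(steps):
        candidate = best_noise + rng.gauss(0.0, sigma)
        r = reward(frozen_policy(state, candidate))
        if r > best_reward:  # keep only improvements
            best_noise, best_reward = candidate, r
    return best_noise, best_reward


if __name__ == "__main__":
    noise, r = steer_noise(state=0.0)
    print(f"steered noise={noise:.3f}, reward={r:.5f}")
```

The point of the sketch is structural: all adaptation happens in the choice of noise, while the base policy's parameters stay untouched.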

arxiv preprint arxiv, machine learning, reinforcement learning, (14 more...)

2506.15799

Genre: Research Report > New Finding (0.46)

Industry:

Transportation (0.48)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Hoppe, Heiko, Baty, Léo, Bouvier, Louis, Parmentier, Axel, Schiffer, Maximilian

Structured Reinforcement Learning for Combinatorial Decision-Making

arXiv.org Machine Learning, May-27-2025

Reinforcement learning (RL) is increasingly applied to real-world problems involving complex and structured decisions, such as routing, scheduling, and assortment planning. These settings challenge standard RL algorithms, which struggle to scale, generalize, and exploit structure in the presence of combinatorial action spaces. We propose Structured Reinforcement Learning (SRL), a novel actor-critic framework that embeds combinatorial optimization layers into the actor neural network. We enable end-to-end learning of the actor via Fenchel-Young losses and provide a geometric interpretation of SRL as a primal-dual algorithm in the dual of the moment polytope. Across six environments with exogenous and endogenous uncertainty, SRL matches or surpasses the performance of unstructured RL and imitation learning on static tasks and improves over these baselines by up to 92% on dynamic problems, with improved stability and convergence speed.
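For readers unfamiliar with Fenchel-Young losses, the general form is L(θ; y) = Ω*(θ) + Ω(y) − ⟨θ, y⟩ for a convex regularizer Ω with conjugate Ω*. The sketch below is a minimal instance, assuming Ω is the negative Shannon entropy on the probability simplex, so that Ω* is log-sum-exp. The abstract does not say which regularizer the combinatorial layers use, so this choice is purely illustrative.

```python
import math


def logsumexp(theta):
    """Numerically stable log(sum(exp(theta))); conjugate of negative entropy."""
    m = max(theta)
    return m + math.log(sum(math.exp(t - m) for t in theta))


def neg_entropy(y):
    """Omega(y) = sum_i y_i log y_i, with the convention 0 * log 0 = 0."""
    return sum(p * math.log(p) for p in y if p > 0)


def fenchel_young_loss(theta, y):
    """L(theta; y) = Omega*(theta) + Omega(y) - <theta, y>.

    Assumption: Omega is negative Shannon entropy on the simplex.
    """
    inner = sum(t * p for t, p in zip(theta, y))
    return logsumexp(theta) + neg_entropy(y) - inner


if __name__ == "__main__":
    theta = [1.0, -0.5, 2.0]
    one_hot = [0.0, 0.0, 1.0]
    # For a one-hot target this reduces to the usual cross-entropy loss.
    print(fenchel_young_loss(theta, one_hot))
```

With this regularizer the loss is non-negative and vanishes exactly when y equals softmax(θ), which is the property that makes it usable as a training objective for the actor's scores.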

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2505.19053

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)