AITopics | Instructional Material

Instructional Material

HARDMath2: A Benchmark for Applied Mathematics Built by Students as Part of a Graduate Class

Neural Information Processing Systems | Jun-15-2026, 16:51:24 GMT

Large language models (LLMs) have shown remarkable progress in mathematical problem-solving, but evaluation has largely focused on problems that have exact analytical solutions or involve formal proofs, often overlooking approximation-based problems ubiquitous in applied science and engineering. To fill this gap, we build on prior work and present HARDMath2, a dataset of 211 original problems covering the core topics in an introductory graduate applied math class, including boundary-layer analysis, WKB methods, asymptotic solutions of nonlinear partial differential equations, and the asymptotics of oscillatory integrals. This dataset was designed and verified by the students and instructors of a core graduate applied mathematics course at Harvard. We built the dataset through a novel collaborative environment that challenges students to write and refine difficult problems consistent with the class syllabus, peer-validate solutions, test different models, and automatically check LLM-generated solutions against their own answers and numerical ground truths. Evaluation results show that leading frontier models still struggle with many of the problems in the dataset, highlighting a gap in the mathematical reasoning skills of current LLMs. Importantly, students identified strategies to create increasingly difficult problems by interacting with the models and exploiting common failure modes. This back-and-forth with the models not only resulted in a richer and more challenging benchmark but also led to qualitative improvements in the students' understanding of the course material, which is increasingly important as we enter an age where state-of-the-art language models can solve many challenging problems across a wide range of fields.

large language model, machine learning, natural language, (22 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Curriculum (0.48)
Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning

Neural Information Processing Systems | Jun-15-2026, 13:14:33 GMT

Sparse-reward reinforcement learning (RL) can model a wide range of highly complex tasks. Solving sparse-reward tasks is RL's core premise, requiring efficient exploration coupled with long-horizon credit assignment, and overcoming these challenges is key for building self-improving agents with superhuman ability. Prior work commonly explores with the objective of solving many sparse-reward tasks, making exploration of individual high-dimensional, long-horizon tasks intractable. We argue that solving such challenging tasks requires solving simpler tasks that are relevant to the target task, i.e., whose achievement will teach the agent skills required for solving the target task. We demonstrate that this sense of direction, necessary for effective exploration, can be extracted from existing RL algorithms, without leveraging any prior information. To this end, we propose a method for directed sparse-reward goal-conditioned very long-horizon RL (DISCOVER), which selects exploratory goals in the direction of the target task. We connect DISCOVER to principled exploration in bandits, formally bounding the time until the target task becomes achievable in terms of the agent's initial distance to the target, but independent of the volume of the space of all tasks. We then perform a thorough evaluation in high-dimensional environments. We find that the directed goal selection of DISCOVER solves exploration problems that are beyond the reach of prior state-of-the-art exploration methods in RL.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America (0.28)
Europe (0.28)

Genre:

Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (0.81)

Industry:

Leisure & Entertainment > Games (1.00)
Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Structure-free Graph Condensation: From Large-scale Graphs to Condensed Graph-free Data

Neural Information Processing Systems | Jun-15-2026, 03:18:09 GMT

Graph condensation, which reduces the size of a large-scale graph by synthesizing a small-scale condensed graph as its substitute, has immediate benefits for various graph learning tasks. However, existing graph condensation methods rely on the joint optimization of nodes and structures in the condensed graph, and overlook critical issues in effectiveness and generalization ability. In this paper, we advocate a new Structure-Free Graph Condensation paradigm, named SFGC, to distill a large-scale graph into a small-scale graph node set without explicit graph structures, i.e., graph-free data. Our idea is to implicitly encode topology structure information into the node attributes in the synthesized graph-free data, whose topology is reduced to an identity matrix.

data mining, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre: Instructional Material (0.46)

Industry:

Information Technology > Security & Privacy (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing

Neural Information Processing Systems | Jun-14-2026, 23:20:11 GMT

We introduce ExRec, a general framework for personalized exercise recommendation with semantically-grounded knowledge tracing. Our method builds on the observation that existing exercise recommendation approaches simulate student performance via knowledge tracing (KT) but often overlook two key aspects: (a) the semantic content of questions and (b) the sequential, structured progression of student learning. To address this, ExRec presents an end-to-end pipeline, from annotating the KCs of questions and learning their semantic representations to training KT models and optimizing several reinforcement learning (RL) methods. Moreover, we improve standard Q-learning-based continuous RL methods via a tailored model-based value estimation (MVE) approach that directly leverages the components of the KT model in estimating cumulative knowledge improvement.

machine learning, natural language, reinforcement learning, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material (1.00)
Workflow (0.93)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.93)
Education > Curriculum > Subject-Specific Education (0.67)
Education > Assessment & Standards (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

ALMGuard: Safety Shortcuts and Where to Find Them as Guardrails for Audio-Language Models

Neural Information Processing Systems | Jun-14-2026, 16:52:18 GMT

Recent advances in Audio-Language Models (ALMs) have significantly improved multimodal understanding capabilities. However, the introduction of the audio modality also brings new and unique vulnerability vectors. Previous studies have proposed jailbreak attacks that specifically target ALMs, revealing that defenses directly transferred from traditional audio adversarial attacks or text-based Large Language Model (LLM) jailbreaks are largely ineffective against these ALM-specific threats. To address this issue, we propose ALMGuard, the first defense framework tailored to ALMs. Based on the assumption that safety-aligned shortcuts naturally exist in ALMs, we design a method to identify universal Shortcut Activation Perturbations (SAPs) that serve as triggers that activate the safety shortcuts to safeguard ALMs at inference time. To better sift out effective triggers while preserving the model's utility on benign tasks, we further propose Mel-Gradient Sparse Mask (M-GSM), which restricts perturbations to Mel-frequency bins that are sensitive to jailbreaks but insensitive to speech understanding. Both theoretical analyses and empirical results demonstrate the robustness of our method against both seen and unseen attacks. Overall, ALMGuard reduces the average success rate of advanced ALM-specific jailbreak attacks to 4.6% across four models, while maintaining comparable utility on benign benchmarks, establishing it as the new state of the art.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Asia (0.28)
North America (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)
Law Enforcement & Public Safety (0.93)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zero-Shot Performance Prediction for Probabilistic Scaling Laws

Neural Information Processing Systems | Jun-14-2026, 11:51:03 GMT

The prediction of learning curves for Natural Language Processing (NLP) models enables informed decision-making to meet specific performance objectives, while reducing computational overhead and lowering the costs associated with dataset acquisition and curation. In this work, we formulate the prediction task as a multitask learning problem, where each task's data is modelled as being organized within a two-layer hierarchy. To model the shared information and dependencies across tasks and hierarchical levels, we employ latent variable multi-output Gaussian Processes, which account for task correlations and support zero-shot prediction of learning curves (LCs). We demonstrate that this approach facilitates the development of probabilistic scaling laws at lower costs. Applying an active learning strategy, LCs can be queried to reduce predictive uncertainty and provide predictions close to ground truth scaling laws.

large language model, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

North America > United States (0.45)
Asia (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)
Instructional Material (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Education (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

The challenge of being neurodivergent in Japan's culture of conformity

The Japan Times | Jun-12-2026, 02:40:00 GMT

As awareness grows, more Japanese adults are receiving answers to struggles that went unrecognized for years. Social camouflaging can help neurodivergent people navigate social situations, but researchers say the effort often comes with significant emotional and mental strain. The first major crisis in Yosuke's life came when he stood in front of his students. Until then, the 24-year-old had navigated his life with few obstacles. He had done well in school, scored highly on IQ tests and graduated from university without any major issues. But after securing his dream job as a geography and history teacher at a girls' high school two years ago, cracks began to show. "I couldn't read the room," says Yosuke, who recalls struggling to organize course materials and wrap up classes on time.

artificial intelligence, developmental disorder, social media, (13 more...)

The Japan Times

Country: Asia > Japan (0.94)

Genre:

Research Report > New Finding (0.34)
Instructional Material (0.34)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)
Government (1.00)
Education (1.00)

Technology:

Information Technology > Communications > Social Media (0.95)
Information Technology > Artificial Intelligence (0.88)

Nested Learning: The Illusion of Deep Learning Architectures

Neural Information Processing Systems | Jun-11-2026, 22:40:20 GMT

Over the last decades, developing more powerful neural architectures and simultaneously designing optimization algorithms to effectively train them have been the core of research efforts to enhance the capability of machine learning models. Despite recent progress, particularly in developing Language Models (LMs), there are fundamental challenges and unanswered questions about how such models can continually learn/memorize, self-improve, and find "effective solutions." In this paper, we present a new learning paradigm, called Nested Learning (NL), that coherently represents a model with a set of nested, multi-level, and/or parallel optimization problems, each with its own "context flow". NL reveals that existing deep learning methods learn from data through compressing their own context flow, and explains how in-context learning emerges in large models. NL suggests a path (a new dimension to deep learning) to design more expressive learning algorithms with more "levels", resulting in higher-order in-context learning abilities. In addition to its neuroscientifically plausible and mathematically white-box nature, we advocate for its importance by presenting three core contributions: (1) Deep Optimizers: Based on NL, we show that well-known gradient-based optimizers (e.g., Adam, SGD with Momentum, etc.) are in fact associative memory modules that aim to compress the gradients with gradient descent. Building on this insight, we present a set of more expressive optimizers with deep memory and/or more powerful learning rules; (2) Self-Modifying Titans: Taking advantage of NL's insights on learning algorithms, we present a novel sequence model that learns how to modify itself by learning its own update algorithm; and (3) Continuum Memory System: We present a new formulation for a memory system that generalizes the traditional viewpoint of "long-term/short-term memory".
Combining our self-modifying sequence model with the continuum memory system, we present a learning module, called Hope, showing promising results in language modeling, continual learning, and long-context reasoning tasks.

artificial intelligence, machine learning, proceedings, (10 more...)

Neural Information Processing Systems

Genre: Instructional Material > Course Syllabus & Notes (0.58)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Defining Autonomy for Wellness Robots in Senior Care

IEEE Spectrum Robotics | Jun-11-2026, 10:00:01 GMT

This White Paper gives engineers, researchers, and care professionals an overview of how socially assistive wellness robots can support senior wellness, and how a framework can measure their autonomy. What you will learn about: Why the senior care crisis exceeds incremental healthcare automation. Staffing shortages, rising dementia prevalence, and limited daily wellness programming all play a part. How the seven ICAA dimensions of wellness define a distinct category of socially assistive robot, separate from companion devices, medical devices, and general-purpose humanoids. How the Care Robot Autonomy Scale (CRAS), a six-level framework modeled on a driving-automation standard, measures autonomy across four wellness dimensions. What technical capabilities, clinical evidence, and a three-phase roadmap suggest about the path from current practice toward full wellness autonomy in the early 2030s.

artificial intelligence, creativity & intelligence, scientific discovery, (16 more...)

IEEE Spectrum Robotics

Genre: Instructional Material (0.56)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.38)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.52)
Information Technology > Artificial Intelligence > Cognitive Science > Creativity & Intelligence (0.41)

Fire360: A Benchmark for Robust Perception and Episodic Memory in Degraded 360 Firefighting Video

Neural Information Processing Systems | Jun-11-2026, 08:50:41 GMT

Modern AI systems struggle most in environments where reliability is critical: scenes with smoke, poor visibility, and structural deformation. Each year, tens of thousands of firefighters are injured on duty, often due to breakdowns in situational perception. We introduce Fire360, a benchmark for evaluating perception and reasoning in safety-critical firefighting scenarios. The dataset includes 228 360-degree videos from professional training sessions under diverse conditions (e.g., low light, thermal distortion), annotated with action segments, object locations, and degradation metadata. Fire360 supports five tasks: Visual Question Answering, Temporal Action Captioning, Object Localization, Safety-Critical Reasoning, and Transformed Object Retrieval (TOR). TOR tests whether models can match pristine exemplars to fire-damaged counterparts in unpaired scenes, evaluating episodic memory under irreversible visual transformations. While human experts achieve 83.5% on TOR, models like GPT-4o lag significantly, exposing failures in reasoning under degradation. By releasing Fire360 and its evaluation suite, we aim to advance models that not only see, but also remember, reason, and act under uncertainty.

artificial intelligence, natural language, proceedings, (5 more...)

Neural Information Processing Systems

Genre: Instructional Material (0.60)

Industry: Law Enforcement & Public Safety > Fire & Emergency Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.63)
