AITopics | taxnodes:Technology: Instructional Materials


On the Equivalence between Online and Private Learnability beyond Binary Classification

Neural Information Processing Systems, May-31-2025, 17:47:07 GMT

Alon et al. [4] and Bun et al. [10] recently showed that online learnability and private PAC learnability are equivalent in binary classification. We investigate whether this equivalence extends to multi-class classification and regression. First, we show that private learnability implies online learnability in both settings. Our extension involves studying a novel variant of the Littlestone dimension that depends on a tolerance parameter and on an appropriate generalization of the concept of threshold functions beyond binary classification. Second, we show that while online learnability continues to imply private learnability in multi-class classification, current proof techniques encounter significant hurdles in the regression setting. While the equivalence for regression remains open, we provide non-trivial sufficient conditions for an online learnable class to also be privately learnable.

artificial intelligence, learnability, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)

Genre: Instructional Material > Online (0.35)

Industry:

Information Technology > Security & Privacy (0.93)
Education > Educational Setting > Online (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Causal Imitation for Markov Decision Processes: a Partial Identification Approach

Neural Information Processing Systems, May-31-2025, 17:13:00 GMT

Imitation learning enables an agent to learn from expert demonstrations when the performance measure is unknown and the reward signal is not specified. Standard imitation methods do not generally apply when the learner's and the expert's sensory capabilities are mismatched and demonstrations are contaminated with unobserved confounding bias. To address these challenges, recent advancements in causal imitation learning have been pursued. However, these methods often require access to underlying causal structures that might not always be available, posing practical challenges. In this paper, we investigate robust imitation learning within the framework of canonical Markov Decision Processes (MDPs) using partial identification, allowing the agent to achieve expert performance even when the system dynamics are not uniquely determined from the confounded expert demonstrations. Specifically, we first demonstrate theoretically that when unobserved confounders (UCs) exist in an MDP, the learner is generally unable to imitate expert performance. We then explore imitation learning in partially identifiable settings, where either the transition distribution or the reward function is non-identifiable from the available data and knowledge. Augmenting the celebrated GAIL method (Ho & Ermon, 2016), our analysis leads to two novel causal imitation algorithms that can obtain effective policies guaranteed to achieve expert performance.

imitator, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre:

Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (0.68)

Industry:

Information Technology (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.60)


AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers

Jake Grigsby, Justin Sasek, Daniel Adebi

Neural Information Processing Systems, May-31-2025, 17:07:09 GMT

Language models trained on diverse datasets unlock generalization by in-context learning. Reinforcement Learning (RL) policies can achieve a similar effect by meta-learning within the memory of a sequence model. However, meta-RL research primarily focuses on adapting to minor variations of a single task. It is difficult to scale towards more general behavior without confronting challenges in multi-task optimization, and few solutions are compatible with meta-RL's goal of learning from large training sets of unlabeled tasks. To address this challenge, we revisit the idea that multi-task RL is bottlenecked by imbalanced training losses created by uneven return scales across different tasks. We build upon recent advancements in Transformer-based (in-context) meta-RL and evaluate a simple yet scalable solution where both an agent's actor and critic objectives are converted to classification terms that decouple optimization from the current scale of returns. Large-scale comparisons in Meta-World ML45, Multi-Game Procgen, Multi-Task POPGym, Multi-Game Atari, and BabyAI find that this design unlocks significant progress in online multi-task adaptation and memory problems without explicit task labels.

large language model, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)
Instructional Material (0.67)

Industry:

Leisure & Entertainment > Games (1.00)
Education (1.00)
Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)


Online Fast Adaptation and Knowledge Accumulation (OSAKA): a New Approach to Continual Learning

Neural Information Processing Systems, May-31-2025, 17:02:33 GMT

Continual learning agents experience a stream of (related) tasks. The main challenge is that the agent must not forget previous tasks and also adapt to novel tasks in the stream. We are interested in the intersection of two recent continual-learning scenarios. In meta-continual learning, the model is pre-trained using meta-learning to minimize catastrophic forgetting of previous tasks. In continual-meta learning, the aim is to train agents for faster remembering of previous tasks through adaptation.

artificial intelligence, deep learning, machine learning, (13 more...)

Neural Information Processing Systems

Country:

Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.45)
North America > Canada > Quebec (0.28)

Genre:

Instructional Material (0.46)
Research Report > New Finding (0.46)

Industry:

Education (0.93)
Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.68)


GAN Memory with No Forgetting

Neural Information Processing Systems, May-31-2025, 16:49:16 GMT

As a fundamental issue in lifelong learning, catastrophic forgetting is directly caused by inaccessible historical data; accordingly, if the data (information) were memorized perfectly, no forgetting should be expected. Motivated by that, we propose a GAN memory for lifelong learning, which is capable of remembering a stream of datasets via generative processes, with no forgetting. Our GAN memory is based on recognizing that one can modulate the "style" of a GAN model to form perceptually-distant targeted generation. Accordingly, we propose to do sequential style modulations atop a well-behaved base GAN model, to form sequential targeted generative models, while simultaneously benefiting from the transferred base knowledge. The GAN memory, which is motivated by lifelong learning, is therefore itself manifested by a form of lifelong learning, via forward transfer and modulation of information from prior tasks. Experiments demonstrate the superiority of our method over existing approaches and its effectiveness in alleviating catastrophic forgetting for lifelong classification problems.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
North America > Canada (0.14)

Genre: Instructional Material (0.96)

Industry:

Education > Educational Setting (1.00)
Health & Medicine > Therapeutic Area (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)


A Additional details for experiment presented in Section 3 Motivation We trained each agent i with online Q-learning [33] on the Q

Neural Information Processing Systems, May-31-2025, 13:38:28 GMT

The Boltzmann temperature is fixed to 1 and we set the learning rate to 0.05 and the discount factor to 0.99. After each learning episode we evaluate the current greedy policy on 10 episodes and report the mean return. Curves are averaged over 20 seeds and the shaded area represents the standard error. SPREAD (Figure 4a): In this environment, there are 3 agents (small orange circles) and 3 landmarks (bigger gray circles). To maximize their return, agents must therefore spread out and cover all landmarks.
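The hyperparameters quoted above can be made concrete with a minimal sketch of tabular online Q-learning with Boltzmann (softmax) exploration. Only the temperature (1), learning rate (0.05), and discount factor (0.99) come from the text; the dict-of-lists table layout and the function names `boltzmann_probs`, `sample_action`, and `q_update` are assumptions made for illustration, not the authors' implementation.

```python
import math
import random

TEMPERATURE = 1.0     # Boltzmann temperature (from the text)
LEARNING_RATE = 0.05  # learning rate (from the text)
DISCOUNT = 0.99       # discount factor (from the text)


def boltzmann_probs(q_values, temperature=TEMPERATURE):
    """Softmax over Q-values; subtracting the max keeps exp() numerically stable."""
    q_max = max(q_values)
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(q_values, rng):
    """Draw an action index with probability proportional to exp(Q / temperature)."""
    probs = boltzmann_probs(q_values)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(q_table, state, action, reward, next_state, done):
    """One online Q-learning step on a hypothetical dict-of-lists table."""
    bootstrap = 0.0 if done else DISCOUNT * max(q_table[next_state])
    target = reward + bootstrap
    q_table[state][action] += LEARNING_RATE * (target - q_table[state][action])
    return q_table[state][action]


if __name__ == "__main__":
    rng = random.Random(0)
    q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
    action = sample_action(q_table[0], rng)
    print(action, q_update(q_table, 0, action, 1.0, 1, False))
```

Evaluation with the greedy policy, as described above, would replace `sample_action` with an argmax over the same Q-values.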

artificial intelligence, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Genre: Instructional Material > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.86)


Optimal Stochastic and Online Learning with Individual Iterates

Yunwen Lei, Peng Yang, Ke Tang, Ding-Xuan Zhou

Neural Information Processing Systems, May-31-2025, 12:51:59 GMT

Stochastic composite mirror descent (SCMD) is a simple and efficient method able to capture both geometric and composite structures of optimization problems in machine learning. Existing strategies require taking either an average or a random selection of iterates to achieve optimal convergence rates, which, however, can either destroy the sparsity of solutions or slow down training in practice. In this paper, we propose a theoretically sound strategy to select an individual iterate of the vanilla SCMD, which is able to achieve optimal rates for both convex and strongly convex problems in a non-smooth learning setting. This strategy of outputting an individual iterate can preserve the sparsity of solutions, which is crucial for a proper interpretation in sparse learning problems. We report experimental comparisons with several baseline methods to show the effectiveness of our method in achieving a fast training speed as well as in outputting sparse solutions.
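To illustrate why an individual iterate can stay sparse while an average generally does not, here is a minimal sketch of SCMD in its simplest special case. It assumes a Euclidean mirror map and an l1 regularizer, under which each step reduces to a stochastic subgradient step followed by soft-thresholding. The hinge loss, the step-size schedule, and the names `soft_threshold` and `scmd_l1` are illustrative assumptions. The sketch simply returns the last iterate and does not implement the selection rule proposed in the paper.

```python
import random


def soft_threshold(x, tau):
    """Proximal operator of tau * |x|: shrink toward zero, exactly zero inside [-tau, tau]."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def scmd_l1(data, dim, lam=0.1, eta0=0.5, epochs=5, seed=0):
    """SCMD with a Euclidean mirror map and l1 penalty on a hinge-loss objective.

    data: list of (features, label) pairs with label in {-1, +1}.
    Returns the last individual iterate (an assumption of this sketch).
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    step = 0
    for _ in range(epochs):
        for idx in rng.sample(range(len(data)), len(data)):
            x, y = data[idx]
            step += 1
            eta = eta0 / step ** 0.5  # decaying step size (illustrative choice)
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # Subgradient of the non-smooth hinge loss max(0, 1 - margin).
            grad = [-y * xi for xi in x] if margin < 1 else [0.0] * dim
            # Gradient step on the loss, then proximal step on the l1 term.
            w = [soft_threshold(wi - eta * gi, eta * lam) for wi, gi in zip(w, grad)]
    return w


if __name__ == "__main__":
    toy = [([1.0, 0.0], 1), ([-1.0, 0.0], -1)]
    print(scmd_l1(toy, dim=2))
```

Because soft-thresholding sets small coordinates to exactly zero at every step, any single iterate keeps those zeros, whereas averaging many iterates would typically fill them back in.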

artificial intelligence, iterate, machine learning, (15 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.14)
Asia > China > Guangdong Province (0.14)

Genre: Instructional Material > Course Syllabus & Notes (0.34)

Industry: Education > Educational Setting > Online (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)


Supplementary Material: Calibrating CNNs for Lifelong Learning

Neural Information Processing Systems, May-31-2025, 12:47:38 GMT

ResNet-18(1/3) is simply ResNet-18 [1], with the number of filters in each layer reduced by a factor of 3 [2]. We use the SGD optimizer in all our experiments. In all cases, we run experiments for 5 random task orders and report the average accuracy. From the results, we can see that even with ResNet-18(1/3), which has fewer parameters than ResNet-18, results are comparable for the CCLL<1,1> model. CCLL<4,1> with ResNet-18(1/3) performs even better than CCLL<1,1> with ResNet-18.

artificial intelligence, machine learning, resnet-18, (14 more...)

Neural Information Processing Systems

Country:

North America > Canada (0.17)
Asia > India (0.17)

Genre: Instructional Material (0.44)

Industry: Education > Educational Setting > Continuing Education (0.44)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Calibrating CNNs for Lifelong Learning

Neural Information Processing Systems, May-31-2025, 12:47:31 GMT

We present an approach for lifelong/continual learning of convolutional neural networks (CNNs) that does not suffer from the problem of catastrophic forgetting when moving from one task to another. We show that the activation maps generated by the CNN trained on the old task can be calibrated using very few calibration parameters, to become relevant to the new task. Based on this, we calibrate the activation maps produced by each network layer using spatial and channel-wise calibration modules and train only these calibration parameters for each new task in order to perform lifelong learning. Our calibration modules introduce significantly less computation and fewer parameters than approaches that dynamically expand the network. Our approach is immune to catastrophic forgetting since we store the task-adaptive calibration parameters, which contain all the task-specific knowledge and are exclusive to each task. Further, our approach does not require storing data samples from the old tasks, which is done by many replay-based methods. We perform extensive experiments on multiple benchmark datasets (SVHN, CIFAR, ImageNet, and MS-Celeb), all of which show substantial improvements over state-of-the-art methods (e.g., a 29% absolute increase in accuracy on CIFAR-100 with 10 classes at a time).

artificial intelligence, machine learning, module, (20 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Genre:

Instructional Material (0.73)
Research Report > Promising Solution (0.48)

Industry: Education > Educational Setting > Continuing Education (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)


IKEA Manuals at Work: 4D Grounding of Assembly Instructions on Internet Videos

Neural Information Processing Systems, May-31-2025, 11:24:19 GMT

Shape assembly is a ubiquitous task in daily life, integral for constructing complex 3D structures like IKEA furniture. While significant progress has been made in developing autonomous agents for shape assembly, existing datasets have not yet tackled the 4D grounding of assembly instructions in videos, essential for a holistic understanding of assembly in 3D space over time. We introduce IKEA Video Manuals, a dataset that features 3D models of furniture parts, instructional manuals, assembly videos from the Internet, and most importantly, annotations of dense spatio-temporal alignments between these data modalities. To demonstrate the utility of IKEA Video Manuals, we present five applications essential for shape assembly: assembly plan generation, part-conditioned segmentation, part-conditioned pose estimation, video object segmentation, and furniture assembly based on instructional video manuals. For each application, we provide evaluation metrics and baseline methods. Through experiments on our annotated data, we highlight many challenges in grounding assembly instructions in videos to improve shape assembly, including handling occlusions, varying viewpoints, and extended assembly sequences.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia (0.14)

Genre: