Online Fast Adaptation and Knowledge Accumulation (OSAKA): a New Approach to Continual Learning

Neural Information Processing Systems

Continual learning agents experience a stream of (related) tasks. The main challenge is that the agent must not forget previous tasks and also adapt to novel tasks in the stream. We are interested in the intersection of two recent continual-learning scenarios. In meta-continual learning, the model is pre-trained using meta-learning to minimize catastrophic forgetting of previous tasks. In continual-meta learning, the aim is to train agents for faster remembering of previous tasks through adaptation.
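A minimal, illustrative sketch (not the OSAKA benchmark itself) of the setting the abstract describes: the agent is scored online on a non-stationary task stream, so it is rewarded both for adapting quickly to novel tasks and for quickly recovering performance on tasks it has seen before. The toy regression stream, switching probability, and running-mean "model" below are assumptions made for illustration.

```python
# Illustrative evaluate-then-adapt loop on a non-stationary task stream.
import random

def task_stream(num_steps=200, num_modes=3, p_switch=0.1, seed=0):
    """Yield (mode, target) pairs; the underlying task switches occasionally."""
    rng = random.Random(seed)
    mode = 0
    for _ in range(num_steps):
        if rng.random() < p_switch:          # task boundary (not observed by the agent)
            mode = rng.randrange(num_modes)  # may revisit an old task or introduce a new one
        yield mode, rng.gauss(5.0 * mode, 1.0)

prediction = 0.0   # toy "model": a single running estimate of the current target
lr = 0.5           # fast-adaptation step size
errors = []
for _mode, target in task_stream():
    errors.append((prediction - target) ** 2)  # 1) evaluate BEFORE adapting (online score)
    prediction += lr * (target - prediction)   # 2) then adapt on the newly observed data
print("mean online squared error:", sum(errors) / len(errors))
```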



Supplementary Material for "Training Over-parameterized Models with Non-decomposable Objectives": Algorithm 2, Reductions-based Algorithm for Constraining Coverage (2)

Neural Information Processing Systems

Recall that the algorithms discussed in Section 2 have two intuitive steps: (i) update the multipliers based on the current classifier's performance and construct a gain matrix G; (ii) train a new classifier by optimizing the resulting cost-sensitive loss. These algorithms additionally incorporate the "two dataset" trick suggested by Cotter et al. [14] for better generalization, wherein the multiplier updates are performed using a held-out validation set S. In Algorithm 2, we seek a saddle point of the Lagrangian max-min problem for (2). See Chen et al. [12] and Cotter et al. [16] for theoretical guarantees for the learned classifier, which usually require the algorithms to output a stochastic classifier that averages over the individual iterates. Eban et al. [25] provide details of how the optimization of these metrics can be posed as constrained optimization problems and, in turn, reduced to cost-sensitive learning. More generally, using the reduction techniques from Narasimhan et al. [71], any learning problem of the form "maximize a function of the confusion matrix C[h] subject to constraints on C[h]" can be reduced to cost-sensitive learning problems, and thus tackled by the proposed approach. Narasimhan et al. [71] provide details of the Lagrangian primal-dual optimization for different metrics. We will find the following standard result useful in our proofs. Since the negative log loss is strictly proper, in the sense of Gneiting and Raftery [30] and Williamson et al. [95], we have: Lemma 6 (Gneiting and Raftery [30], Williamson et al. [95]).
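A hedged, minimal sketch of the two-step primal-dual recipe described above, using a toy linear model and a coverage constraint (fraction of examples predicted positive at least tau): the multiplier is updated on a held-out validation set, then the classifier takes a gradient step on a cost-sensitive surrogate. The features, targets, and names (`tau`, `lam`, step sizes) are illustrative placeholders, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
X_tr, X_val = rng.normal(size=(500, 5)), rng.normal(size=(200, 5))
y_tr = (X_tr[:, 0] > 0).astype(float)

w = np.zeros(5)
lam, tau, eta_w, eta_lam = 0.0, 0.4, 0.1, 0.5  # multiplier, coverage target, step sizes

def coverage(w, X):
    """Fraction of examples predicted positive."""
    return (X @ w > 0).mean()

for _ in range(100):
    # (i) dual step: update the multiplier using the held-out validation set
    lam = max(0.0, lam + eta_lam * (tau - coverage(w, X_val)))
    # (ii) primal step: one gradient step on a cost-sensitive surrogate,
    #      where lam re-weights a smooth (sigmoid) surrogate for coverage
    p = 1.0 / (1.0 + np.exp(-(X_tr @ w)))
    grad = X_tr.T @ ((p - y_tr) - lam * p * (1.0 - p)) / len(X_tr)
    w -= eta_w * grad

print("final coverage:", coverage(w, X_val), "multiplier:", lam)
```

Guarantees of the kind cited above typically apply to a stochastic classifier that averages the iterates of this loop rather than to the last iterate alone.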


Deep Generalized Schrödinger Bridge

Neural Information Processing Systems

Mean-Field Games (MFGs) serve as a crucial mathematical framework for modeling the collective behavior of individual agents interacting stochastically with a large population. In this work, we aim to solve a challenging class of MFGs in which the differentiability of these interacting preferences may not be available to the solver, and the population is urged to converge exactly to some desired distribution. Despite being well motivated for practical purposes, these setups are complicated enough to paralyze most (deep) numerical solvers. Nevertheless, we show that the Schrödinger Bridge -- as an entropy-regularized optimal transport model -- can be generalized to accept mean-field structures, hence solving these MFGs. This is achieved via the application of Forward-Backward Stochastic Differential Equations theory, which, intriguingly, leads to a computational framework with a similar structure to Temporal Difference learning. As such, it opens up novel algorithmic connections to Deep Reinforcement Learning that we leverage to facilitate practical training. We show that our proposed objective function provides necessary and sufficient conditions to the mean-field problem. Our method, named Deep Generalized Schrödinger Bridge (DeepGSB), not only outperforms prior methods in solving classical population navigation MFGs, but is also capable of solving 1000-dimensional opinion depolarization, setting a new state-of-the-art numerical solver for high-dimensional MFGs. Our code will be made available at https://github.com/ghliu/DeepGSB.
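As a hedged aside, the basic computational primitive behind solvers of this kind is forward simulation of a controlled diffusion dX_t = u(X_t, t) dt + sigma dW_t by Euler-Maruyama; in an actual solver the drift u would be a trained network and the loss would couple forward and backward processes, whereas the hand-written pull-to-origin drift below is purely illustrative.

```python
import numpy as np

def drift(x, t):
    # Placeholder policy: pull samples toward the origin (illustrative only).
    return -x

def rollout(x0, drift, sigma=0.5, T=1.0, n_steps=100, seed=0):
    """Euler-Maruyama simulation of dX = drift(X, t) dt + sigma dW."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for k in range(n_steps):
        t = k * dt
        x = x + drift(x, t) * dt + sigma * np.sqrt(dt) * rng.normal(size=x.shape)
        path.append(x.copy())
    return np.stack(path)

path = rollout(x0=np.full(2, 3.0), drift=drift)
print("start:", path[0], "end:", path[-1])  # samples drift toward the target region
```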



Design from Policies: Conservative Test-Time Adaptation for Offline Policy Optimization

Neural Information Processing Systems

Specifically, this non-iterative paradigm allows us to conduct inner-level optimization (value estimation) during training, while performing outer-level optimization (policy extraction) during testing. Naturally, such a paradigm raises three core questions that are not fully answered by prior non-iterative offline RL counterparts such as reward-conditioned policies: Q1) What information should we transfer from the inner level to the outer level? Q2) What should we pay attention to when exploiting the transferred information for safe/confident outer-level optimization? Q3) What are the benefits of concurrently conducting outer-level optimization during testing? Motivated by model-based optimization (MBO), we propose DROP (Design fROm Policies), which fully answers the above questions. Specifically, at the inner level, DROP decomposes offline data into multiple subsets and learns an MBO score model (A1). To ensure safe exploitation of the score model at the outer level, we explicitly learn a behavior embedding and introduce a conservative regularization (A2). During testing, we show that DROP permits test-time adaptation, enabling adaptive inference across states (A3). Empirically, we find that DROP, compared to prior non-iterative offline RL counterparts, gains an average improvement probability of more than 80%, and achieves comparable or better performance than prior iterative baselines.
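A hedged sketch of the outer-level, test-time step described above: given a learned score model f(s, z) over behavior embeddings z, pick z for the current state by gradient ascent on the score minus a conservative penalty that keeps z close to the behavior embedding. The quadratic score model and the names (`score`, `z_behavior`, `beta`) are illustrative placeholders, not DROP's actual components.

```python
import numpy as np

def score(s, z):
    # Stand-in for the learned MBO score model (toy quadratic).
    return -np.sum((z - 0.5 * s) ** 2)

def score_grad(s, z):
    return -2.0 * (z - 0.5 * s)

def test_time_adapt(s, z_behavior, beta=1.0, lr=0.1, n_steps=50):
    """Gradient ascent on score(s, z) - beta * ||z - z_behavior||^2."""
    z = z_behavior.copy()
    for _ in range(n_steps):
        # Ascend the score while staying close to the behavior embedding (conservatism).
        g = score_grad(s, z) - 2.0 * beta * (z - z_behavior)
        z += lr * g
    return z

s = np.array([1.0, -2.0])   # current test-time state (toy)
z_b = np.zeros(2)           # behavior embedding learned offline (toy)
z_star = test_time_adapt(s, z_b)
print("adapted embedding:", z_star, "score:", score(s, z_star))
```

Because the adaptation is re-run per state at test time, the inferred embedding can vary across states, which is the "adaptive inference across states" the abstract refers to.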


HEMM: Holistic Evaluation of Multimodal Foundation Models

Neural Information Processing Systems

Multimodal foundation models that can holistically process text alongside images, video, audio, and other sensory modalities are increasingly used in a variety of real-world applications. However, it is challenging to characterize and study progress in multimodal foundation models, given the range of possible modeling decisions, tasks, and domains. In this paper, we introduce Holistic Evaluation of Multimodal Models (HEMM) to systematically evaluate the capabilities of multimodal foundation models across three dimensions: basic skills, information flow, and real-world use cases. Basic multimodal skills are internal abilities required to solve problems, such as learning interactions across modalities, fine-grained alignment, multi-step reasoning, and the ability to handle external knowledge.
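A hedged sketch of how results could be organized along dimensions like those the abstract names (skills, use cases): tag each task with the skills it exercises, then aggregate model scores per skill. The task names, tags, and numbers below are made-up placeholders, not HEMM's actual tasks or results.

```python
from collections import defaultdict

# Toy task registry: each task is tagged with the skills it exercises.
tasks = {
    "vqa_toy":      {"skills": ["cross-modal interaction"], "use_case": "multimedia", "score": 0.71},
    "chart_qa_toy": {"skills": ["fine-grained alignment", "multi-step reasoning"], "use_case": "science", "score": 0.58},
    "memes_toy":    {"skills": ["external knowledge"], "use_case": "affect", "score": 0.44},
}

# Aggregate one model's per-task scores into per-skill averages.
by_skill = defaultdict(list)
for info in tasks.values():
    for skill in info["skills"]:
        by_skill[skill].append(info["score"])

for skill, scores in by_skill.items():
    print(f"{skill:25s} mean score = {sum(scores) / len(scores):.2f}")
```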


Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning

Neural Information Processing Systems

Recently formalized as the value equivalence principle, this algorithmic technique is perhaps unavoidable, as real-world reinforcement learning demands consideration of a simple, computationally bounded agent interacting with an overwhelmingly complex environment whose underlying dynamics likely exceed the agent's capacity for representation. In this work, we consider the scenario where agent limitations may entirely preclude identifying an exactly value-equivalent model, immediately giving rise to a trade-off between keeping the model simple enough for the agent to learn and keeping the resulting sub-optimality bounded.
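A hedged sketch of the value-equivalence idea referenced above: a learned model is value-equivalent, with respect to chosen value functions, when Bellman backups under the model match backups under the true environment. The tiny uncontrolled tabular dynamics and the probe value functions below are illustrative assumptions, not the paper's construction.

```python
import numpy as np

n_states, gamma = 3, 0.9
rng = np.random.default_rng(0)
P_true = rng.dirichlet(np.ones(n_states), size=n_states)   # true transition matrix (rows sum to 1)
P_model = rng.dirichlet(np.ones(n_states), size=n_states)  # candidate learned model
r = rng.normal(size=n_states)                              # per-state reward

def bellman_backup(P, V):
    """One Bellman backup of value function V under dynamics P."""
    return r + gamma * P @ V

# Discrepancy over a small set of probe value functions: zero iff the model is
# value-equivalent with respect to that set.
probe_Vs = [np.ones(n_states), np.arange(n_states, dtype=float)]
loss = max(np.max(np.abs(bellman_backup(P_true, V) - bellman_backup(P_model, V)))
           for V in probe_Vs)
print("value-equivalence discrepancy:", loss)
```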