Goto

Collaborating Authors

 dpdt



Decomposed Prompt Decision Transformer for Efficient Unseen Task Generalization

Neural Information Processing Systems

Multi-task offline reinforcement learning aims to develop a unified policy for diverse tasks without requiring real-time interaction with the environment. Recent work explores sequence modeling, leveraging the scalability of the transformer architecture as a foundation for multi-task learning. Given the variations in task content and complexity, formulating policies becomes a challenging endeavor, requiring careful parameter sharing and adept management of conflicting gradients to extract rich cross-task knowledge from multiple tasks and transfer it to unseen tasks. In this paper, we propose the Decomposed Prompt Decision Transformer (DPDT) that adopts a two-stage paradigm to efficiently learn prompts for unseen tasks in a parameter-efficient manner. We incorporate parameters from pre-trained language models (PLMs) to initialize DPDT, thereby providing rich prior knowledge encoded in language models. During the decomposed prompt tuning phase, we learn both cross-task and task-specific prompts on training tasks to achieve prompt decomposition. In the test time adaptation phase, the cross-task prompt, serving as a good initialization, were further optimized on unseen tasks through test time adaptation, enhancing the model's performance on these tasks. Empirical evaluation on a series of Meta-RL benchmarks demonstrates the superiority of our approach.



Decomposed Prompt Decision Transformer for Efficient Unseen Task Generalization

Neural Information Processing Systems

Multi-task offline reinforcement learning aims to develop a unified policy for diverse tasks without requiring real-time interaction with the environment. Recent work explores sequence modeling, leveraging the scalability of the transformer architecture as a foundation for multi-task learning. Given the variations in task content and complexity, formulating policies becomes a challenging endeavor, requiring careful parameter sharing and adept management of conflicting gradients to extract rich cross-task knowledge from multiple tasks and transfer it to unseen tasks. In this paper, we propose the Decomposed Prompt Decision Transformer (DPDT) that adopts a two-stage paradigm to efficiently learn prompts for unseen tasks in a parameter-efficient manner. We incorporate parameters from pre-trained language models (PLMs) to initialize DPDT, thereby providing rich prior knowledge encoded in language models.


Decision Tree Search as a Markov Decision Problem

Kohler, Hector, Akrour, Riad, Preux, Philippe

arXiv.org Artificial Intelligence

Finding an optimal decision tree for a supervised learning task is a challenging combinatorial problem to solve at scale. It was recently proposed to frame the problem as a Markov Decision Problem (MDP) and use deep reinforcement learning to tackle scaling. Unfortunately, these methods are not competitive with the current branch-and-bound state-of-the-art. We propose instead to scale the resolution of such MDPs using an information-theoretic tests generating function that heuristically, and dynamically for every state, limits the set of admissible test actions to a few good candidates. As a solver, we show empirically that our algorithm is at the very least competitive with branch-and-bound alternatives. As a machine learning tool, a key advantage of our approach is to solve for multiple complexity-performance trade-offs at virtually no additional cost. With such a set of solutions, a user can then select the tree that generalizes best and which has the interpretability level that best suits their needs, which no current branch-and-bound method allows.