Supervised Pretraining Can Learn In-Context Reinforcement Learning
Large transformer models trained on diverse datasets have shown a remarkable ability to learn in-context, achieving high few-shot performance on tasks they were not explicitly trained to solve. In this paper, we study the in-context learning capabilities of transformers in decision-making problems, i.e., reinforcement learning (RL) for bandits and Markov decision processes. To do so, we introduce and study the Decision-Pretrained Transformer (DPT), a supervised pretraining method where a transformer predicts an optimal action given a query state and an in-context dataset of interactions from a diverse set of tasks. While simple, this procedure produces a model with several surprising capabilities. We find that the trained transformer can solve a range of RL problems in-context, exhibiting both exploration online and conservatism offline, despite not being explicitly trained to do so.
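As a rough illustration of the objective, here is a minimal sketch of DPT-style supervised pretraining, assuming discrete actions and in-context datasets of (state, action, reward) transitions; the class and helper names are hypothetical stand-ins, not the authors' code.

```python
# Minimal sketch of DPT-style supervised pretraining (hypothetical names).
# The model sees a query state plus an in-context dataset of transitions
# and is trained with cross-entropy against the optimal action a* for the
# sampled task.
import torch
import torch.nn as nn

class DPTModel(nn.Module):
    def __init__(self, state_dim, num_actions, d_model=64):
        super().__init__()
        # One token per in-context transition (state, one-hot action,
        # scalar reward), plus one token for the query state.
        self.embed_transition = nn.Linear(state_dim + num_actions + 1, d_model)
        self.embed_query = nn.Linear(state_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_actions)

    def forward(self, query_state, context):  # context: (B, T, s+a+r)
        tokens = torch.cat(
            [self.embed_transition(context),
             self.embed_query(query_state).unsqueeze(1)], dim=1)
        h = self.encoder(tokens)
        return self.head(h[:, -1])             # action logits at the query token

def pretrain_step(model, optimizer, batch):
    # batch: query states, in-context datasets, and optimal actions a*
    query_state, context, optimal_action = batch
    logits = model(query_state, context)
    loss = nn.functional.cross_entropy(logits, optimal_action)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At deployment, sampling from the predicted action distribution yields the online, exploratory behavior, while acting greedily on the logits yields the offline, conservative behavior the abstract describes.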
In-Context Curiosity: Distilling Exploration for Decision-Pretrained Transformers on Bandit Tasks
As large language models (LLMs) continue to grow in capability, there is increasing interest in incorporating them into decision-making tasks. A common pipeline for this is the Decision-Pretrained Transformer (DPT). However, existing training methods for DPTs often struggle to generalize beyond their pretraining data distribution. To mitigate this limitation, we propose in-context curiosity -- a lightweight, exploration-inspired regularizer for offline pretraining -- and introduce the Prediction-Powered Transformer (PPT) framework. PPT augments DPT with an auxiliary reward predictor, using prediction error as an intrinsic curiosity signal to encourage broader exploration during training. In proof-of-concept experiments on Gaussian multi-armed bandits, PPT shows improved robustness: it moderates the performance degradation observed in DPT when test environments exhibit higher reward variance, particularly when pretraining data has limited diversity. While the quality of offline data remains fundamental, our preliminary results suggest that curiosity-driven pretraining offers a promising direction for enhancing out-of-distribution generalization in in-context RL agents.
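The abstract does not pin down how the intrinsic signal enters the loss; one plausible reading, sketched below, up-weights the action-prediction loss on contexts the auxiliary reward predictor models poorly. The function name and the weighting scheme are assumptions.

```python
# Hypothetical sketch of a curiosity-style regularizer for PPT pretraining.
# An auxiliary head predicts rewards for in-context transitions; its error
# is minimized (to train the predictor) and, detached, re-weights the
# action loss toward poorly modeled, i.e. under-explored, contexts.
import torch
import torch.nn as nn

def ppt_loss(action_logits, optimal_action, reward_pred, reward_true, beta=0.1):
    # Per-example action loss (the standard DPT objective), shape (B,).
    action_loss = nn.functional.cross_entropy(
        action_logits, optimal_action, reduction="none")
    # Auxiliary reward-prediction objective.
    pred_loss = nn.functional.mse_loss(reward_pred, reward_true)
    # Detached prediction error acts as an intrinsic curiosity weight.
    curiosity = (reward_pred - reward_true).pow(2).detach()
    weights = 1.0 + beta * curiosity / (curiosity.mean() + 1e-8)
    return (weights * action_loss).mean() + pred_loss
```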
Is Selection All You Need in Differential Evolution?
Kitamura, Tomofumi, Fukunaga, Alex
Differential Evolution (DE) is a widely used evolutionary algorithm for black-box optimization problems. However, in modern DE implementations, a major challenge lies in the limited population diversity caused by the fixed population size enforced by generational replacement. Population size is a critical control parameter that significantly affects DE performance. Larger populations inherently contain a more diverse set of individuals, thereby facilitating broader exploration of the search space. Conversely, when the maximum evaluation budget is constrained, smaller populations focusing on a limited number of promising candidates may be more suitable. Many state-of-the-art DE variants incorporate an archive mechanism, in which a subset of discarded individuals is preserved during generational replacement and reused in mutation operations. However, maintaining what is essentially a secondary population via an archive introduces additional design considerations, such as policies for insertion, deletion, and appropriate sizing. To address these limitations, we propose a novel DE framework called Unbounded Differential Evolution (UDE), which adds all generated candidates to the population without discarding any individual based on fitness. Unlike conventional DE, which removes inferior individuals during generational replacement, UDE eliminates replacement altogether, along with the associated complexities of archive management and dynamic population sizing. UDE represents a fundamentally new approach to DE, relying solely on selection mechanisms and enabling a more straightforward yet powerful search algorithm.
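A minimal sketch of the unbounded-population idea under stated assumptions: base vectors are drawn from a top-p elite while difference vectors come from the whole, ever-growing population. The paper's actual selection policy may differ; this only illustrates "append everything, let parent selection do the work."

```python
# Sketch of Unbounded DE (UDE-style): every trial vector is appended to the
# population and nothing is ever discarded by fitness; search pressure comes
# only from parent selection. Minimization; details are assumptions.
import numpy as np

def ude(f, bounds, n_init=20, budget=2000, F=0.5, CR=0.9, p=0.2, seed=None):
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    pop = rng.uniform(lo, hi, size=(n_init, lo.size))
    fit = np.array([f(x) for x in pop])
    for _ in range(budget - n_init):
        # Selection replaces generational replacement: base vector from the
        # best p-fraction, difference vectors from the entire population.
        elite = np.argsort(fit)[: max(2, int(p * len(pop)))]
        base = pop[rng.choice(elite)]
        r1, r2 = pop[rng.choice(len(pop), size=2, replace=False)]
        mutant = np.clip(base + F * (r1 - r2), lo, hi)
        # Binomial crossover with a randomly chosen target vector.
        target = pop[rng.integers(len(pop))]
        mask = rng.random(lo.size) < CR
        mask[rng.integers(lo.size)] = True
        trial = np.where(mask, mutant, target)
        # Unbounded: append unconditionally, never discard by fitness.
        pop = np.vstack([pop, trial])
        fit = np.append(fit, f(trial))
    best = np.argmin(fit)
    return pop[best], fit[best]
```

For example, `ude(lambda x: float(np.sum(x**2)), ([-5.0, -5.0], [5.0, 5.0]))` minimizes a 2-D sphere function within the default evaluation budget.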
Discriminative Probing and Tuning for Text-to-Image Generation
Qu, Leigang, Wang, Wenjie, Li, Yongqi, Zhang, Hanwang, Nie, Liqiang, Chua, Tat-Seng
Despite advancements in text-to-image generation (T2I), prior methods often face text-image misalignment problems such as relation confusion in generated images. Existing solutions involve cross-attention manipulation for better compositional understanding or integrating large language models for improved layout planning. However, the inherent alignment capabilities of T2I models are still inadequate. By reviewing the link between generative and discriminative modeling, we posit that T2I models' discriminative abilities may reflect their text-image alignment proficiency during generation. In this light, we advocate bolstering the discriminative abilities of T2I models to achieve more precise text-to-image alignment for generation. We present a discriminative adapter built on T2I models to probe their discriminative abilities on two representative tasks and leverage discriminative fine-tuning to improve their text-image alignment. As a bonus of the discriminative adapter, a self-correction mechanism can leverage discriminative gradients to better align generated images to text prompts during inference. Comprehensive evaluations across three benchmark datasets, including both in-distribution and out-of-distribution scenarios, demonstrate our method's superior generation performance. Meanwhile, it achieves state-of-the-art discriminative performance on the two discriminative tasks compared to other generative models.
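A rough sketch of what the gradient-based self-correction could look like at inference time: the adapter's alignment score is differentiated with respect to the diffusion latent, and the latent is nudged uphill. The `decode` and `adapter` interfaces here are hypothetical placeholders, not the paper's API.

```python
# Hypothetical sketch of discriminative self-correction during inference:
# a discriminative adapter scores text-image alignment, and its gradient
# with respect to the latent steers generation toward the prompt.
import torch

def self_correct_step(latent, prompt_emb, decode, adapter, step_size=0.1):
    latent = latent.detach().requires_grad_(True)
    image = decode(latent)                    # assumed differentiable decoder
    score = adapter(image, prompt_emb).sum()  # alignment score (higher = better)
    grad, = torch.autograd.grad(score, latent)
    # Ascend the alignment score in latent space.
    return (latent + step_size * grad).detach()
```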
Developmental Pretraining (DPT) for Image Classification Networks
Rajesh, Niranjan, Gupta, Debayan
The advent of Deep Learning (DL) has massively aided the Artificial Intelligence community, especially in the realm of object recognition. One of the critical reasons for the success of DL has been the availability of massive image datasets [1] and the computational power offered by modern Graphics Processing Units (GPUs), which can accommodate the large amounts of data required by deep networks. State-of-the-art image recognition networks like the ResNet family [2], VGG networks [3], EfficientNet models [4], and the recently introduced Vision Transformers [5] require extremely large amounts of data compared to their classical Machine Learning (ML) counterparts [6]. This requirement for large amounts of data becomes a problem in fields where data availability is low, such as medicine [7]. A common approach to this problem is Transfer Learning [8], which consists of pre-training a network on a large dataset like ImageNet [1] and fine-tuning it on a smaller dataset relevant to the recognition problem at hand.
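The transfer-learning recipe this passage describes is standard; a short torchvision sketch follows, with ResNet-18 chosen only for illustration.

```python
# Common transfer-learning recipe: start from an ImageNet-pretrained
# ResNet, replace the classification head, and fine-tune on a small
# domain dataset (e.g., medical images).
import torch.nn as nn
from torchvision import models

def build_finetune_model(num_classes, freeze_backbone=True):
    model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
    if freeze_backbone:
        # Keep the pretrained features; only the new head will train.
        for param in model.parameters():
            param.requires_grad = False
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new, trainable head
    return model
```

Freezing the backbone suits very small datasets; with more data, unfreezing the later blocks and fine-tuning at a reduced learning rate is a common variation.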