APT
Better with Less
The proposed predictive uncertainty, used as feedback from the pre-training model, measures the model's confidence in the data. Conversely, when fed the chosen data, the pre-training model forms an initial understanding of the new, unseen data while also attempting to retain the knowledge learned from previous data.
Better with Less: A Data-Active Perspective on Pre-Training Graph Neural Networks
Pre-training on graph neural networks (GNNs) aims to learn transferable knowledge for downstream tasks with unlabeled data, and it has recently become an active research area. The success of graph pre-training models is often attributed to the massive amount of input data. In this paper, however, we identify the curse of big data phenomenon in graph pre-training: more training data do not necessarily lead to better downstream performance. Motivated by this observation, we propose a better-with-less framework for graph pre-training: fewer, but carefully chosen data are fed into a GNN model to enhance pre-training. The proposed pre-training pipeline is called the data-active graph pre-training (APT) framework, and is composed of a graph selector and a pre-training model.
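A minimal sketch of the selection idea described above: score candidate graphs by the pre-training model's predictive uncertainty and keep the ones it is least confident about. The function names, the entropy-based score, and the top-k selection rule are illustrative assumptions, not APT's actual graph selector.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_by_uncertainty(candidates, k):
    """Return the k candidate graphs the model is least confident about.

    `candidates` maps a graph id to the model's predicted class
    distribution for that graph -- a hypothetical stand-in for the
    graph selector's real scoring signal.
    """
    scored = sorted(candidates.items(),
                    key=lambda kv: predictive_entropy(kv[1]),
                    reverse=True)
    return [gid for gid, _ in scored[:k]]

preds = {
    "g1": [0.98, 0.01, 0.01],   # confident prediction -> low entropy
    "g2": [0.34, 0.33, 0.33],   # near-uniform -> high entropy
    "g3": [0.70, 0.20, 0.10],
}
print(select_by_uncertainty(preds, 2))  # the two most uncertain graphs
```

The "fewer but better" intuition is that graphs with high predictive entropy carry information the model has not yet absorbed, so a small, uncertain subset can be more useful than the full corpus.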
Behavior From the Void: Unsupervised Active Pre-Training
We introduce a new unsupervised pre-training method for reinforcement learning called APT, which stands for Active Pre-Training. APT learns behaviors and representations by actively searching for novel states in reward-free environments. The key novel idea is to explore the environment by maximizing a non-parametric entropy computed in an abstract representation space, which avoids challenging density modeling and consequently allows our approach to scale much better in environments that have high-dimensional observations (e.g., image observations). We empirically evaluate APT by exposing task-specific reward after a long unsupervised pre-training phase. In Atari games, APT achieves human-level performance on 12 games and obtains highly competitive performance compared to canonical fully supervised RL algorithms. On DMControl suite, APT beats all baselines in terms of asymptotic performance and data efficiency and dramatically improves performance on tasks that are extremely difficult to train from scratch.
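The non-parametric entropy objective can be illustrated with a k-nearest-neighbor particle estimate: a state's intrinsic reward grows with its distance to its k-th nearest neighbor in representation space, so isolated (novel) states score higher. This is a simplified sketch under that assumption; the reward shape and `k` are illustrative, not the paper's exact estimator.

```python
import math

def knn_intrinsic_reward(z, batch, k=3):
    """Particle-based entropy proxy: reward a representation z by its
    distance to the k-th nearest neighbor among `batch`, so states in
    sparsely visited regions of representation space earn more."""
    dists = sorted(math.dist(z, other) for other in batch)
    # dists[0] is z's distance to itself (0.0) when z is in the batch
    return math.log(1.0 + dists[min(k, len(dists) - 1)])

# four clustered states and one isolated, "novel" state
batch = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
common = knn_intrinsic_reward((0.0, 0.0), batch)
novel = knn_intrinsic_reward((5.0, 5.0), batch)
print(common < novel)  # the isolated state earns the larger reward
```

Because the estimate needs only pairwise distances in a learned abstract space, it sidesteps explicit density modeling, which is what lets the approach scale to image observations.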
APT: Affine Prototype-Timestamp For Time Series Forecasting Under Distribution Shift
Li, Yujie, Shao, Zezhi, Yu, Chengqing, Fu, Yisong, Sun, Tao, Xu, Yongjun, Wang, Fei
Time series forecasting under distribution shift remains challenging, as existing deep learning models often rely on local statistical normalization (e.g., mean and variance) that fails to capture global distribution shift. Methods like RevIN and its variants attempt to decouple distribution and pattern but still struggle with missing values, noisy observations, and invalid channel-wise affine transformation. To address these limitations, we propose Affine Prototype-Timestamp (APT), a lightweight and flexible plug-in module that injects global distribution features into the normalization-forecasting pipeline. By leveraging timestamp-conditioned prototype learning, APT dynamically generates affine parameters that modulate both input and output series, enabling the backbone to learn from self-supervised, distribution-aware clustered instances. APT is compatible with arbitrary forecasting backbones and normalization strategies while introducing minimal computational overhead. Extensive experiments across six benchmark datasets and multiple backbone-normalization combinations demonstrate that APT significantly improves forecasting performance under distribution shift.
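The timestamp-conditioned affine modulation can be sketched as a prototype lookup: each timestamp feature selects a (scale, shift) pair that modulates the series before it reaches the backbone. The hour-of-day bucketing and fixed prototype table below are hypothetical stand-ins for APT's learned prototypes, chosen only to make the mechanism concrete.

```python
def timestamp_prototype(hour, prototypes):
    """Pick the (scale, shift) prototype for a timestamp feature.
    Bucketing by hour-of-day is an illustrative stand-in for a
    learned timestamp-conditioned prototype lookup."""
    return prototypes[hour % len(prototypes)]

def affine_modulate(series, hours, prototypes):
    """Apply per-timestamp affine parameters to an input series,
    injecting a global (timestamp-driven) signal into normalization."""
    out = []
    for x, h in zip(series, hours):
        scale, shift = timestamp_prototype(h, prototypes)
        out.append(scale * x + shift)
    return out

# four illustrative prototypes; a real module would learn these
protos = [(1.0, 0.0), (1.5, 0.2), (0.8, -0.1), (1.2, 0.0)]
series = [10.0, 12.0, 9.0]
hours = [0, 1, 2]
print(affine_modulate(series, hours, protos))
```

Conditioning the affine parameters on timestamps rather than per-window statistics is what lets the module carry global distribution information that local mean/variance normalization misses.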
Accelerating Vision Transformers with Adaptive Patch Sizes
Choudhury, Rohan, Kim, JungEun, Park, Jinhyung, Yang, Eunho, Jeni, László A., Kitani, Kris M.
Vision Transformers (ViTs) partition input images into uniformly sized patches regardless of their content, resulting in long input sequence lengths for high-resolution images. We present Adaptive Patch Transformers (APT), which addresses this by using multiple different patch sizes within the same image. APT reduces the total number of input tokens by allocating larger patch sizes in more homogeneous areas and smaller patches in more complex ones. APT achieves a drastic speedup in ViT inference and training, increasing throughput by 40% on ViT-L and 50% on ViT-H while maintaining downstream performance. It can be applied to a previously fine-tuned ViT and converges in as little as 1 epoch. It also significantly reduces training and inference time without loss of performance in high-resolution dense visual tasks, achieving up to 30% faster training and inference in visual QA, object detection, and semantic segmentation. Our project page is available at this link.

Vision Transformers (ViTs) (Dosovitskiy et al., 2020) have become the dominant paradigm for visual recognition, but their scalability is limited by the quadratic cost of self-attention with respect to sequence length. Since inputs are divided into fixed-size patches, image resolution directly determines sequence length: higher-resolution images yield disproportionately long token sequences despite much higher redundancy. Many prior works have proposed solutions to this issue, typically by merging a fixed proportion of similar tokens (Bolya et al., 2022) or pruning uninformative ones with auxiliary predictors (Rao et al., 2021; Yin et al., 2022). While these reduce theoretical FLOPs, they face two drawbacks.
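The content-adaptive patching idea can be sketched as a quadtree split: keep one large patch where a region is homogeneous, and subdivide where it is not. The variance criterion, the threshold, and the recursive splitting below are illustrative assumptions, not APT's actual patch-allocation rule.

```python
def variance(region):
    n = len(region)
    mean = sum(region) / n
    return sum((v - mean) ** 2 for v in region) / n

def adaptive_patches(img, size, thresh, min_patch):
    """Recursively split a square image: one large patch where pixel
    variance is low (homogeneous), smaller patches where it is high.
    Returns (row, col, size) patches -- fewer tokens than a uniform
    grid whenever the image contains flat regions."""
    def values(r, c, s):
        return [img[r + i][c + j] for i in range(s) for j in range(s)]

    def split(r, c, s):
        if s <= min_patch or variance(values(r, c, s)) <= thresh:
            return [(r, c, s)]
        h = s // 2
        return (split(r, c, h) + split(r, c + h, h)
                + split(r + h, c, h) + split(r + h, c + h, h))

    return split(0, 0, size)

# 4x4 toy image: flat left half, noisy right half
img = [
    [0, 0, 5, 9],
    [0, 0, 9, 1],
    [0, 0, 2, 8],
    [0, 0, 7, 3],
]
patches = adaptive_patches(img, 4, thresh=1.0, min_patch=1)
print(len(patches))  # fewer than the 16 tokens of a uniform 1x1 grid
```

Each patch becomes one token, so the flat half of the image costs far fewer tokens than the noisy half, which is where the throughput gain comes from.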
Disentangling Score Content and Performance Style for Joint Piano Rendering and Transcription
Zeng, Wei, Zhao, Junchuan, Wang, Ye
Expressive performance rendering (EPR) and automatic piano transcription (APT) are fundamental yet inverse tasks in music information retrieval: EPR generates expressive performances from symbolic scores, while APT recovers scores from performances. Despite their dual nature, prior work has addressed them independently. In this paper we propose a unified framework that jointly models EPR and APT by disentangling note-level score content and global performance style representations from both paired and unpaired data. Our framework is built on a transformer-based sequence-to-sequence architecture and is trained using only sequence-aligned data, without requiring fine-grained note-level alignment. To automate the rendering process while ensuring stylistic compatibility with the score, we introduce an independent diffusion-based performance style recommendation module that generates style embeddings directly from score content. This modular component supports both style transfer and flexible rendering across a range of expressive styles. Experimental results from both objective and subjective evaluations demonstrate that our framework achieves competitive performance on EPR and APT tasks, while enabling effective content-style disentanglement, reliable style transfer, and stylistically appropriate rendering. Demos are available at https://jointpianist.github.io/epr-apt/
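The dual nature of the two tasks can be illustrated with a toy numeric sketch in which "style" is reduced to a single global tempo ratio: rendering applies it to a score, transcription inverts it. The real framework uses a transformer seq2seq model with learned note-level content and global style embeddings; everything below is a deliberately simplified assumption.

```python
def extract_style(score, performance):
    """Toy global 'style': the mean tempo ratio between performed and
    notated durations (a stand-in for a learned style embedding)."""
    ratios = [p / s for (_, s), (_, p) in zip(score, performance)]
    return sum(ratios) / len(ratios)

def render(score, style):
    """EPR direction: score content + style -> expressive performance."""
    return [(pitch, dur * style) for pitch, dur in score]

def transcribe(performance, style):
    """APT direction: performance - style -> recovered score content."""
    return [(pitch, dur / style) for pitch, dur in performance]

score = [(60, 1.0), (62, 0.5), (64, 0.5)]      # (MIDI pitch, beats)
perf = render(score, style=1.25)               # play 25% slower
recovered = transcribe(perf, style=1.25)       # the inverse task
print(recovered == score)
```

Once content and style are separated this way, style transfer is just rendering one score with the style extracted from a different performance, which is the role of the diffusion-based recommendation module in the paper.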