Scaled and Inter-token Relation Enhanced Transformer for Sample-restricted Residential NILM
Rahman, Minhajur; Arafat, Yasir
Transformers have demonstrated exceptional performance across various domains due to their self-attention mechanism, which captures complex relationships in data. However, training on smaller datasets poses challenges, as standard attention mechanisms can over-smooth attention scores and overly prioritize intra-token relationships, reducing the capture of meaningful inter-token dependencies critical for tasks like Non-Intrusive Load Monitoring (NILM). To address this, we propose a novel transformer architecture with two key innovations: inter-token relation enhancement and dynamic temperature tuning. The inter-token relation enhancement mechanism removes diagonal entries in the similarity matrix to improve attention focus on inter-token relations. The dynamic temperature tuning mechanism, a learnable parameter, adapts attention sharpness during training, preventing over-smoothing and enhancing sensitivity to token relationships. We validate our method on the REDD dataset and show that it outperforms the original transformer and state-of-the-art models by 10-15% in F1 score across various appliance types, demonstrating its efficacy for training on smaller datasets.
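The two mechanisms are straightforward to express as a modified scaled dot-product attention. Below is a minimal sketch, not the authors' released code: the diagonal of the query-key similarity matrix is masked out before the softmax (inter-token relation enhancement), and the scores are divided by a learnable temperature (dynamic temperature tuning). The class name, single-head structure, and temperature initialization of 1.0 are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InterTokenAttention(nn.Module):
    """Single-head attention with diagonal masking and a learnable
    temperature; a sketch of the mechanisms described in the abstract."""

    def __init__(self, embed_dim: int):
        super().__init__()
        self.q_proj = nn.Linear(embed_dim, embed_dim)
        self.k_proj = nn.Linear(embed_dim, embed_dim)
        self.v_proj = nn.Linear(embed_dim, embed_dim)
        # Learnable temperature; the 1.0 init is an assumption.
        self.temperature = nn.Parameter(torch.ones(1))
        self.scale = embed_dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, embed_dim)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = q @ k.transpose(-2, -1) * self.scale  # (B, T, T)
        # Inter-token relation enhancement: mask the diagonal so each
        # token's attention mass goes only to *other* tokens.
        eye = torch.eye(x.size(1), dtype=torch.bool, device=x.device)
        scores = scores.masked_fill(eye, float("-inf"))
        # Dynamic temperature tuning: a small temperature sharpens the
        # attention distribution, a large one smooths it; the value is
        # learned jointly with the rest of the network.
        attn = F.softmax(scores / self.temperature.clamp(min=1e-4), dim=-1)
        return attn @ v


# Usage: a batch of 8 power-consumption windows, 128 timesteps, 64-dim tokens.
layer = InterTokenAttention(embed_dim=64)
out = layer(torch.randn(8, 128, 64))
print(out.shape)  # torch.Size([8, 128, 64])
```

Clamping the temperature away from zero keeps the division numerically stable; masking with negative infinity (rather than zeroing after the softmax) ensures the remaining inter-token weights still sum to one.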
arXiv.org Artificial Intelligence
Dec-6-2024
- Country:
  - Asia > Bangladesh (0.04)
  - North America > United States > California > San Diego County > San Diego (0.04)
- Genre:
  - Research Report (1.00)
- Industry:
  - Energy > Power Industry (0.68)
- Technology: