LoRATv2: Enabling Low-Cost Temporal Modeling in One-Stream Trackers

Jun-13-2026, 20:22:35 GMT–Neural Information Processing Systems

Transformer-based algorithms, such as LoRAT, have significantly enhanced object-tracking performance. However, these approaches rely on a standard attention mechanism whose cost grows quadratically with the number of tokens, making real-time inference computationally expensive. In this paper, we introduce LoRATv2, a novel tracking framework that addresses these limitations with three main contributions. First, LoRATv2 integrates frame-wise causal attention, which keeps full self-attention within each frame while allowing only causal dependencies across frames, significantly reducing computational overhead. Moreover, key-value (KV) caching is employed to efficiently reuse past embeddings for further speedup.
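
The frame-wise causal pattern described above can be illustrated with a small attention mask. This is a minimal sketch under our own assumptions, not the paper's implementation: the function name `frame_causal_mask` and the per-frame token counts are hypothetical, and the mask only encodes the rule stated in the abstract (full attention inside a frame, causal attention across frames).

```python
def frame_causal_mask(frame_lengths):
    """Return mask[q][k] == True when query token q may attend to key token k.

    frame_lengths is a hypothetical list of token counts, one per frame,
    ordered from the oldest frame to the newest.
    """
    # Label every token with the index of the frame it belongs to.
    frame_of = [f for f, n in enumerate(frame_lengths) for _ in range(n)]
    # A query sees all keys in its own frame (full self-attention)
    # and all keys in earlier frames (causal across frames), never later ones.
    return [[fk <= fq for fk in frame_of] for fq in frame_of]


# Two frames: the first with 2 tokens, the second with 3 tokens.
mask = frame_causal_mask([2, 3])
for row in mask:
    print("".join("1" if allowed else "0" for allowed in row))
# prints:
# 11000
# 11000
# 11111
# 11111
# 11111
```

Under this mask the rows for an earlier frame never depend on later frames, so their keys and values stay valid when a new frame arrives; this is the property that makes KV caching of past frames possible.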

artificial intelligence, machine learning, proceedings
