Swin Transformer is all you need for Computer Vision


The title might be a bit of a stretch, but let's look at why the Swin Transformer is a state-of-the-art architecture. It is a recent addition to the family of Transformer-based architectures for computer vision, and it has proved to be a game-changer in tasks such as image classification, object detection, and semantic segmentation. It relies on two ideas: Patch Merging, which builds a hierarchical representation, and shifted window-based self-attention, which reduces computational complexity. In this post, we'll dive deep into the concepts and workings of the Swin Transformer and discuss why it performs well on computer vision tasks.
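To make the two ideas a little more concrete before the deep dive, here is a minimal NumPy sketch, not the reference implementation. The function names, the random projection weights, and the example sizes (a 56x56 grid of patches, 96 channels, 7x7 windows) are assumptions chosen for illustration, and the layer normalization that usually precedes the projection in Patch Merging is left out. The two cost functions follow the complexity estimates for global and window-based self-attention on an h x w grid of patches with C channels and window size M.

```python
import numpy as np

def patch_merge(x, weight):
    """Merge every 2x2 group of neighbouring patches: (H, W, C) -> (H/2, W/2, 2C)."""
    # Stack the channels of the four patches in each 2x2 neighbourhood.
    merged = np.concatenate(
        [x[0::2, 0::2], x[1::2, 0::2], x[0::2, 1::2], x[1::2, 1::2]], axis=-1
    )  # shape (H/2, W/2, 4C)
    # Linear projection from 4C down to 2C channels.
    return merged @ weight

def global_attention_cost(h, w, c):
    """Approximate cost of self-attention over all h*w patches at once."""
    return 4 * h * w * c**2 + 2 * (h * w) ** 2 * c

def window_attention_cost(h, w, c, m):
    """Approximate cost when attention is restricted to m x m windows."""
    return 4 * h * w * c**2 + 2 * m**2 * h * w * c

# Illustrative sizes (assumed): 56x56 patches, 96 channels, 7x7 windows.
h, w, c, m = 56, 56, 96, 7
rng = np.random.default_rng(0)
x = rng.standard_normal((h, w, c))
weight = rng.standard_normal((4 * c, 2 * c))  # hypothetical, untrained weights

y = patch_merge(x, weight)
print(y.shape)  # (28, 28, 192)
print(global_attention_cost(h, w, c) / window_attention_cost(h, w, c, m))
```

Patch Merging halves the spatial resolution and doubles the channel count, which is what produces the hierarchy of feature maps. The global cost grows quadratically with the number of patches, while the windowed cost grows linearly for a fixed window size.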

