This post is part of the series in which we are going to cover the following topics. In the previous blog we discussed PyTorch, its strengths, and why you should learn it. We also had a brief look at Tensors, the core data structure in PyTorch. In this blog, we will jump into some hands-on examples of using the pre-trained networks available in the TorchVision module for Image Classification. The TorchVision package consists of popular datasets, model architectures, and common image transformations for computer vision.
PyTorch Image Models (timm) is a library for state-of-the-art image classification, containing a collection of image models, optimizers, schedulers, augmentations and much more; it was recently named the top trending library of 2021 on Papers with Code! Whilst there are an increasing number of low-code and no-code solutions which make it easy to get started with applying Deep Learning to computer vision problems, in my current role at Microsoft CSE we frequently engage with customers who wish to pursue custom solutions tailored to their specific problem, utilizing the latest and greatest innovations to exceed the performance level offered by these services. Due to the rate at which new architectures and training techniques are introduced into this rapidly moving field, whether you are a beginner or an expert, it can be difficult to keep up with the latest practices, and challenging to know where to start when approaching new vision tasks with the intention of reproducing results similar to those presented in academic benchmarks.
Whether I'm training from scratch or finetuning existing models on new tasks, and looking to leverage pre-existing components to speed up my workflow, timm is one of my favourite libraries for computer vision in PyTorch. However, whilst timm contains reference training and validation scripts for reproducing ImageNet training results, and its core components are covered in the official documentation and the timmdocs project, the sheer number of features that the library provides can make it difficult to know where to get started when applying these in custom use-cases. The purpose of this guide is to explore timm from a practitioner's point of view, focusing on how to use some of the features and components included in timm in custom training scripts.
The focus is not to explore how or why these concepts work, or how they are implemented in timm; for this, links to the original papers will be provided where appropriate, and I would recommend timmdocs to learn more about timm's internals. Additionally, this article is by no means exhaustive; the areas selected are based upon my personal experience using this library. All information here is based on timm 0.5.4, which was recently released at the time of writing. Whilst this article can be read in order, it may also be useful as a reference for a particular part of the library. For ease of navigation, a table of contents is presented below.
Recently, Facebook announced the availability of the latest version of PyTorch, PyTorch 1.6. The company also announced that Microsoft has expanded its participation in the PyTorch community and is taking ownership of the development and maintenance of the PyTorch build for Windows. PyTorch is one of the most popular machine learning libraries in Python. The version 1.6 release includes several new APIs, tools for performance improvement and profiling, as well as significant updates to both distributed data-parallel (DDP) and remote procedure call (RPC) based distributed training. According to the blog post, from this release onward, features will be classified as Stable, Beta and Prototype. Prototype features are not included in the binary distribution; instead, they are available by building from source, using nightlies, or via a compiler flag. Automatic mixed precision (AMP) training is now natively supported and is a stable feature.
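To make the AMP point concrete, here is a minimal sketch of a training loop using the `torch.cuda.amp` API (`autocast` and `GradScaler`). The tiny linear model, random data, and hyperparameters are placeholders chosen only for illustration. Mixed precision through this API applies on CUDA devices, so the sketch disables it and falls back to ordinary full-precision training when no GPU is present. Newer PyTorch releases also expose equivalents under `torch.amp` and may emit deprecation warnings for the names used here.

```python
import torch
from torch import nn

torch.manual_seed(0)

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"  # torch.cuda.amp only has an effect on a GPU

# Placeholder model and data, purely for illustration.
model = nn.Linear(16, 4).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
inputs = torch.randn(8, 16, device=device)
targets = torch.randint(0, 4, (8,), device=device)

# GradScaler scales the loss so small float16 gradients do not underflow.
# With enabled=False it becomes a transparent pass-through.
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

losses = []
for _ in range(3):
    optimizer.zero_grad()
    # Inside autocast, eligible ops run in reduced precision on the GPU.
    with torch.cuda.amp.autocast(enabled=use_amp):
        outputs = model(inputs)
        loss = loss_fn(outputs, targets)
    scaler.scale(loss).backward()  # backward pass on the scaled loss
    scaler.step(optimizer)         # unscales gradients, then optimizer.step()
    scaler.update()                # adjusts the scale factor for the next step
    losses.append(loss.item())

print(losses)
```

Only the forward pass and loss computation sit inside the `autocast` context; the backward pass and optimizer step go through the scaler instead.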
I plan to discuss interesting upcoming features, primarily from TorchVision and secondarily from the PyTorch ecosystem. My goal is to highlight new and in-development features and provide clarity on what's happening between releases. Though the format is likely to change over time, I initially plan to keep it bite-sized and offer references for those who want to dig deeper. Finally, instead of publishing articles at fixed intervals, I'll be posting when I have enough interesting topics to cover. Disclaimer: The features covered will be biased towards topics I'm personally interested in.