Image Understanding

Neural Style Transfer


Leon Gatys et al. introduced the Neural Style Transfer technique in 2015 in "A Neural Algorithm of Artistic Style". Neural Style Transfer (NST) refers to a class of software algorithms that manipulate digital images or videos to adopt the appearance or visual style of another image. NST algorithms are characterized by their use of deep neural networks for image transformation. If you want to go deeper into the original technique, you can refer to the original paper.
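To make the idea of "style" more concrete: in the Gatys et al. paper, the style of a layer is summarized by the Gram matrix of its feature maps, and the style loss compares the Gram matrices of the generated and style images. The following is a minimal sketch in plain Python, assuming the feature maps have already been extracted and flattened; the function names and the toy inputs are illustrative, not taken from the excerpt above. A real implementation would use a tensor library and a pre-trained CNN.

```python
def gram_matrix(features):
    """Return the C x C matrix of inner products between channels.

    `features` is a list of C channels, each a flat list of N activations
    (a feature map with its spatial dimensions flattened).
    """
    return [
        [sum(a * b for a, b in zip(fi, fj)) for fj in features]
        for fi in features
    ]


def style_loss(generated_features, style_features):
    """Per-layer style loss: squared Gram difference scaled by 1 / (4 C^2 N^2)."""
    c = len(generated_features)      # number of channels
    n = len(generated_features[0])   # number of spatial positions
    g = gram_matrix(generated_features)
    a = gram_matrix(style_features)
    squared_diff = sum(
        (g[i][j] - a[i][j]) ** 2 for i in range(c) for j in range(c)
    )
    return squared_diff / (4.0 * c ** 2 * n ** 2)


# Toy example (hypothetical values): 2 channels, 2 spatial positions each.
generated = [[1.0, 2.0], [3.0, 4.0]]
style = [[1.0, 0.0], [0.0, 1.0]]
loss = style_loss(generated, style)
```

The loss is zero when both images produce identical Gram matrices, which is why minimizing it pushes the generated image toward the texture statistics of the style image rather than its exact pixel layout.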

Run image classification with Amazon SageMaker JumpStart


Last year, AWS announced the general availability of Amazon SageMaker JumpStart, a capability of Amazon SageMaker that helps you quickly and easily get started with machine learning (ML). JumpStart hosts 196 computer vision models, 64 natural language processing (NLP) models, 18 pre-built end-to-end solutions, and 19 example notebooks to help you get started with using SageMaker. These models can be quickly deployed and are pre-trained open-source models from PyTorch Hub and TensorFlow Hub. These models solve common ML tasks such as image classification, object detection, text classification, sentence pair classification, and question answering. The example notebooks show you how to use the 17 SageMaker built-in algorithms and other features of SageMaker.

Image Classification Fulltraining


Welcome to our end-to-end example of the distributed image classification algorithm. In this demo, we will use the Amazon SageMaker image classification algorithm to train on the Caltech-256 dataset. To get started, we need to set up the environment with a few prerequisite steps for permissions, configurations, and so on. Here we set up the linkage and authentication to AWS services. We then download the data and transfer it to S3 for use in training.

Attention based CNN for Image Classification


This post implements the deep learning attention-based classification model proposed in the paper "Learn To Pay Attention", published at the ICLR 2018 conference. The basic idea behind attention models is to focus on the parts of a problem that are important. Such a model was introduced in 2014 and mainly focused on solving NLP problems, but it was eventually found to be useful in computer vision as well. In "Learn To Pay Attention", Jetley et al. used an attention-based mechanism to solve a simple image classification problem. The most important concept discussed in the paper is the 'attention map', a matrix of scalar values that represents the activations of different locations of an image with respect to a target. With the help of attention maps, the CNN eventually learns which parts of an image are important for a particular task.
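As a rough illustration of how an attention map can be computed, the sketch below scores each spatial location of a feature map against a global feature vector with a dot product and normalizes the scores with a softmax. This is a simplified sketch under stated assumptions, not the paper's full architecture: the function names, the dot-product scoring choice, and the toy vectors are all illustrative.

```python
import math


def attention_map(local_features, global_feature):
    """Return one attention weight per spatial location.

    `local_features` is a list of n vectors (one per location) and
    `global_feature` is a single vector of the same dimension.
    """
    # Compatibility score: dot product between each local vector and g.
    scores = [
        sum(l * g for l, g in zip(vec, global_feature))
        for vec in local_features
    ]
    # Softmax over locations (subtract the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attended_descriptor(local_features, weights):
    """Attention-weighted sum of the local feature vectors."""
    dim = len(local_features[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, local_features))
        for d in range(dim)
    ]


# Toy example (hypothetical values): three locations with 2-D features.
local = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
g = [2.0, 0.0]
weights = attention_map(local, g)
descriptor = attended_descriptor(local, weights)
```

Locations whose features align with the global vector receive larger weights, so the resulting descriptor is dominated by the image regions that matter for the target.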

Deploy Computer Vision Flask Web App using Python in CLOUD


Image processing and classification is one of the areas of data science, with a wide variety of applications across industry. We start the course by learning scikit-image for image processing, which is an essential skill, and then apply the necessary preprocessing and feature extraction techniques, such as HOG, to an image. After that we start building the project. In this course you will learn how to label images and how to preprocess and analyze image data using scikit-image and Python. We then train a machine learning model, using a Stochastic Gradient Descent classifier for image classification, followed by model evaluation and building a pipeline for the machine learning model.

Python for Art -- Fast Neural Style Transfer using TensorFlow 2


In this final step, we are going to visualize the final result to see the before and after of our snapshot. We will call the visualize function that we defined in the functions step. As you can see, our character in the original shot blends well with the painting. The neural style transfer has definitely given the photo more depth and emotion. Let me know your thoughts.

Deploying ML Models to the Edge using Azure DevOps


Training ML models from scratch and exporting them in an optimized form for edge devices is quite challenging, especially for a beginner in the ML space. Azure Cognitive Services takes on much of the heavy lifting for common problems such as image classification and speech recognition. In this article, I will show you how I created a simple pipeline (a kind of MLOps) that deploys the model to an edge device, leveraging Azure IoT Modules and Azure DevOps Services. 1. Blob Storage – For storing images for ML training 2. Logic Apps – To respond to Blob Storage upload events and trigger a POST REST API call to Azure Pipelines 3. Cognitive Services – For training on images and generating an optimized model specifically for edge devices. Containerized Azure DevOps agents will be running inside this, orchestrated using the K3s Kubernetes distribution.

Gastric Histopathology Image Classification by Transformer, GasHis-Transformer


GasHis-Transformer is a model for gastric histopathological image classification (GHIC), which automatically classifies microscopic images of the stomach into normal and abnormal cases in gastric cancer diagnosis, as shown in the figure. GasHis-Transformer is a multi-scale image classification model that combines the best features of Vision Transformer (ViT) and CNN, where ViT is good for global information and CNN is good for local information. GasHis-Transformer consists of two important modules, the Global Information Module (GIM) and the Local Information Module (LIM), as shown in the figure below. GasHis-Transformer has high classification performance on the test data of the gastric histopathology dataset, with estimated precision, recall, F1-score, and accuracy of 98.0%, 100.0%, 96.0%, and 98.0%, respectively.
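For readers unfamiliar with the four reported metrics, the sketch below shows how precision, recall, F1-score, and accuracy are conventionally derived from the confusion counts of a binary classifier (for example, abnormal versus normal). The function name and the counts are hypothetical and are not taken from the GasHis-Transformer experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=8, fp=2, fn=0, tn=10)
```

In a diagnostic setting, recall on the abnormal class is often the metric to watch most closely, since a false negative corresponds to a missed case.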

Visual Probing: Cognitive Framework for Explaining Self-Supervised Image Representations

Recently introduced self-supervised methods for image representation learning provide on par or superior results to their fully supervised competitors, yet the corresponding efforts to explain the self-supervised approaches lag behind. Motivated by this observation, we introduce a novel visual probing framework for explaining the self-supervised models by leveraging probing tasks employed previously in natural language processing. The probing tasks require knowledge about semantic relationships between image parts. Hence, we propose a systematic approach to obtain analogs of natural language in vision, such as visual words, context, and taxonomy. Our proposal is grounded in Marr's computational theory of vision and concerns features like textures, shapes, and lines. We show the effectiveness and applicability of those analogs in the context of explaining self-supervised representations. Our key findings emphasize that relations between language and vision can serve as an effective yet intuitive tool for discovering how machine learning models work, independently of data modality. Our work opens a plethora of research pathways towards more explainable and transparent AI.

X-ray Image Classification and Model Evaluation


Kaggle is a wonderful source of chest X-ray image datasets for pneumonia and normal cases. There are significant differences between a normal X-ray and an affected X-ray. Machine learning can play a pivotal role in detecting the disease, significantly shortening diagnosis time and reducing human effort. I was motivated by earlier work on the cats and dogs datasets and reused its code block for the dataset pipeline. First we need to import the necessary packages.