In this post, I will be showing a simple example of object recognition in images using the TensorFlow library from Smalltalk. Whenever you start entering the world of AI and Machine Learning you will notice immediately that Python has been widely accepted as the "default" programming language for these topics. I am not against Python and I believe that people are using it for a reason. However, I do believe that providing alternatives is a good thing, too. And Smalltalk could be that alternative you are looking for.
In this work, orientation detection using Deep Learning is addressed for a particularly vulnerable class of road users: cyclists. Knowing a cyclist's orientation is highly relevant because it gives a good indication of their future trajectory, which is crucial for avoiding accidents in the context of intelligent transportation systems. Using Transfer Learning with pre-trained models and TensorFlow, we present a performance comparison between the main object detection algorithms reported in the literature, namely SSD, Faster R-CNN and R-FCN, combined with the MobilenetV2, InceptionV2, ResNet50 and ResNet101 feature extractors. Moreover, we propose multi-class detection with eight different classes according to orientation. To do so, we introduce a new dataset called "Detect-Bike", containing 20,229 cyclist instances over 11,103 images, labeled by cyclist orientation. The same Deep Learning methods used for detection are then trained to determine the target's heading. Our experimental results and extensive evaluation showed satisfactory performance for all of the studied methods in detecting cyclists and their orientation. In particular, Faster R-CNN with ResNet50 proved to be precise but significantly slower. Meanwhile, SSD with InceptionV2 provided a good trade-off between precision and execution time, and is to be preferred for real-time embedded applications.
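To make the idea of eight orientation classes concrete, here is a minimal sketch of how a continuous heading angle could be mapped to one of eight classes. It is an illustration only: the abstract does not say how the classes are defined, so the equal 45-degree sectors, the compass-style class names, and the function name `orientation_class` are all assumptions.

```python
# Hypothetical class names: the source states that there are eight
# orientation classes but does not list them.
ORIENTATION_CLASSES = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def orientation_class(heading_deg: float) -> str:
    """Map a heading in degrees (counter-clockwise from east) to a class.

    Assumes eight equal 45-degree sectors, each centred on its nominal
    direction (0, 45, 90, ... degrees).
    """
    n = len(ORIENTATION_CLASSES)
    sector_width = 360.0 / n
    # Shift by half a sector so that, for example, 350 degrees still
    # falls into the sector centred on 0 degrees.
    shifted = (heading_deg + sector_width / 2.0) % 360.0
    # The final modulo guards against floating-point edge cases.
    index = int(shifted // sector_width) % n
    return ORIENTATION_CLASSES[index]


if __name__ == "__main__":
    for angle in (0, 90, 350, -90):
        print(angle, orientation_class(angle))
```

With a binning of this kind, orientation estimation becomes an ordinary multi-class detection problem, so the same detectors can be trained on it without changes to their architecture.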
Attention readers: We invite you to access the corresponding Python code and IPython notebook for this article on GitHub. Image classification can perform some pretty amazing feats, but a large drawback of many image classification applications is that the model can only detect one class per image. With an object detection model, not only can you detect multiple classes in one image, but you can also specify exactly where each object is in the image, with a bounding box framing it. The TensorFlow Models GitHub repository has a large variety of pre-trained models for various machine learning tasks, and one excellent resource is its object detection API. The object detection API makes it extremely easy to train your own object detection model for a large variety of applications.
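The difference between a single class label and a set of detections can be sketched in plain Python. This is not the object detection API's own data format. The `Detection` class, the pixel-corner box layout, the 0.5 score threshold, and the helper names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Box corners in pixels: (x_min, y_min, x_max, y_max). Assumed layout.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Detection:
    label: str    # predicted class name
    score: float  # confidence in [0, 1]
    box: Box      # where the object is in the image


def keep_confident(detections: List[Detection],
                   min_score: float = 0.5) -> List[Detection]:
    """Drop detections whose confidence is below min_score."""
    return [d for d in detections if d.score >= min_score]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 if they do not overlap."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# One image can yield several detections of different classes.
detections = [
    Detection("person", 0.92, (10, 20, 110, 220)),
    Detection("bicycle", 0.81, (40, 120, 200, 260)),
    Detection("dog", 0.30, (0, 0, 50, 50)),
]

if __name__ == "__main__":
    for d in keep_confident(detections):
        print(d.label, d.score, d.box)
```

Intersection over union is the usual way to compare a predicted box with a labeled one, which is why it appears here alongside the score filter.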
Surveillance is an integral part of security and patrol. For the most part, the job entails extended periods of watching for something undesirable to happen. It is crucial that we do this, but it is also a very mundane task. Wouldn't life be much simpler if there were something that could do the "watching and waiting" for us? With the advancements in technology over the past few years, we can write scripts to automate these tasks, and rather easily at that. Anyone familiar with Deep Learning will know that image classifiers have surpassed human-level accuracy.
In recent years, neural networks and deep learning have sparked tremendous progress in the fields of natural language processing (NLP) and computer vision. While many face, object, landmark, logo, and text recognition and detection technologies are provided for Internet-connected devices, we believe that the ever-increasing computational power of mobile devices can deliver these technologies into the hands of users anytime, anywhere, regardless of Internet connection. However, computer vision for on-device and embedded applications faces many challenges: models must run quickly and with high accuracy in a resource-constrained environment, making use of limited computation, power, and space. TensorFlow offers various pre-trained models, including drag-and-drop models, that can identify approximately 1,000 default objects. Compared with similar models, such as the Inception models, MobileNet offers a better trade-off among latency, size, and accuracy.