Instant Visual Translation Example of instant visual translation, taken from the Google Blog. A more complex variation of this task called object detection involves specifically identifying one or more objects within the scene of the photograph and drawing a box around them. In 2014, there were an explosion of deep learning algorithms achieving very impressive results on this problem, leveraging the work from top models for object classification and object detection in photographs. Generally, the systems involve the use of very large convolutional neural networks for the object detection in the photographs and then a recurrent neural network like an LSTM to turn the labels into a coherent sentence.
Aug-6-2017, 00:30:31 GMT