It is possible to train a neural network on its own to answer questions about a scene by feeding it millions of examples as training data. But a human child doesn't need such a vast amount of data to grasp what a new object is or how it relates to other objects. A network trained that way also has no real understanding of the concepts involved; it is just a vast pattern-matching exercise. Such a system would therefore be prone to making very silly mistakes when faced with new scenarios. This is a common problem with today's neural networks, and it underpins shortcomings that are easily exposed (see "AI's language problem").
Oct-4-2019, 09:37:58 GMT