AI Methods for Understanding Images & Pictures
There are fundamental questions to be answered about the architecture of a visual system. For nearly two decades, the field has assumed that the visual system can be decomposed into independent modules, each performing a well-defined function, like estimating color, and that their outputs are integrated at a later stage.
Is this a valid hypothesis?
- Harry G. Barrow & J. M. Tenenbaum, Retrospective on "Interpreting Line Drawings as 3-Dimensional Surfaces"
Vision involves both the acquisition and processing of visual information. AI-powered technologies have made possible such astounding achievements as vehicles that are able to safely steer themselves along our superhighways, and computers that can recognize and interpret facial expressions. Just consider the complexity of the analysis your brain must perform to determine something as apparently simple as the fact that the black squares on a chessboard are not holes but part of the surface, and you'll have an idea of how sophisticated vision systems must be in order to reliably meet their objectives.
And there is so much more to vision than meets the eye. Fog or snow may obscure a portion of the road ahead of you (or imagine that you are scuba diving in murky waters...). Just as you are able to fill in the missing pieces of vaguely defined areas based on experience and your general knowledge of the environment, AI programs can enhance, interpret, recognize, identify, and otherwise process partial images.
AI vision technology has made possible applications such as image stabilization, 3D modeling, image synthesis, surgical navigation, handwritten document recognition, and vision-based computer interfaces. Follow the links below to see what vision projects AI scientists are currently working on.
Definition of the Field
"What exactly is computer vision then? Computer vision is a research field working to equip computers with the ability to process and understand visual data, as sighted humans can. Human brains process the gigabytes of data passing through our eyes every second and translate that data into sight - that is, into discrete objects and entities we can recognise or understand. Similarly, computer vision aims to give computers the ability to understand what they are seeing, and act intelligently on that knowledge."
- Computer vision: Cheat Sheet. ZDNet.com (December 6, 2011), by Natasha Lomas.