What's Different between Visual Question Answering for Machine "Understanding" Versus for Accessibility?
