Reviews: Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding

Oct-7-2024, 10:58:55 GMT–Neural Information Processing Systems

This paper uses neural networks to parse visual scenes and language queries into a logical representation, which is then executed to compute the answer to the query on the scene. The logical representation is learned by combining direct supervision on a small number of traces with fine-tuning via end-to-end reinforcement learning. Advantages over existing approaches include a reduction in the number of training examples, a more interpretable inference process, and substantially increased accuracy. The overall approach shows great promise for improving the performance of neural architectures by incorporating a symbolic component, and for making them more robust, interpretable, and debuggable. I think this is a good direction for AI research.
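
To make the "logical representation computed on the scene" idea concrete, here is a minimal sketch of a symbolic executor. It is not the paper's implementation: the scene format, the operation names (`filter`, `count`, `query`), and the `execute` function are all hypothetical, chosen only to illustrate how a parsed question could be run deterministically against a parsed scene.

```python
# Hypothetical structured scene, standing in for the output of a neural scene parser.
scene = [
    {"shape": "cube", "color": "red", "size": "large"},
    {"shape": "sphere", "color": "blue", "size": "small"},
    {"shape": "cube", "color": "blue", "size": "small"},
]


def execute(program, scene):
    """Run a list of (op, *args) steps over the scene and return the answer.

    The three operations below are illustrative assumptions, not the
    operation set used in the reviewed paper.
    """
    state = scene
    for op, *args in program:
        if op == "filter":
            # Keep only objects whose attribute matches the given value.
            key, value = args
            state = [obj for obj in state if obj[key] == value]
        elif op == "count":
            # Reduce the current object set to its size.
            state = len(state)
        elif op == "query":
            # Read one attribute from a uniquely identified object.
            if len(state) != 1:
                raise ValueError("query expects exactly one object")
            state = state[0][args[0]]
        else:
            raise ValueError(f"unknown operation: {op}")
    return state


# Hypothetical program for "How many blue things are there?",
# standing in for the output of a neural question parser.
program = [("filter", "color", "blue"), ("count",)]
answer = execute(program, scene)
```

Because every intermediate `state` is an explicit set of objects, a wrong answer can be traced to a specific step, which is the kind of interpretability and debuggability the review highlights.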

disentangling reasoning, logical representation, neural-symbolic vqa

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks (0.59)
  - Inductive Learning (0.59)