Reviews: Beyond Grids: Learning Graph Representations for Visual Recognition

Neural Information Processing Systems 

The paper proposes to learn graph representations from visual data via a graph convolutional unit (GCU). The GCU transforms a 2D feature map extracted from a neural network into a sample-dependent graph, where pixels with similar features are grouped into a vertex and edges measure the affinity between vertices in feature space. Graph convolutions then pass information along the edges of the graph and update the vertex features. Finally, the updated vertex features are projected back onto the 2D grid based on the pixel-to-vertex assignment. The GCU can be integrated into existing networks, allowing end-to-end training and capturing long-range dependencies among regions (vertices).
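To make the summarized pipeline concrete, here is a minimal sketch of the three steps (pixel-to-vertex assignment, graph convolution, reprojection) in plain Python. It is not the authors' implementation. The function name `gcu_forward`, the softmax assignment over negative squared distances, the assignment-weighted mean for vertex features, and the softmax-normalized dot-product adjacency are all assumptions chosen for illustration; the paper's exact formulation may differ.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain list-of-lists matrix product: (n x m) @ (m x p) -> (n x p).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def gcu_forward(pixels, anchors, weight):
    """Hypothetical GCU-style forward pass.

    pixels:  N x d flattened feature map (one row per pixel)
    anchors: K x d vertex anchors (assumed to be learned parameters)
    weight:  d x d_out graph convolution weight (assumed learned)
    """
    # Step 1: soft-assign each pixel to the K vertices (assumption: softmax
    # over negative squared Euclidean distance to each anchor).
    assign = [
        softmax([-sum((p - c) ** 2 for p, c in zip(px, anchor)) for anchor in anchors])
        for px in pixels
    ]
    assign_t = transpose(assign)  # K x N

    # Vertex features: assignment-weighted mean of pixel features (a simplification).
    vertices = [
        [v / sum(col) for v in row]
        for row, col in zip(matmul(assign_t, pixels), assign_t)
    ]

    # Step 2: edges measure vertex affinity in feature space (assumption:
    # row-wise softmax of dot products), followed by one graph convolution + ReLU.
    adjacency = [softmax(row) for row in matmul(vertices, transpose(vertices))]
    updated = [
        [max(0.0, v) for v in row]
        for row in matmul(matmul(adjacency, vertices), weight)
    ]

    # Step 3: project updated vertex features back to pixels via the assignment.
    return matmul(assign, updated), assign

# Toy input: four "pixels" forming two obvious groups, two anchors, identity weight.
pixels = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
anchors = [[0.0, 0.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
out, assign = gcu_forward(pixels, anchors, weight)
```

Because every step is a differentiable matrix operation, the same structure expressed in an autodiff framework could be trained end to end, which is consistent with the claim that the GCU can be dropped into existing networks.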