Reviews: High-Order Attention Models for Visual Question Answering

Neural Information Processing Systems 

I would suggest simplifying this figure so that it shows more clearly how the unary, pairwise, and ternary potentials are generated.