Join-Chain Network: A Logical Reasoning View of the Multi-head Attention in Transformer
