Generating Descriptions from Structured Data Using a Bifocal Attention Mechanism and Gated Orthogonalization