Learning Neural Sequence-to-Sequence Models from Weak Feedback with Bipolar Ramp Loss