Enhancing Reliability across Short and Long-Form QA via Reinforcement Learning