Reinforcement Learning Outperforms Supervised Fine-Tuning: A Case Study on Audio Question Answering