Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism