Constrained Reinforcement Learning with Smoothed Log Barrier Function