Improved Regret Bounds for Online Fair Division with Bandit Learning