Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback