Reinforcement Learning with Intrinsically Motivated Feedback Graph for Lost-sales Inventory Control