Provable Offline Reinforcement Learning for Structured Cyclic MDPs