Offline Reinforcement Learning with Realizability and Single-policy Concentrability