Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage