What can online reinforcement learning with function approximation benefit from general coverage conditions?