CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos