Provable test-time adaptivity and distributional robustness of in-context learning