R+X: Retrieval and Execution from Everyday Human Videos