VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception