Render and Diffuse: Aligning Image and Action Spaces for Diffusion-based Behaviour Cloning