RGB-D Grasp Detection via Depth Guided Learning with Cross-modal Attention