Pic@Point: Cross-Modal Learning by Local and Global Point-Picture Correspondence