Beyond BEV: Optimizing Point-Level Tokens for Collaborative Perception