AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment