Cross-Modal Entity Matching for Visually Rich Documents