Text2Vis: A Challenging and Diverse Benchmark for Generating Multimodal Visualizations from Text