Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks