Characterizing Multimodal Long-form Summarization: A Case Study on Financial Reports