Multimodal Structured Generation: CVPR's 2nd MMFM Challenge Technical Report