BigDocs: An Open and Permissively-Licensed Dataset for Training Multimodal Models on Document and Code Tasks