A systematic review of challenges and proposed solutions in modeling multimodal data