What Factors Affect Multi-Modal In-Context Learning? An In-Depth Exploration