Multimodal Language Models See Better When They Look Shallower