Quantification and object perception in Multimodal Large Language Models deviate from human linguistic cognition