Roles of Scaling and Instruction Tuning in Language Perception: Model vs. Human Attention