Instruction-Guided Fusion of Multi-Layer Visual Features in Large Vision-Language Models