A Parameter-Efficient Tuning Framework for Language-guided Object Grounding and Robot Grasping