NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation