OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction