Integrating Form and Meaning: A Multi-Task Learning Model for Acoustic Word Embeddings