VSE-ens: Visual-Semantic Embeddings with Efficient Negative Sampling