Terminology-based Text Embedding for Computing Document Similarities on Technical Content