A Comparison of Document Similarity Algorithms