A document processing pipeline for the construction of a dataset for topic modeling based on the judgments of the Italian Supreme Court