Lessons on Datasets and Paradigms in Machine Learning for Symbolic Computation: A Case Study on CAD