FAIR Enough: How Can We Develop and Assess a FAIR-Compliant Dataset for Large Language Models' Training?