Application of data engineering approaches to address challenges in microbiome data for optimal medical decision-making