Federated learning with hierarchical clustering of local updates to improve training on non-IID data
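
The idea named in the title can be illustrated with a minimal sketch. It assumes each client's local update has been flattened into a vector, and it groups clients by agglomerative clustering before averaging within each group. The Euclidean metric, the average linkage, the `threshold` hyperparameter, and all function names below are illustrative assumptions, not details taken from the source.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two flattened update vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cluster_updates(updates, threshold):
    """Agglomerative (average-linkage) clustering of client updates.

    updates:   dict of client id -> flattened update vector (list of floats)
    threshold: merging stops once the closest pair of clusters is farther
               apart than this (hypothetical hyperparameter)
    Returns a sorted list of clusters, each a sorted list of client ids.
    """
    clusters = [[cid] for cid in sorted(updates)]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        total = sum(euclidean(updates[i], updates[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters under the linkage criterion.
        (i, j), dist = min(
            (((i, j), linkage(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break  # remaining clusters are too dissimilar to merge
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


def average_per_cluster(updates, clusters):
    """Unweighted mean update for each cluster (one model per cluster)."""
    means = []
    for members in clusters:
        dim = len(updates[members[0]])
        means.append([
            sum(updates[cid][d] for cid in members) / len(members)
            for d in range(dim)
        ])
    return means


# Toy data: two groups of clients whose updates point in different directions,
# standing in for clients with differently distributed (non-IID) local data.
updates = {
    "a": [1.0, 0.0],
    "b": [0.8, 0.0],
    "c": [0.0, 1.0],
    "d": [0.0, 1.2],
}
clusters = cluster_updates(updates, threshold=0.5)
cluster_means = average_per_cluster(updates, clusters)
```

On this toy input, clients `a` and `b` end up in one cluster and `c` and `d` in another, so each group gets its own averaged update instead of a single global average that would blur the two directions. A real system would more likely use a library routine for the clustering step and weight each client by its number of local samples. Both choices are left out here to keep the sketch dependency-free.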