Towards Federated Learning with On-device Training and Communication in 8-bit Floating Point