MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers