Long Sequence Modeling with Attention Tensorization: From Sequence to Tensor Learning