Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making