The Shape of Learning: Anisotropy and Intrinsic Dimensions in Transformer-Based Models