Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation