Harmonizing Visual Text Comprehension and Generation