Toucan: Token-Aware Character Level Language Modeling