Mixture of Tokens: Efficient LLMs through Cross-Example Aggregation