TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation