TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation

Ruiz, Alfredo Garrachón, de la Rosa, Tomás, Borrajo, Daniel

arXiv.org Artificial Intelligence 

Large language models (LLMs) have shown remarkable capabilities across a wide range of tasks, from natural language understanding to creative content generation. However, the computational cost of inference and the associated energy consumption present significant challenges. As the demand for AI applications continues to grow, these costs are expected to escalate, raising concerns about sustainability and accessibility (Wu et al., 2022).

This approach is orthogonal to other optimization techniques, and could remain applicable as LLMs continue to grow in size and capabilities. We also propose an algorithm to check and define the applicability of this technique in different domains, selecting the most appropriate set of function words, and analyzing the loss in performance as the percentage of saved tokens increases. Additionally, we provide an experimental evaluation in the context of general
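As a rough illustration of the idea behind TRIM (not the paper's actual implementation), the sketch below drops a chosen set of function words from a text and reports the fraction of tokens saved. The function-word set and the whitespace tokenization here are assumptions for demonstration only; the paper selects the set per domain.

```python
# Illustrative sketch of function-word removal for token reduction.
# FUNCTION_WORDS is a hypothetical set; the paper selects it per domain.
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "are"}

def trim_text(text: str) -> tuple[str, float]:
    """Remove function words and report the fraction of tokens saved."""
    tokens = text.split()
    kept = [t for t in tokens if t.lower() not in FUNCTION_WORDS]
    saved = 1 - len(kept) / len(tokens) if tokens else 0.0
    return " ".join(kept), saved

if __name__ == "__main__":
    trimmed, saved = trim_text("The cost of inference is expected to grow")
    print(trimmed)           # "cost inference expected grow"
    print(f"{saved:.0%}")    # "50%" of the tokens were removed
```

In practice the trade-off the abstract describes applies: the larger the function-word set, the higher the token savings, but the greater the risk of degraded output quality.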