Distributed Inference and Fine-tuning of Large Language Models Over The Internet

Oct-10-2024, 15:27:04 GMT – Neural Information Processing Systems

Large language models (LLMs) are useful in many NLP tasks and become more capable with size, with the best open-source models having over 50 billion parameters. However, using these 50B+ models requires high-end hardware, making them inaccessible to most researchers. In this work, we investigate methods for cost-efficient inference and fine-tuning of LLMs, comparing local and distributed strategies. We observe that a large enough model (50B+) can run efficiently even on geodistributed devices in a consumer-grade network. This could allow running LLMs efficiently by pooling together idle compute resources of multiple research groups and volunteers.
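To make the hardware claim concrete, the following is a rough back-of-the-envelope sketch in Python, not a method from the paper. It counts only the memory needed to store the weights and ignores activations, attention caches, and optimizer state. The function names, the 2 bytes per parameter (16-bit weights), and the 24 GB device size are illustrative assumptions.

```python
import math


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the model weights alone, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_devices(num_params: float, bytes_per_param: float, device_memory_gb: float) -> int:
    """Lower bound on the number of devices needed just to store the weights.

    Assumes weights are split evenly and nothing else occupies device memory,
    so real deployments would need more devices than this.
    """
    return math.ceil(weight_memory_gb(num_params, bytes_per_param) / device_memory_gb)


# Assumed example: a 50B-parameter model stored in 16-bit precision.
total_gb = weight_memory_gb(50e9, 2)   # 100.0 GB of weights
devices = min_devices(50e9, 2, 24)     # 5 devices with 24 GB each (hypothetical size)
print(total_gb, devices)
```

Under these assumptions the weights alone exceed the memory of any single consumer-grade accelerator, which is consistent with the motivation for pooling several devices.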

inference and fine-tuning, internet, language model

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)