Fine-tuning LLaMA 2 inference: a comparative study of language implementations for optimal efficiency

Hossain, Sazzad; Seyam, Touhidul Alam; Chowdhury, Avijit; Xamidov, Munis; Ghose, Rajib; Pathak, Abhijit

arXiv.org Artificial Intelligence 

This paper conducts a comparative investigation to maximize the effectiveness of LLaMA 2 inference, a critical task in machine learning and natural language processing (NLP). Various programming languages and frameworks, including TensorFlow, PyTorch, Python, Mojo, C++, and Java, are examined, assessing their speed, memory consumption, and ease of implementation through extensive testing and benchmarking. The advantages and disadvantages of each strategy are noted, with suggested optimization methods for parallel processing and hardware utilization. Additionally, the performance of the Mojo SDK, a novel framework designed for LLM inference on Apple Silicon, is investigated, comparing it against established implementations in C, C++, Rust, Zig, Go, and Julia. Through comprehensive benchmarking on an Apple M1 Max, Mojo SDK's competitive performance and its advantages in ease of use and Python compatibility are demonstrated, suggesting it is a compelling alternative for LLM inference on Apple Silicon. Implications for the future of LLM deployment on resource-limited hardware and potential avenues for further research are discussed.
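The abstract describes benchmarking inference implementations on speed and memory consumption. As a minimal sketch of what such a harness might look like, the following measures best-of-N latency, derived tokens per second, and peak Python-heap allocation for an arbitrary inference callable. The `dummy_infer` function is a hypothetical stand-in, not the paper's actual implementation; real measurements would wrap a C/C++/Mojo/Rust binary and likely use OS-level memory counters rather than `tracemalloc`.

```python
import time
import tracemalloc


def benchmark(infer_fn, prompt, n_tokens, runs=3):
    """Time an inference callable and record peak Python-heap memory.

    Returns best-of-`runs` latency, tokens/sec, and peak allocated bytes.
    `infer_fn` is any callable taking (prompt, n_tokens); this is an
    illustrative harness, not the paper's benchmarking code.
    """
    timings = []
    tracemalloc.start()
    for _ in range(runs):
        start = time.perf_counter()
        infer_fn(prompt, n_tokens)
        timings.append(time.perf_counter() - start)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    best = min(timings)  # best-of-N reduces scheduler noise
    return {
        "best_latency_s": best,
        "tokens_per_sec": n_tokens / best,
        "peak_mem_bytes": peak,
    }


def dummy_infer(prompt, n_tokens):
    # Placeholder "model": emits one fake token per loop iteration.
    return " ".join(f"tok{i}" for i in range(n_tokens))


result = benchmark(dummy_infer, "Hello", n_tokens=256)
print(f"{result['tokens_per_sec']:.0f} tok/s, "
      f"peak {result['peak_mem_bytes']} B")
```

Comparing implementations across languages then reduces to pointing the same prompt and token budget at each backend and tabulating the returned metrics.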