Performance Review on LLM for solving leetcode problems

Wang, Lun, Shi, Chuanqi, Du, Shaoshui, Tao, Yiyi, Shen, Yixian, Zheng, Hang, Shen, Yanxin, Qiu, Xinyu

Mar-2-2025–arXiv.org Artificial Intelligence

This paper presents a comprehensive performance evaluation of Large Language Models (LLMs) in solving programming challenges from LeetCode, a widely used platform for algorithm practice and technical interviews. We began by crawling the LeetCode website to collect a diverse set of problems spanning various difficulty levels and topics. Using this dataset, we generated solutions with multiple LLMs, including GPT-4 and GPT-3.5-turbo (ChatGPT-turbo). The generated solutions were systematically evaluated for correctness and efficiency. We used the pass@k metric to measure success rates within a given number of attempts, and we analyzed the runtime performance of the solutions. Our results highlight the strengths and limitations of current LLMs [10] in code generation and problem-solving tasks, providing insights into their potential applications and areas for improvement in automated programming assistance.
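The abstract names the pass@k metric but does not say how it was computed. As a minimal sketch, the following assumes the commonly used unbiased estimator, 1 - C(n-c, k) / C(n, k), where n solutions are sampled per problem and c of them pass all tests. The function names `pass_at_k` and `mean_pass_at_k` and the sample counts are illustrative assumptions, not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem (assumed estimator, not confirmed by the paper).

    n: number of solutions sampled for the problem
    c: number of those solutions that passed all tests
    k: attempt budget being evaluated
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) counts the k-subsets made only of failing samples;
    # it is 0 when fewer than k samples failed, which gives pass@k = 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical counts: three problems, 10 samples each.
    sample_results = [(10, 3), (10, 10), (10, 0)]
    for k in (1, 5):
        print(f"pass@{k} = {mean_pass_at_k(sample_results, k):.3f}")
```

With k = 1 the estimator reduces to the plain fraction of passing samples, c / n, which makes it easy to sanity-check against raw counts.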

large language model, machine learning, natural language

Country:
- North America > United States > California (0.15)

Genre:
- Research Report > New Finding (0.49)

Industry:
- Information Technology > Security & Privacy (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.88)
  - Natural Language > Large Language Model (1.00)