Optimizing LLM Code Suggestions: Feedback-Driven Timing with Lightweight State Bounds

Awad, Mohammad Nour Al, Ivanov, Sergey, Tikhonova, Olga

arXiv.org Artificial Intelligence 

Abstract--Large Language Models (LLMs) have transformed code auto-completion by generating context-aware suggestions. Yet, deciding when to present these suggestions remains under-explored, often leading to interruptions or wasted inference calls. We propose an adaptive timing mechanism that dynamically adjusts the delay before offering a suggestion based on real-time developer feedback. Our method combines a logistic transform of recent acceptance rates with a bounded delay range, anchored by a high-level binary prediction of the developer's cognitive state. In a two-month deployment with professional developers, our system improved suggestion acceptance from 4.9% (no delay) to 15.4% (static delays) and to 18.6% (adaptive timing), while reducing blind rejections (suggestions rejected without being read) from 8.3% to 0.36%. Together, these improvements increase acceptance and reduce wasted inference calls by 75%, making LLM-based code assistants more efficient and cost-effective in practice.

Modern software development increasingly relies on AI-powered code assistants--most prominently LLM-based tools such as GitHub Copilot--which leverage massive pre-trained models to suggest context-aware completions and entire code snippets [1], [2]. These systems aim to boost productivity by reducing boilerplate and aiding API recall. Subsequent work on specialized code models (e.g., CodeBERT and CodeT5) has further improved completion accuracy and relevance [3]-[5]. Despite advances in what content to generate, the timing of suggestion delivery remains an underexplored yet critical factor.
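The core mechanism in the abstract--a logistic transform of the recent acceptance rate mapped into a bounded delay range, with the binary cognitive-state prediction biasing the result--can be sketched as follows. All parameter names, values, and the specific way the cognitive-state flag biases the delay are illustrative assumptions, not the paper's exact formulation:

```python
import math

def adaptive_delay(acceptance_rate, focused,
                   min_delay=0.2, max_delay=2.0,
                   steepness=8.0, midpoint=0.5):
    """Compute a suggestion delay (seconds) from recent feedback.

    acceptance_rate: fraction of recent suggestions accepted, in [0, 1].
    focused: binary prediction that the developer is in a focused state.
    All defaults are hypothetical; the paper does not publish them here.
    """
    # Logistic transform: high recent acceptance pushes s toward 1,
    # low acceptance pushes s toward 0.
    s = 1.0 / (1.0 + math.exp(-steepness * (acceptance_rate - midpoint)))

    # Interpolate within the bounded delay range: the more the developer
    # has been accepting suggestions, the shorter the delay.
    delay = max_delay - s * (max_delay - min_delay)

    # Anchor with the cognitive-state prediction: when the developer
    # appears focused, bias toward longer delays to avoid interrupting.
    if focused:
        delay *= 1.5

    # Clamp to the bounded range in all cases.
    return max(min_delay, min(max_delay, delay))
```

Under these assumptions, a developer who has been accepting most suggestions sees them sooner, a developer who has been rejecting them sees them later, and a predicted focused state lengthens the delay within the same bounds.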
