Prompt-with-Me: in-IDE Structured Prompt Management for LLM-Driven Software Engineering

Ziyou Li, Agnia Sergeyuk, Maliheh Izadi

arXiv.org Artificial Intelligence 

Abstract--Large Language Models are transforming software engineering, yet prompt management in practice remains ad hoc, hindering reliability, reuse, and integration into industrial workflows. We present Prompt-with-Me, a practical solution for structured prompt management embedded directly in the development environment. The system automatically classifies prompts using a four-dimensional taxonomy encompassing intent, author role, software development lifecycle stage, and prompt type. To enhance prompt reuse and quality, Prompt-with-Me suggests language refinements, masks sensitive information, and extracts reusable templates from a developer's prompt library. Our taxonomy study of 1,108 real-world prompts demonstrates that modern LLMs can accurately classify software engineering prompts. Furthermore, our user study with 11 participants shows strong developer acceptance, with high usability (Mean SUS=73), low cognitive load (Mean NASA-TLX=21), and reported gains in prompt quality and efficiency through reduced repetitive effort. Lastly, we offer actionable insights for building the next generation of prompt management and maintenance tools for software engineering workflows.

Large Language Models (LLMs) such as GPT-4 and Claude are rapidly transforming modern software engineering (SE). Tools powered by these models have moved from experimental prototypes to integral components of everyday development workflows. For instance, industry reports on GitHub Copilot, a GPT-powered AI pair programmer, show that developers using the tool complete coding tasks up to 55% faster, while 85% report increased confidence in their code quality [1]. These numbers highlight not only the efficiency gains brought by LLMs but also their growing influence on developer experience and software quality. Despite the transformative role of LLMs in software engineering, the primary interface for human-LLM interaction, the prompt, remains surprisingly informal.
Prompts are typically written ad hoc, often with fragmented grammar or inconsistent wording, even though minor phrasing changes can significantly alter model behavior [2]. Unlike source code, prompts are rarely versioned, reviewed, or maintained across the software development lifecycle, leaving critical interactions with LLMs effectively unmanaged.
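To make the idea of structured prompt management concrete, the following is a minimal sketch of how a prompt could be annotated along the paper's four taxonomy dimensions and passed through a sensitive-information masking step. The dimension labels, field names, and masking rule here are illustrative assumptions, not the actual schema or rules used by Prompt-with-Me:

```python
import re
from dataclasses import dataclass

# Hypothetical label sets for the four taxonomy dimensions (intent, author
# role, SDLC stage, prompt type) -- placeholders, not the paper's taxonomy.
INTENTS = {"generate", "explain", "debug", "refactor"}
ROLES = {"developer", "tester", "architect"}
SDLC_STAGES = {"requirements", "design", "implementation",
               "testing", "maintenance"}
PROMPT_TYPES = {"zero-shot", "few-shot", "chain-of-thought"}


@dataclass
class PromptRecord:
    """A stored prompt annotated along the four taxonomy dimensions."""
    text: str
    intent: str
    role: str
    stage: str
    prompt_type: str


# Simple email masking, standing in for the tool's (unspecified here)
# sensitive-information masking rules.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def mask_sensitive(text: str) -> str:
    """Replace email addresses with a placeholder token."""
    return EMAIL_RE.sub("<EMAIL>", text)


record = PromptRecord(
    text=mask_sensitive("Fix the login bug; logs were sent to alice@example.com"),
    intent="debug",
    role="developer",
    stage="maintenance",
    prompt_type="zero-shot",
)
print(record.text)  # email address replaced by <EMAIL>
```

Annotating prompts with explicit, versionable metadata like this is what makes the reuse, review, and template extraction described in the abstract possible; the actual classification in Prompt-with-Me is performed by an LLM rather than by hand.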