LLM Agents for Generating Microservice-based Applications: how complex is your specification?
arXiv.org Artificial Intelligence
In this paper, we evaluate the capabilities of LLM Agents in generating code for real-world problems. Specifically, we explore code synthesis for microservice-based applications, a widely used architectural pattern. We define a standard template for specifying these applications and propose a metric for scoring the difficulty of a specification: the higher the score, the harder it is to generate code for that specification. Our experimental results show that agents using strong LLMs (like GPT-3o-mini) do fairly well on medium-difficulty specifications but poorly on those of higher difficulty. The drop is due to the more intricate business logic, greater use of external services, database integration, and non-functional capabilities such as authentication that the harder specifications involve. We analyze the errors in LLM-synthesized code and report the key challenges LLM Agents face in generating code for these specifications. Finally, we show that a fine-grained approach to code generation improves the correctness of the generated code.
Oct-28-2025
- Genre:
- Research Report > New Finding (1.00)
- Industry:
- Consumer Products & Services > Restaurants (0.47)
- Information Technology (0.35)