Smarter, not Bigger: Fine-Tuned RAG-Enhanced LLMs for Automotive HIL Testing
Chao Feng, Zihan Liu, Siddhant Gupta, Gongpei Cui, Jan von der Assen, Burkhard Stiller
arXiv.org Artificial Intelligence
Hardware-in-the-Loop (HIL) testing is essential for automotive validation but suffers from fragmented and underutilized test artifacts. This paper presents HIL-GPT, a retrieval-augmented generation (RAG) system integrating domain-adapted large language models (LLMs) with semantic retrieval. HIL-GPT leverages embedding fine-tuning using a domain-specific dataset constructed via heuristic mining and LLM-assisted synthesis, combined with vector indexing for scalable, traceable test case and requirement retrieval. Experiments show that fine-tuned compact models, such as bge-base-en-v1.5, achieve a superior trade-off between accuracy, latency, and cost compared to larger models, challenging the notion that bigger is always better. An A/B user study further confirms that RAG-enhanced assistants improve perceived helpfulness, truthfulness, and satisfaction over general-purpose LLMs. These findings provide insights for deploying efficient, domain-aligned LLM-based assistants in industrial HIL environments.
Dec-1-2025
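The retrieval step the abstract describes, embedding test artifacts with a fine-tuned compact model and ranking them by semantic similarity, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy 2-dimensional vectors stand in for embeddings that would in practice come from a model such as bge-base-en-v1.5, and a production system would use a dedicated vector index rather than a brute-force scan.

```python
import numpy as np

def cosine_top_k(query_vec: np.ndarray, doc_matrix: np.ndarray, k: int = 3):
    """Return indices and scores of the k documents most similar to the query.

    query_vec:  (d,) embedding of the user's query.
    doc_matrix: (n, d) matrix of document embeddings, one row per artifact.
    """
    # Normalize so the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q
    top = np.argsort(-scores)[:k]  # indices of the k highest-scoring docs
    return top, scores[top]

# Toy stand-ins for embedded HIL test cases and requirements.
docs = np.array([
    [1.0, 0.0],   # e.g. "CAN bus fault injection test"
    [0.0, 1.0],   # e.g. "cabin climate requirement"
    [0.7, 0.7],   # e.g. "bus load stress test"
])
query = np.array([1.0, 0.1])  # a query close in meaning to doc 0

top, scores = cosine_top_k(query, docs, k=2)
print(top)  # most similar document indices first
```

The retrieved artifacts would then be placed into the LLM's context window (the "augmented generation" half of RAG), which is what gives the assistant traceable, domain-grounded answers.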