Loquetier: A Virtualized Multi-LoRA Framework for Unified LLM Fine-tuning and Serving
Neural Information Processing Systems
Low-Rank Adaptation (LoRA) has become a widely adopted parameter-efficient fine-tuning (PEFT) technique for adapting large language models (LLMs) to downstream tasks. While prior work has explored strategies for integrating LLM training and serving, a gap remains in unifying fine-tuning and inference for LoRA-based models.
Jun-12-2026, 10:05:54 GMT