OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit

Sep-30-2025–arXiv.org Artificial Intelligence

We present OnPrem$.$LLM, a Python-based toolkit for applying large language models (LLMs) to sensitive, non-public data in offline or restricted environments. The system is designed for privacy-preserving use cases and provides prebuilt pipelines for document processing and storage, retrieval-augmented generation (RAG), information extraction, summarization, classification, and prompt/output processing with minimal configuration. OnPrem$.$LLM supports multiple LLM backends -- including llama$.$cpp, Ollama, vLLM, and Hugging Face Transformers -- with quantized model support, GPU acceleration, and seamless backend switching. Although designed for fully local execution, OnPrem$.$LLM also supports integration with a wide range of cloud LLM providers when permitted, enabling hybrid deployments that balance performance with data control. A no-code web interface extends accessibility to non-technical users.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

Sep-30-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - North Dakota > Burke County (0.04)
  - Virginia > Alexandria County
    - Alexandria (0.04)

Genre:
- Research Report (0.43)

Industry:
- Information Technology > Services (0.30)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.70)
  - Natural Language > Large Language Model (1.00)