Porting an LLM based Application from ChatGPT to an On-Premise Environment

📅 2025-04-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
The AIPA procurement evaluation system faces critical challenges—including data privacy risks, regulatory noncompliance, and opaque decision-making—when deployed on public cloud infrastructure. Method: This paper proposes the first full-stack, on-premises migration framework tailored for LLM-based applications. It integrates open-weight foundation models (Llama/Mistral), lightweight inference engines (vLLM/Ollama), a private Kubernetes cluster, and an end-to-end data anonymization and audit toolkit to ensure regulatory compliance, model interpretability, and cost efficiency. Contribution/Results: The framework enables successful local deployment with sub-800ms average inference latency, achieves a 37% reduction in hardware expenditure, and passes preliminary compliance assessments against GDPR and ISO/IEC 27001. This work establishes a reusable methodology and engineering paradigm for deploying LLMs in high-sensitivity domains requiring strict data sovereignty and transparency.

📝 Abstract
Given the data-intensive nature of Machine Learning (ML) systems in general, and Large Language Models (LLMs) in particular, using them in cloud-based environments can become a challenge due to legislation on data privacy and security. Taking such aspects into consideration implies porting the LLMs to an on-premise environment, where privacy and security can be controlled. In this paper, we study the process of porting a real-life application that uses ChatGPT, running in a public cloud, to an on-premise environment. The application being ported is AIPA, a system that leverages LLMs and sophisticated data analytics to enhance the assessment of procurement call bids. The main considerations in the porting process are the transparency of open-source models and the cost of hardware, which are central design choices of the on-premise environment. In addition to presenting the porting process, we evaluate the downsides and benefits associated with porting.
Problem

Research questions and friction points this paper is trying to address.

Porting LLM applications from the cloud to on-premise environments for data privacy
Addressing challenges in open-source model transparency and hardware costs
Evaluating trade-offs in migrating ChatGPT-based systems to on-premise deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Porting an LLM application from a cloud to an on-premise environment
Ensuring control over data privacy and security
Evaluating open-source models and hardware costs
Teemu Paloniemi
University of Jyvaskyla, Jyvaskyla, Finland
Manu Setala
Unknown affiliation
T. Mikkonen
University of Jyvaskyla, Jyvaskyla, Finland