Investigating the Role of LLMs Hyperparameter Tuning and Prompt Engineering to Support Domain Modeling

📅 2025-07-19

📈 Citations: 0

✨ Influential: 0

career value

204K/year

🤖 AI Summary

To address the limited accuracy of general-purpose large language models (LLMs) in domain-specific modeling, this paper proposes a lightweight, fine-tuning-free optimization framework tailored for Llama 3.1 to enhance its capability in automatically generating domain models—particularly medical data models—from natural language descriptions. Methodologically, the framework integrates search-based hyperparameter tuning with structured prompt engineering: it systematically optimizes inference parameters (e.g., temperature, top-p, max_tokens) and employs stepwise, domain-aware prompt templates to improve output controllability and semantic consistency. By avoiding parameter updates, the approach eliminates the computational overhead and catastrophic forgetting associated with fine-tuning. Experiments across ten heterogeneous domains demonstrate substantial improvements, especially in healthcare (+23.6% F1 over baselines), alongside robust cross-domain generalization—validating both effectiveness and practical applicability.

Technology Category

Application Category

📝 Abstract

The introduction of large language models (LLMs) has enhanced automation in software engineering tasks, including in Model Driven Engineering (MDE). However, using general-purpose LLMs for domain modeling has its limitations. One approach is to adopt fine-tuned models, but this requires significant computational resources and can lead to issues like catastrophic forgetting. This paper explores how hyperparameter tuning and prompt engineering can improve the accuracy of the Llama 3.1 model for generating domain models from textual descriptions. We use search-based methods to tune hyperparameters for a specific medical data model, resulting in a notable quality improvement over the baseline LLM. We then test the optimized hyperparameters across ten diverse application domains. While the solutions were not universally applicable, we demonstrate that combining hyperparameter tuning with prompt engineering can enhance results across nearly all examined domain models.

Problem

Research questions and friction points this paper is trying to address.

Improving domain model accuracy with hyperparameter tuning

Optimizing LLMs for specific medical data modeling

Enhancing diverse domain models via prompt engineering

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hyperparameter tuning improves Llama 3.1 accuracy

Prompt engineering enhances domain model generation

Search-based methods optimize medical data modeling

🔎 Similar Papers

No similar papers found.