🤖 AI Summary
Large language models (LLMs) suffer from hallucination, lack of interpretability, and difficulty in formal verification—critical limitations for building reliable expert systems.
Method: We propose a tightly integrated neuro-symbolic architecture that combines generative AI with symbolic reasoning. Using structured prompt engineering, LLMs (e.g., Claude Sonnet 3.7, GPT-4.1) extract domain-specific knowledge within constrained scopes and automatically compile it into executable, verifiable Prolog rules; human-in-the-loop validation ensures knowledge fidelity.
Contribution/Results: Our approach synergistically unites LLMs’ high recall with symbolic systems’ determinism, transparency, and formal verifiability—yielding high factual accuracy, interpretable and traceable inference, and controllable, correctable system behavior. Experiments demonstrate strong factual precision, semantic coherence, scalability, and suitability for deployment in safety-critical or sensitive domains.
📝 Abstract
The development of large language models (LLMs) has successfully transformed knowledge-based systems such as open-domain question answering, which can automatically produce vast amounts of seemingly coherent information. Yet these models have several drawbacks, such as hallucinations, i.e., the confident generation of incorrect or unverifiable facts. In this paper, we introduce a new approach to developing expert systems that uses LLMs in a controlled and transparent way. By limiting the domain and employing a well-structured prompt-based extraction approach, we produce a symbolic representation of knowledge in Prolog, which can be validated and corrected by human experts. This approach also guarantees the interpretability, scalability, and reliability of the resulting expert systems. Through quantitative and qualitative experiments with Claude Sonnet 3.7 and GPT-4.1, we show strong factual adherence and semantic coherence on our generated knowledge bases. We present a transparent hybrid solution that combines the recall capacity of LLMs with the precision of symbolic systems, thereby laying the foundation for dependable AI applications in sensitive domains.
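The core pipeline sketched in the abstract, extracting structured knowledge from an LLM and compiling it into verifiable Prolog facts, can be illustrated with a minimal sketch. Note this is our own illustrative assumption of what the compilation step might look like (the triple format, the `to_atom`/`compile_triples` helpers, and the example domain are hypothetical, not taken from the paper); the LLM extraction itself is replaced here by hard-coded triples.

```python
# Illustrative (hypothetical) compilation of LLM-extracted triples into Prolog facts.
import re

ATOM = re.compile(r"^[a-z][a-zA-Z0-9_]*$")  # valid unquoted Prolog atom

def to_atom(text: str) -> str:
    """Normalize free text from the LLM into a Prolog atom, or fail loudly."""
    atom = re.sub(r"[^a-zA-Z0-9_]+", "_", text.strip().lower()).strip("_")
    if not ATOM.match(atom):
        raise ValueError(f"cannot normalize {text!r} into a Prolog atom")
    return atom

def compile_triples(triples):
    """Compile (subject, relation, object) triples into Prolog fact strings.

    Facts that fail normalization raise an error, so invalid LLM output
    is rejected rather than silently admitted into the knowledge base --
    the point at which a human expert would review and correct it.
    """
    return [f"{to_atom(rel)}({to_atom(subj)}, {to_atom(obj)})."
            for subj, rel, obj in triples]

# Example: triples as an LLM might extract them from a constrained domain.
triples = [
    ("Aspirin", "treats", "headache"),
    ("Aspirin", "contraindicated with", "peptic ulcer"),
]
for fact in compile_triples(triples):
    print(fact)
# prints:
# treats(aspirin, headache).
# contraindicated_with(aspirin, peptic_ulcer).
```

The resulting facts are plain Prolog text, so they can be loaded into any standard Prolog engine, queried deterministically, and diffed or audited by a human expert before deployment, which is the transparency and verifiability benefit the abstract claims.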