A RAG-based Question Answering System Proposal for Understanding Islam: MufassirQAS LLM

📅 2024-01-27

🏛️ Social Science Research Network

📈 Citations: 8

✨ Influential: 0

career value

205K/year

🤖 AI Summary

To address frequent issues—including misinterpretation, offense, hallucination, and lack of verifiable sourcing—in religious (particularly Islamic) question-answering, this paper proposes MufassirQAS, a trustworthy religious QA system. It constructs a fine-grained vector database (using FAISS/Chroma) from Turkish-language Islamic canonical texts and integrates religion-aware prompt engineering with precise, multi-source textual citations (including page numbers and entry identifiers) within a RAG framework to enhance the factual accuracy and interpretability of LLMs (e.g., Llama and GPT series). Its key innovation lies in the first systematic integration of religious semantic constraints, transparent retrieval evidence, and fine-grained bibliographic provenance into the RAG pipeline. Experiments demonstrate that MufassirQAS achieves 100% citation traceability and zero offensive outputs on sensitive religious queries, significantly outperforming ChatGPT in both factual accuracy and user trust.

Technology Category

Application Category

📝 Abstract

Challenges exist in learning and understanding religions, such as the complexity and depth of religious doctrines and teachings. Chatbots as question-answering systems can help in solving these challenges. LLM chatbots use NLP techniques to establish connections between topics and accurately respond to complex questions. These capabilities make it perfect for enlightenment on religion as a question-answering chatbot. However, LLMs also tend to generate false information, known as hallucination. Also, the chatbots' responses can include content that insults personal religious beliefs, interfaith conflicts, and controversial or sensitive topics. It must avoid such cases without promoting hate speech or offending certain groups of people or their beliefs. This study uses a vector database-based Retrieval Augmented Generation (RAG) approach to enhance the accuracy and transparency of LLMs. Our question-answering system is called"MufassirQAS". We created a database consisting of several open-access books that include Turkish context. These books contain Turkish translations and interpretations of Islam. This database is utilized to answer religion-related questions and ensure our answers are trustworthy. The relevant part of the dataset, which LLM also uses, is presented along with the answer. We have put careful effort into creating system prompts that give instructions to prevent harmful, offensive, or disrespectful responses to respect people's values and provide reliable results. The system answers and shares additional information, such as the page number from the respective book and the articles referenced for obtaining the information. MufassirQAS and ChatGPT are also tested with sensitive questions. We got better performance with our system. Study and enhancements are still in progress. Results and future works are given.

Problem

Research questions and friction points this paper is trying to address.

Addresses challenges in understanding complex religious doctrines using chatbots.

Mitigates LLM hallucination and offensive content in religious question-answering systems.

Enhances accuracy and transparency in Islamic teachings via RAG-based MufassirQAS.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Retrieval Augmented Generation (RAG) for accuracy

Incorporates Turkish Islamic texts for context

Prevents offensive responses with system prompts

🔎 Similar Papers

No similar papers found.