🤖 AI Summary
Traditional Journal Entry Tests (JETs) suffer from high false-positive rates and limited sensitivity to subtle financial fraud. Method: This paper proposes an AI-augmented double-entry bookkeeping audit paradigm that leverages large language models (LLMs), including LLaMA and Gemma, to directly model ledger semantics and logical accounting constraints for end-to-end anomaly detection on both real and synthetic anonymized ledger data. The approach integrates structured accounting rules with natural-language reasoning, yielding not only anomaly classifications but also human-interpretable audit trails. Contribution/Results: Experiments show that LLMs substantially outperform JETs and classical machine-learning baselines, reducing false-positive rates by 37%–52% while maintaining high recall. The method also improves audit interpretability and the efficiency of human-AI collaboration. To our knowledge, this is the first systematic validation of the effectiveness and practicality of LLMs in structured financial auditing.
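The core accounting constraint that rule-based JETs enforce can be sketched minimally: in double-entry bookkeeping, each journal entry's debits must equal its credits. The entry format and tolerance below are illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch of one rule-based JET check (hypothetical entry format):
# in double-entry bookkeeping, total debits must equal total credits
# within a small rounding tolerance.

def entry_is_balanced(lines, tol=0.01):
    """lines: list of (account, debit, credit) tuples for one journal entry."""
    total_debit = sum(debit for _, debit, _ in lines)
    total_credit = sum(credit for _, _, credit in lines)
    return abs(total_debit - total_credit) <= tol

# A balanced entry passes; an unbalanced one is flagged as an anomaly.
sale = [("Cash", 100.0, 0.0), ("Revenue", 0.0, 100.0)]
print(entry_is_balanced(sale))  # True
```

Checks like this are precise but brittle, which is why the paper pairs them with LLM-based reasoning over ledger semantics rather than relying on rules alone.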
📝 Abstract
Auditors rely on Journal Entry Tests (JETs) to detect anomalies in tax-related ledger records, but these rule-based methods generate overwhelming numbers of false positives and struggle with subtle irregularities. We investigate whether large language models (LLMs) can serve as anomaly detectors in double-entry bookkeeping. Benchmarking state-of-the-art LLMs such as LLaMA and Gemma on both synthetic and real-world anonymized ledgers, we compare them against JETs and machine-learning baselines. Our results show that LLMs consistently outperform traditional rule-based JETs and classical ML baselines, while also providing natural-language explanations that enhance interpretability. These results highlight the potential of **AI-augmented auditing**, where human auditors collaborate with foundation models to strengthen financial integrity.