Privacy-Preserving Local Language Models for Longitudinal Data Retrieval in Chronic Dermatologic Disease: Implementation in Pemphigus Patients

📅 2026-05-24

🤖 AI Summary
This study addresses the difficulty of reviewing extensive longitudinal clinical records in chronic dermatologic conditions such as pemphigus, where critical information is easily overlooked because of data volume. In a real-world clinical setting, a locally deployed small language model (Qwen3-4B Thinking 2507) was used to extract 56 expert-defined clinical features from the longitudinal records of 30 patients (89,336 words in total) and to generate concise summaries. The model achieved a mean accuracy of 82.25% across 1,680 feature extraction tasks. Clinicians rated the AI-generated summaries at about 8 on average (7.93 to 8.50) for quality, clinical accuracy, and usefulness, and preferred the AI summaries in 53.3% of evaluations. These results suggest that privacy-preserving, locally deployed small language models can offer good accuracy and practical value for clinical decision support.
📝 Abstract
Chronic dermatologic diseases such as pemphigus require long-term follow-up, which generates extensive longitudinal clinical documentation. This documentation is difficult to review comprehensively during routine visits, increasing clinician workload as well as the risk of missing critical historical information. We evaluated whether a locally deployed, privacy-preserving small language model (SLM) could retrieve structured clinical features and generate longitudinal summaries from long-term dermatology follow-up records. In this retrospective case series, thirty pemphigus patients contributed 541 visit notes that were aggregated into full longitudinal records (89,336 words); 56 clinically relevant features were annotated by two expert dermatologists. The locally deployed SLM (Qwen3 4B Thinking 2507) was queried with each complete record to retrieve the 56 features and generate one final summary report. Across 1,680 feature retrieval tasks, mean accuracy was 82.25%. Dermatologists' ratings of AI-generated summaries were high for overall quality (8.23-8.47), clinical accuracy (7.93-8.20), and usefulness (8.47-8.50), with no significant inter-evaluator differences and an overall preference for AI summaries in 53.3% of evaluations. These findings suggest that privacy-preserving, locally deployed SLMs can accurately retrieve structured clinical features and reliably generate clinically meaningful longitudinal summaries. SLMs may support clinical decision-making when integrated with appropriate oversight.
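The task count in the abstract follows from the study design: 30 patient records, each queried for 56 features, gives 1,680 retrieval tasks. The sketch below reproduces that arithmetic and shows one plausible way to score a grid of retrieval results. The abstract does not state how the 82.25% mean was aggregated (per task, per feature, or per patient), so the pooled per-task average used here is an assumption, and the function names and the toy grid are hypothetical.

```python
N_PATIENTS = 30   # from the abstract
N_FEATURES = 56   # from the abstract


def count_tasks(n_patients: int, n_features: int) -> int:
    """Each patient record is queried once per feature."""
    return n_patients * n_features


def pooled_accuracy(correct: list[list[bool]]) -> float:
    """Fraction of (patient, feature) cells where the model matched the annotation.

    Assumption: every task is weighted equally; the paper may aggregate differently.
    """
    flat = [cell for row in correct for cell in row]
    return sum(flat) / len(flat)


# Toy 3-patient x 4-feature grid; values are invented for illustration only.
toy = [
    [True, True, False, True],
    [True, False, True, True],
    [True, True, True, True],
]

total_tasks = count_tasks(N_PATIENTS, N_FEATURES)
toy_accuracy = pooled_accuracy(toy)

print(total_tasks)             # 1680
print(f"{toy_accuracy:.2%}")   # 83.33%
```

With real data, each row would hold one patient's 56 match/mismatch outcomes against the dermatologists' annotations.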
Problem

Research questions and friction points this paper is trying to address.

longitudinal data retrieval
chronic dermatologic disease
clinical documentation overload
privacy-preserving
pemphigus
Innovation

Methods, ideas, or system contributions that make the work stand out.

privacy-preserving
local language model
longitudinal data retrieval
small language model (SLM)
clinical summarization
Abdurrahim Yilmaz
Imperial College London
Deep Learning, AI for Dermatology, Microrobotics
Ayşe Esra Koku Aksu
Department of Dermatology and Venereology, Istanbul Research and Training Hospital
Duygu Yamen
Department of Dermatology and Venereology, Istanbul Research and Training Hospital
Vefa Asli Erdemir
Department of Dermatology and Venereology, Istanbul Medeniyet University
Mehmet Salih Gurel
Department of Dermatology and Venereology, Istanbul Medeniyet University
Gulsum Gencoglan
Department of Dermatology and Venereology, Istanbul Medicana Atakoy Hospital
Joram M. Posma
Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London
Burak Temelkuran
Imperial College London