Privacy-Preserving Local Language Models for Longitudinal Data Retrieval in Chronic Dermatologic Disease: Implementation in Pemphigus Patients

📅 2026-05-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenge of efficiently reviewing extensive longitudinal clinical records in chronic dermatological conditions such as pemphigus, where critical information is easily overlooked because of the sheer volume of data. In a real-world clinical setting, a locally deployed small language model (Qwen3-4B Thinking 2507) was used to extract structured information for 56 expert-defined clinical features from the longitudinal records of 30 patients (89,336 words in total) and to generate concise summaries. The model achieved a mean accuracy of 82.25% across 1,680 feature extraction tasks. Dermatologists rated the AI-generated summaries highly, with scores ranging from 7.93 to 8.50 for overall quality, clinical accuracy, and usefulness, and preferred the AI summaries in 53.3% of evaluations. These results suggest that privacy-preserving, locally deployed small language models can deliver useful accuracy and practical value in clinical decision support.

📝 Abstract

Chronic dermatologic diseases such as pemphigus require long-term follow-up, generating extensive longitudinal clinical documentation that is difficult to review comprehensively during routine visits. This increases clinician workload as well as the risk of missing critical historical information. We evaluated whether a locally deployed, privacy-preserving small language model (SLM) could retrieve structured clinical features and generate longitudinal summaries from long-term dermatology follow-up records. In this retrospective case series, thirty pemphigus patients contributed 541 visit notes that were aggregated into full longitudinal records (89,336 words); 56 clinically relevant features were annotated by two expert dermatologists. The locally deployed SLM (Qwen3 4B Thinking 2507) was queried with each complete record to retrieve the 56 features and generate one final summary report. Across 1,680 feature retrieval tasks, mean accuracy was 82.25%. Dermatologists' ratings of AI-generated summaries were high for overall quality (8.23-8.47), clinical accuracy (7.93-8.20), and usefulness (8.47-8.50), with no significant inter-evaluator differences and an overall preference for AI summaries in 53.3% of evaluations. These findings suggest that privacy-preserving, locally deployed SLMs can accurately retrieve structured clinical features and reliably generate clinically meaningful longitudinal summaries. SLMs may support clinical decision-making when integrated with appropriate oversight.
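
The task count in the abstract follows from the study design: 30 patients times 56 features gives 1,680 retrieval tasks. The sketch below illustrates that arithmetic and one plausible way to score retrieval against expert annotations. It is a minimal illustration, not the authors' evaluation code: the function names, the exact-match scoring rule, and the toy feature names (`mucosal_involvement`, `rituximab`) are all assumptions introduced here, and the toy data is invented purely to exercise the function.

```python
from typing import Dict, List

N_PATIENTS = 30   # from the abstract
N_FEATURES = 56   # from the abstract


def count_tasks(n_patients: int, n_features: int) -> int:
    """One retrieval task per (patient, feature) pair."""
    return n_patients * n_features


def accuracy(predicted: List[Dict[str, str]],
             reference: List[Dict[str, str]]) -> float:
    """Fraction of (patient, feature) pairs where the model's answer
    exactly matches the expert annotation (assumed scoring rule)."""
    total = 0
    correct = 0
    for pred, ref in zip(predicted, reference):
        for feature, expected in ref.items():
            total += 1
            if pred.get(feature) == expected:
                correct += 1
    return correct / total


# Hypothetical toy data: two patients, two features each.
reference = [
    {"mucosal_involvement": "yes", "rituximab": "no"},
    {"mucosal_involvement": "no", "rituximab": "yes"},
]
predicted = [
    {"mucosal_involvement": "yes", "rituximab": "no"},
    {"mucosal_involvement": "no", "rituximab": "no"},
]

print(count_tasks(N_PATIENTS, N_FEATURES))  # 1680
print(accuracy(predicted, reference))       # 0.75
```

Because every patient is scored on the same number of features, this pooled fraction equals the mean of per-patient accuracies; the abstract does not say which of the two the reported 82.25% refers to.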

Problem

Research questions and friction points this paper is trying to address.

longitudinal data retrieval

chronic dermatologic disease

clinical documentation overload

privacy-preserving

pemphigus

Innovation

Methods, ideas, or system contributions that make the work stand out.

privacy-preserving

local language model

longitudinal data retrieval