Classification of kinetic-related injury in hospital triage data using NLP

📅 2025-09-05
🤖 AI Summary
This study addresses the classification of kinetic-related injuries in hospital triage notes under three constraints: privacy-sensitive data that must be analysed on site, costly expert annotation, and limited compute hardware. It proposes a two-stage fine-tuning pipeline: first, a pre-trained large language model with a classifier is fine-tuned on a small (2k) open-source dataset using a GPU; second, the model is further fine-tuned on a hospital-specific dataset of 1,000 samples using only a CPU. The approach reduces reliance on high-end hardware and on large expert-labelled datasets, which lowers the barrier to deployment. The authors report that, with carefully curated datasets and existing models and open-source data, triage data can be classified successfully with limited compute resources while the hospital data stays on site.

📝 Abstract
Triage notes, created at the start of a patient's hospital visit, contain a wealth of information that can help medical staff and researchers understand Emergency Department patient epidemiology and the degree of time-dependent illness or injury. Unfortunately, applying modern Natural Language Processing and Machine Learning techniques to analyse triage data faces several challenges: firstly, hospital data contains highly sensitive information that is subject to privacy regulation and thus needs to be analysed on site; secondly, most hospitals and medical facilities lack the hardware necessary to fine-tune a Large Language Model (LLM), much less train one from scratch; lastly, identifying the records of interest requires expert input to manually label the datasets, which can be time-consuming and costly. We present in this paper a pipeline that enables the classification of triage data using an LLM and limited compute resources. We first fine-tuned a pre-trained LLM with a classifier using a small (2k) open-source dataset on a GPU, and then further fine-tuned the model with a hospital-specific dataset of 1000 samples on a CPU. We demonstrated that by carefully curating the datasets and leveraging existing models and open-source data, we can successfully classify triage data with limited compute resources.
Problem

Research questions and friction points this paper is trying to address.

Classifying kinetic-related injury in triage notes
Overcoming privacy and hardware constraints for NLP
Reducing expert labeling costs with limited resources
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned pre-trained LLM with classifier
Used small open dataset on GPU first
Further fine-tuned with hospital data on CPU
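The two-stage idea listed above (fine-tune on open data first, then continue from those weights on local hospital data) can be sketched in a few lines. This is only a toy illustration, not the authors' implementation: a bag-of-words logistic regression stands in for the pre-trained LLM and classifier, and the vocabulary, example notes, labels, and hyperparameters are all hypothetical. The point it shows is the warm start: stage 2 begins from the stage-1 parameters rather than from scratch.

```python
import math
import random

# Hypothetical vocabulary; a real pipeline would use the LLM's own tokenizer.
VOCAB = ["fell", "crash", "mva", "assault", "fever", "cough", "rash"]

def featurize(text, vocab=VOCAB):
    """Bag-of-words counts (a stand-in for LLM text representations)."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(weights, bias, features):
    return sum(w * x for w, x in zip(weights, features)) + bias

def fine_tune(weights, bias, data, epochs=50, lr=0.5, seed=0):
    """Continue SGD on the logistic loss, starting from the given parameters."""
    rng = random.Random(seed)
    w, b = list(weights), bias
    rows = [(featurize(text), label) for text, label in data]
    for _ in range(epochs):
        rng.shuffle(rows)
        for x, y in rows:
            grad = sigmoid(score(w, b, x)) - y  # d(loss)/d(score)
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b

def predict(weights, bias, text):
    """Return 1 for a kinetic-related injury, 0 otherwise."""
    return 1 if score(weights, bias, featurize(text)) >= 0.0 else 0

# Hypothetical "open" dataset used for stage 1 (label 1 = kinetic-related).
open_data = [
    ("fell from ladder and hurt wrist", 1),
    ("car crash with neck pain", 1),
    ("fever and cough for three days", 0),
    ("itchy rash on both arms", 0),
]

# Hypothetical hospital notes using local shorthand, used for stage 2.
hospital_data = [
    ("mva driver c/o neck pain", 1),
    ("assault punched to face", 1),
    ("fever cough sob", 0),
    ("rash to trunk", 0),
]

# Stage 1: start from zero weights and train on the open data.
w1, b1 = fine_tune([0.0] * len(VOCAB), 0.0, open_data)

# Stage 2: warm-start from the stage-1 parameters on the hospital data.
w2, b2 = fine_tune(w1, b1, hospital_data)

for text, label in hospital_data:
    print(text, "->", predict(w2, b2, text), "(label", str(label) + ")")
```

In this toy setup, stage 1 never sees the shorthand "mva", so its weight stays at zero until stage 2 adapts it, while the weights learned from the open data carry over unchanged.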
Midhun Shyam
School of Computer, Data and Mathematical Sciences, Western Sydney University, Victoria Rd, 2116, NSW, Australia
Jim Basilakis
Western Sydney University
Health Informatics, Security and Applied Cryptography, Clinical Decision Support, and Telehealth
Kieran Luken
School of Computer, Data and Mathematical Sciences, Western Sydney University, Victoria Rd, 2116, NSW, Australia
Steven Thomas
South Western Emergency Research Institute, Ingham Institute of Applied Medical Research, 1 Campbell St, Liverpool, 2170, NSW, Australia
John Crozier
Liverpool Hospital, Liverpool, 2170 NSW, Australia
Paul M. Middleton
South Western Emergency Research Institute, Ingham Institute of Applied Medical Research, 1 Campbell St, Liverpool, 2170, NSW, Australia
X. Rosalind Wang
School of Computer, Data and Mathematical Sciences, Western Sydney University, Victoria Rd, 2116, NSW, Australia; South Western Emergency Research Institute, Ingham Institute of Applied Medical Research, 1 Campbell St, Liverpool, 2170, NSW, Australia