🤖 AI Summary
To address the challenge of long-context modeling in biomedical text, this paper introduces the first clinical-text-optimized efficient long-context encoder. Methodologically, the authors jointly pretrain the model on PubMed, MIMIC-IV, and biomedical ontologies, and, for the first time, deeply adapt Rotary Position Embedding (RoPE), Flash Attention, and 8K context extension to the biomedical domain within a Transformer architecture, enabling synergistic modeling of long-range semantics and domain-specific medical concepts. Experiments demonstrate significant improvements over state-of-the-art methods across multiple clinical NLP benchmarks. Analysis of the pretrained weights confirms enhanced cross-sentence semantic capture and more precise alignment of medical entities and relations. The resulting encoder provides a transferable, domain-specialized representation foundation for long-document understanding and clinical reasoning tasks.
📝 Abstract
We introduce Clinical ModernBERT, a transformer-based encoder pretrained on large-scale biomedical literature, clinical notes, and medical ontologies, incorporating PubMed abstracts, MIMIC-IV clinical data, and medical codes with their textual descriptions. Building on ModernBERT, the current state-of-the-art natural language text encoder featuring architectural upgrades such as rotary positional embeddings (RoPE), Flash Attention, and an extended context length of up to 8,192 tokens, our model adapts these innovations specifically for biomedical and clinical domains. Clinical ModernBERT excels at producing semantically rich representations tailored for long-context tasks. We validate this both by analyzing its pretrained weights and through empirical evaluation on a comprehensive suite of clinical NLP benchmarks.
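Of the architectural upgrades named above, rotary positional embeddings (RoPE) are the key ingredient for extending context length: instead of adding learned position vectors, each query/key feature pair is rotated by an angle proportional to the token's position, so attention scores depend only on relative offsets. The sketch below illustrates the mechanism in plain NumPy; it is a minimal, standalone illustration of standard RoPE (the function name `rope` and the shapes are our own choices), not the paper's implementation.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply Rotary Position Embedding to query/key vectors.

    x: (seq_len, dim) array, dim must be even.
    positions: (seq_len,) integer token positions.
    Each consecutive feature pair (x[2i], x[2i+1]) is rotated by
    angle position * base**(-2i/dim); dot products between rotated
    queries and keys then depend only on the relative position.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies: theta_i = base^(-2i/dim)
    freqs = base ** (-2.0 * np.arange(half) / dim)
    angles = positions[:, None] * freqs[None, :]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because the rotation is a function of absolute position but cancels to a relative rotation inside the attention dot product, the same embedding generalizes to longer sequences than those seen in pretraining, which is what makes context extension to 8,192 tokens practical.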