🤖 AI Summary
To address the challenge of long-context modeling in biomedical text, this paper introduces the first clinical-text-optimized efficient long-context encoder. Methodologically, the authors jointly pretrain the model on PubMed, MIMIC-IV, and biomedical ontologies, and, for the first time, deeply adapt Rotary Position Embedding (RoPE), Flash Attention, and 8K context extension to the biomedical domain within a Transformer architecture, enabling synergistic modeling of long-range semantics and domain-specific medical concepts. Experiments demonstrate significant improvements over state-of-the-art methods across multiple clinical NLP benchmarks. Analysis of the pretrained weights confirms enhanced cross-sentence semantic capture and more precise alignment of medical entities and relations. The resulting encoder provides a transferable, domain-specialized representation foundation for long-document understanding and clinical reasoning tasks.
📝 Abstract
We introduce Clinical ModernBERT, a transformer-based encoder pretrained on large-scale biomedical literature, clinical notes, and medical ontologies, incorporating PubMed abstracts, MIMIC-IV clinical data, and medical codes with their textual descriptions. Building on ModernBERT, the current state-of-the-art natural language text encoder featuring architectural upgrades such as rotary positional embeddings (RoPE), Flash Attention, and an extended context length of up to 8,192 tokens, our model adapts these innovations specifically for biomedical and clinical domains. Clinical ModernBERT excels at producing semantically rich representations tailored for long-context tasks. We validate this both by analyzing its pretrained weights and through empirical evaluation on a comprehensive suite of clinical NLP benchmarks.
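Of the architectural upgrades named above, rotary positional embeddings (RoPE) are the key ingredient for extending context length: instead of adding learned position vectors, each query/key feature pair is rotated by an angle proportional to the token's position, so attention scores depend only on relative offsets. The sketch below illustrates the mechanism in plain NumPy; it is a minimal, standalone illustration of standard RoPE (the function name `rope` and the shapes are our own choices), not the paper's implementation.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply Rotary Position Embedding to query/key vectors.

    x: (seq_len, dim) array, dim must be even.
    positions: (seq_len,) integer token positions.
    Each consecutive feature pair (x[2i], x[2i+1]) is rotated by
    angle position * base**(-2i/dim); dot products between rotated
    queries and keys then depend only on the relative position.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies: theta_i = base^(-2i/dim)
    freqs = base ** (-2.0 * np.arange(half) / dim)
    angles = positions[:, None] * freqs[None, :]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because the rotation is a function of absolute position but cancels to a relative rotation inside the attention dot product, the same embedding generalizes to longer sequences than those seen in pretraining, which is what makes context extension to 8,192 tokens practical.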