Science Out of Its Ivory Tower: Improving Accessibility with Reinforcement Learning

📅 2024-10-22
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Scholarly abstracts are hard for the general public to read because of dense terminology and syntactically complex sentences, which limits how widely research findings spread. To address this, we propose a reinforcement learning approach that fine-tunes a language model to rewrite abstracts in plainer language, guided by a balanced combination of word-level (term simplification) and sentence-level (sentence clarity) accessibility rewards. Our best model lowers the readability level of abstracts by approximately six U.S. grade levels (from a postgraduate to a high school level), roughly a 90% relative improvement over the supervised fine-tuning baseline. Models trained with supervised fine-tuning alone or guided by conventional readability measures struggle to replace technical terms with simpler alternatives. The rewritten abstracts remain factually accurate and linguistically sound. The method offers a quantitatively evaluable way to make scientific communication more accessible.

📝 Abstract
A vast amount of scholarly work is published daily, yet much of it remains inaccessible to the general public due to dense jargon and complex language. To address this challenge in science communication, we introduce a reinforcement learning framework that fine-tunes a language model to rewrite scholarly abstracts into more comprehensible versions. Guided by a carefully balanced combination of word- and sentence-level accessibility rewards, our language model effectively replaces technical terms with more accessible alternatives, a task that models trained with supervised fine-tuning or guided by conventional readability measures struggle to accomplish. Our best model adjusts the readability level of scholarly abstracts by approximately six U.S. grade levels, that is, from a postgraduate to a high school level. This translates to roughly a 90% relative boost over the supervised fine-tuning baseline, all while maintaining factual accuracy and high-quality language. An in-depth analysis of our approach shows that balanced rewards lead to systematic modifications in the base model, likely contributing to smoother optimization and superior performance. We envision this work as a step toward bridging the gap between scholarly research and the general public, particularly younger readers and those without a college degree.
Problem

Research questions and friction points this paper is trying to address.

Simplifying scholarly abstracts for public accessibility
Replacing technical terms with easier alternatives
Bridging gap between research and non-expert readers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning for text simplification
Balanced word and sentence rewards
Maintains accuracy while improving readability
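
The idea of balancing word-level and sentence-level rewards can be illustrated with a toy scoring function. This is only a minimal sketch under stated assumptions, not the reward used in the paper: the proxies (average word length, average sentence length), the caps `max_len` and `max_words`, and the names `word_reward`, `sentence_reward`, and `accessibility_reward` are all hypothetical and chosen for illustration.

```python
import re


def tokenize(text):
    """Lowercase alphabetic tokens; a deliberately crude tokenizer."""
    return re.findall(r"[A-Za-z']+", text.lower())


def split_sentences(text):
    """Split on whitespace that follows ., ! or ? (a rough heuristic)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def word_reward(text, max_len=12.0):
    """Hypothetical word-level proxy: shorter average word length scores higher."""
    words = tokenize(text)
    if not words:
        return 0.0
    avg_len = sum(len(w) for w in words) / len(words)
    return max(0.0, 1.0 - avg_len / max_len)


def sentence_reward(text, max_words=40.0):
    """Hypothetical sentence-level proxy: shorter average sentences score higher."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    avg_words = sum(len(tokenize(s)) for s in sentences) / len(sentences)
    return max(0.0, 1.0 - avg_words / max_words)


def accessibility_reward(text, word_weight=0.5):
    """Weighted mix of the two proxies; word_weight=0.5 weights them equally."""
    return (word_weight * word_reward(text)
            + (1.0 - word_weight) * sentence_reward(text))


if __name__ == "__main__":
    dense = ("We operationalize heterogeneous multimodal representations "
             "via hierarchical probabilistic inference.")
    plain = "We mix many kinds of data. Then we guess what fits best."
    print(accessibility_reward(dense), accessibility_reward(plain))
```

In this sketch, a sentence-only score would barely separate the two example texts because both use short sentences. The word-level term is what penalizes the jargon-heavy version, which is the kind of effect a combined reward is meant to capture.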
Haining Wang
Indiana University, Bloomington, Indiana, USA
Jason Clark
Montana State University, Bozeman, Montana, USA
Hannah McKelvey
Montana State University, Bozeman, Montana, USA
Leila Sterman
Montana State University, Bozeman, Montana, USA
Zheng Gao
Ant Group, Sunnyvale, California, USA
Zuoyu Tian
Macalester College
Computational linguistics, Language variation and change, Computational social science
Sandra Kübler
Indiana University, Bloomington, Indiana, USA
Xiaozhong Liu
School of Informatics and Computing, Indiana University Bloomington
Information Retrieval, Natural Language Processing, Digital Library, Semantic Web, Metadata