🤖 AI Summary
This study addresses the non-random coarsening in RNA-seq count data that arises from ambiguous read-to-gene alignments and violates the conventional ignorability assumption. The authors represent such ambiguity as granular counts, that is, fuzzy-valued observations of latent discrete quantities, and study a class of fuzzy-reporting mechanisms. They show that, when reporting exploits graded membership, ignorability fails generically, so the data follow a coarsening-not-at-random structure. A hierarchical model is introduced as a tractable instance of this construction and illustrated on RNA-seq data, offering a principled way to account for alignment uncertainty in transcriptomic analyses.
📝 Abstract
RNA-seq count data are often affected by read-to-gene alignment ambiguity, especially in high-dimensional transcriptomics. This type of ambiguity can be conveniently expressed through granular counts, namely fuzzy-valued observations of latent discrete quantities. We study a class of fuzzy-reporting mechanisms and show that, when reporting exploits graded membership, ignorability fails generically, leading to a coarsening-not-at-random structure. A hierarchical model is then introduced as a tractable instance of this construction and illustrated using RNA-seq data.
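To make the ignorability claim concrete, the following is a minimal sketch of the usual coarsening-at-random (CAR) condition in a crisp-set analogue of the setting described above. It is not the authors' fuzzy-reporting mechanism or their hierarchical model: the function `is_car`, the two toy mechanisms, and all probabilities are hypothetical and chosen only for illustration. A mechanism is CAR when the probability of reporting a set A is the same for every latent count x contained in A; if that probability varies with x, the mechanism is coarsening-not-at-random.

```python
from math import isclose


def is_car(mechanism, tol=1e-12):
    """Check the coarsening-at-random condition for a discrete mechanism.

    `mechanism` maps each latent count x to a dict {reported_set: probability}.
    CAR holds when P(report A | X = x) is identical for every x in A.
    """
    reported_sets = {A for dist in mechanism.values() for A in dist}
    for A in reported_sets:
        # Probability of reporting A under each latent value compatible with A.
        probs = [mechanism[x].get(A, 0.0) for x in A]
        if not all(isclose(p, probs[0], abs_tol=tol) for p in probs):
            return False
    return True


def exact(x):
    """Singleton set: the count is reported without ambiguity."""
    return frozenset({x})


# Hypothetical ambiguous report: the latent count is either 3 or 4.
pair = frozenset({3, 4})

# Reporting probability of the coarse set does not depend on the latent count.
car_mechanism = {
    3: {pair: 0.5, exact(3): 0.5},
    4: {pair: 0.5, exact(4): 0.5},
}

# Reporting probability of the coarse set depends on the latent count,
# a crude stand-in for reporting that depends on graded membership.
cnar_mechanism = {
    3: {pair: 0.2, exact(3): 0.8},
    4: {pair: 0.7, exact(4): 0.3},
}

print(is_car(car_mechanism))
print(is_car(cnar_mechanism))
```

In the second mechanism the coarse report itself carries information about the latent count, which is why the reporting process cannot be ignored when forming the likelihood.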