Feature leakage and the identifiability of direct-dependency entropy models of neural activity

📅 2026-06-01

🤖 AI Summary
This study addresses a limitation of maximum-entropy models of neural activity: when the sampled input distribution is biased, a restricted first-order fit can absorb higher-order effects into its first-order parameters, so good predictive performance is conflated with identification of the underlying computational rule. To separate distribution-dependent prediction from mechanistic identifiability, the authors introduce three diagnostics (state reweighting, conditional log-odds contrasts, and temporal leakage controls) and analyze restricted maximum-entropy fits as information projections, with an explicit coskewness form for sparse correlated binary inputs. Simulations show that purely higher-order responses can pass first-order tests under the original distribution yet are correctly classified after reweighting. Applied to selected, leakage-enriched local tables from CA1 hippocampal recordings, approximately half of the tables that appear first-order under empirical weights become distribution-sensitive under balanced reweighting, far above a matched additive-surrogate null. Direct entropy-explained fractions should therefore be read as predictions under the observed state distribution, not as evidence that mechanisms outside the direct model are absent.
📝 Abstract
Biological neurons receive thousands of synaptic inputs on branching, electrically excitable dendrites, yet population activity is often modeled with direct input-output rules in which each input contributes independently to a scalar drive. We study what successful prediction by such models does, and does not, reveal about neural computation. For conditional maximum-entropy models that match output rates and pairwise output-input coactivities, the entropy explained by a direct model is a prediction measure under the sampled input distribution, not a mechanism-identification test. A restricted MaxEnt fit is an information projection: omitted interaction, temporal, or hidden-state terms can be absorbed into fitted first-order parameters whenever they are correlated with the included sufficient statistics. For sparse correlated binary inputs, this absorption has an explicit coskewness form. We introduce diagnostics that separate in-distribution prediction from recovery of the response rule: state reweighting that holds P(y|x) fixed while changing P(x), conditional log-odds contrasts for local additivity, and temporal leakage controls. In ground-truth simulations, purely higher-order responses can pass first-order entropy and raw coactivity tests under leakage-prone sampling, but are correctly classified after reweighting. Applied to selected, leakage-enriched local tables from CA1 hippocampal recordings, approximately half of tables that appear first-order under empirical weights become distribution-sensitive under balanced reweighting, far above a matched additive-surrogate null. Thus direct entropy-explained fractions and raw coactivity predictions should be interpreted as predictions under the observed state distribution, not as evidence that mechanisms outside the direct model are absent or small.
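The two static diagnostics named in the abstract can be illustrated on a toy two-input table. The sketch below is not the authors' procedure; the table values, the weights in `empirical_w`, and all function names are hypothetical choices made for illustration. It relies on the standard fact that a weighted maximum-likelihood fit of a first-order logistic model matches the output rate and the output-input coactivities under the chosen weights. `log_odds_contrast` is the second difference of logit P(y=1|x) over the 2x2 table, which vanishes when the response is additive on the log-odds scale. `reweighting_shift` refits the first-order model with P(y|x) held fixed while P(x) is changed from the empirical weights to balanced weights, and reports how far the fitted parameters move.

```python
import math

STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    return math.log(p / (1.0 - p))


def log_odds_contrast(p_y):
    """Second difference of logit P(y=1|x) over the 2x2 input table.

    Zero when the response is additive on the log-odds scale.
    """
    return (logit(p_y[(1, 1)]) - logit(p_y[(1, 0)])
            - logit(p_y[(0, 1)]) + logit(p_y[(0, 0)]))


def fit_first_order(p_y, weights, steps=5000, lr=1.0):
    """Weighted ML fit of logit P(y=1|x) = a + b1*x1 + b2*x2.

    Plain gradient ascent on the weighted log-likelihood. At the optimum
    the model matches the output rate and the output-input coactivities
    under `weights`.
    """
    theta = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for x in STATES:
            phi = (1.0, float(x[0]), float(x[1]))
            pred = sigmoid(sum(t * f for t, f in zip(theta, phi)))
            resid = p_y[x] - pred
            for k in range(3):
                grad[k] += weights[x] * resid * phi[k]
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


def reweighting_shift(p_y, empirical_w):
    """Largest change in fitted first-order parameters when P(x) is
    switched from empirical to balanced weights, with P(y|x) held fixed."""
    balanced_w = {x: 0.25 for x in STATES}
    emp = fit_first_order(p_y, empirical_w)
    bal = fit_first_order(p_y, balanced_w)
    return max(abs(e - b) for e, b in zip(emp, bal))


# Hypothetical sparse, positively correlated input distribution.
empirical_w = {(0, 0): 0.7, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.1}

# Hypothetical response tables: one additive, one interaction-only.
additive = {x: sigmoid(-1.0 + 1.5 * x[0] + 0.5 * x[1]) for x in STATES}
interaction = {x: sigmoid(-1.0 + 2.0 * x[0] * x[1]) for x in STATES}

for name, table in [("additive", additive), ("interaction", interaction)]:
    print(name,
          round(log_odds_contrast(table), 3),
          round(reweighting_shift(table, empirical_w), 3))
```

For the additive table the first-order model is correctly specified, so the fitted parameters should not depend on the weights and the contrast should vanish up to rounding. For the interaction-only table the fit is an information projection onto the first-order family, and the projected parameters move when P(x) changes. Any threshold for calling a table "distribution-sensitive" is left open here, since the abstract does not state one.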
Problem

Research questions and friction points this paper is trying to address.

feature leakage
maximum entropy models
neural computation
input-output modeling
identifiability
Innovation

Methods, ideas, or system contributions that make the work stand out.

feature leakage
maximum entropy models
state reweighting
conditional log-odds
neural identifiability