🤖 AI Summary
Large-scale scientific measurement powered by AI often suffers from estimation bias and lacks statistical guarantees because the underlying models are imperfect. To address this, we propose an active measurement framework that combines AI-driven prediction with importance sampling, iteratively selecting high-information samples for human annotation in a human-in-the-loop feedback cycle. We design a novel weighted estimator and a Monte Carlo–based method for constructing confidence intervals, yielding unbiased, high-precision estimates of totals even when the underlying AI model is biased. The framework unifies online learning with human-AI collaboration. Evaluated on several real-world measurement tasks, it significantly reduces estimation error while cutting human annotation costs by 30–70%, and it improves statistical reliability and confidence-interval coverage. Our approach establishes a verifiable, scalable paradigm for AI-augmented scientific measurement.
📝 Abstract
AI has the potential to transform scientific discovery by analyzing vast datasets with little human effort. However, current workflows often do not provide the accuracy or statistical guarantees that are needed. We introduce active measurement, a human-in-the-loop AI framework for scientific measurement. An AI model is used to predict measurements for individual units, which are then sampled for human labeling using importance sampling. With each new set of human labels, the AI model is improved and an unbiased Monte Carlo estimate of the total measurement is refined. Active measurement can provide precise estimates even with an imperfect AI model, and requires little human effort when the AI model is very accurate. We derive novel estimators, weighting schemes, and confidence intervals, and show that active measurement reduces estimation error compared to alternatives in several measurement tasks.
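The core idea above — use AI predictions for every unit, importance-sample a few units for human labels, and correct the predicted total with an unbiased Monte Carlo adjustment — can be sketched in a few lines. This is a minimal single-round illustration under assumed synthetic data, not the paper's actual estimator or weighting scheme: the data-generating process, the mixture sampling distribution `q`, and the difference (control-variate) estimator are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic setup (not data from the paper): N units with
# true measurements y_i; the "AI model" supplies biased, noisy predictions f_i.
N = 10_000
y = rng.gamma(shape=2.0, scale=5.0, size=N)   # true per-unit measurements
f = 0.8 * y + rng.normal(0.0, 2.0, size=N)    # biased AI predictions

def active_measurement_estimate(f, y, n_label, rng):
    """One labeling round: importance-sample n_label units for human labels
    and return an unbiased estimate of the total sum(y) via a
    difference (control-variate) estimator with importance weights."""
    # Illustrative sampling distribution: a mixture of uniform and
    # prediction-proportional probabilities keeps every q_i bounded away
    # from zero, which controls the variance of the importance weights.
    n = len(f)
    q = 0.5 / n + 0.5 * np.abs(f) / np.abs(f).sum()
    idx = rng.choice(n, size=n_label, replace=True, p=q)
    # Unbiasedness: E[(y_j - f_j) / q_j] under j ~ q equals sum(y - f),
    # so f.sum() + correction is unbiased for y.sum() however biased f is.
    correction = np.mean((y[idx] - f[idx]) / q[idx])
    return f.sum() + correction

est = active_measurement_estimate(f, y, n_label=500, rng=rng)
print(f"true total = {y.sum():.0f}, one-round estimate = {est:.0f}")
```

In the full framework the AI model would be retrained on each new batch of human labels, shrinking the residuals `y - f` and hence the variance of the correction term, so fewer labels are needed as the model improves.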