🤖 AI Summary
This study evaluates the immediate causal effects of on-demand human tutoring within an adaptive learning system, addressing estimation biases arising from student self-selection into help-seeking and dynamic knowledge states. By modeling students' latent proficiency using Deep Knowledge Tracing (DKT) and integrating it with a doubly robust causal forest estimator, the authors conduct a heterogeneous treatment effect analysis across more than 5,000 middle school mathematics tutoring sessions. This work presents the first integration of DKT with causal forests, enabling scalable, robust, and fine-grained causal inference. Results indicate that tutoring increases the probability of correctly answering the next problem by an average of 4 percentage points and improves subsequent skill accuracy by approximately 3 percentage points. Individual treatment effects are highly heterogeneous, ranging from −20.25 to +19.91 percentage points, with significantly larger benefits observed for students with lower prior mastery.
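The DKT component summarized above can be illustrated with a minimal sketch: an LSTM that consumes one-hot encoded (skill, correctness) pairs and emits per-skill probabilities of answering correctly at the next step. The class name `DKT`, the hidden size, and all dimensions below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DKT(nn.Module):
    """Minimal Deep Knowledge Tracing sketch (illustrative, not the paper's model)."""
    def __init__(self, n_skills: int, hidden: int = 64):
        super().__init__()
        # Input at each step: one-hot over 2 * n_skills
        # (skill id crossed with correct/incorrect)
        self.rnn = nn.LSTM(2 * n_skills, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_skills)  # per-skill mastery logits

    def forward(self, x):                  # x: (batch, time, 2 * n_skills)
        h, _ = self.rnn(x)
        return torch.sigmoid(self.out(h))  # (batch, time, n_skills) mastery probs

# Synthetic interaction sequences: which skill was attempted, and whether correct
n_skills, batch, steps = 10, 4, 20
skills = torch.randint(n_skills, (batch, steps))
correct = torch.randint(2, (batch, steps))
x = torch.zeros(batch, steps, 2 * n_skills)
x[torch.arange(batch)[:, None], torch.arange(steps),
  skills + n_skills * correct] = 1.0

mastery = DKT(n_skills)(x)  # latent mastery trajectory, one probability per skill
```

The final time-step of `mastery` is the kind of latent-proficiency estimate that the causal analysis conditions on.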
📝 Abstract
This paper introduces a scalable causal inference framework for estimating the immediate, session-level effects of on-demand human tutoring embedded within adaptive learning systems. Because students seek assistance at moments of difficulty, conventional evaluation is confounded by self-selection and time-varying knowledge states. We address these challenges by integrating principled analytic sample construction with Deep Knowledge Tracing (DKT) to estimate latent mastery, followed by doubly robust estimation using Causal Forests. Applying this framework to over 5,000 middle-school mathematics tutoring sessions, we find that requesting human tutoring increases next-problem correctness by approximately 4 percentage points and accuracy on the subsequent skill encountered by approximately 3 percentage points, suggesting proximal transfer of tutoring effects across knowledge components. These effects are robust across model specifications and to plausible levels of unmeasured confounding. Notably, they exhibit significant heterogeneity across sessions and students, with session-level effect estimates ranging from −20.25pp to +19.91pp. Our follow-up analyses suggest that typical behavioral indicators, such as student talk time, do not consistently correlate with high-impact sessions. Furthermore, treatment effects are larger for students with lower prior mastery and slightly smaller for low-SES students. This framework offers a rigorous, practical template for the evaluation and continuous improvement of on-demand human tutoring, with direct applications for emerging AI tutoring systems.
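The doubly robust step described in the abstract can be sketched with a simple AIPW (augmented inverse propensity weighting) estimator, which shares the debiasing logic underlying causal forests. Everything below is a toy illustration under assumed synthetic data, not the authors' pipeline: the `mastery` covariate stands in for DKT output, and the effect sizes are chosen to mimic the self-selection and heterogeneity the paper describes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
mastery = rng.uniform(0, 1, n)      # stand-in for DKT latent mastery
X = mastery.reshape(-1, 1)

# Self-selection: low-mastery students request tutoring more often
T = rng.binomial(1, 0.7 - 0.4 * mastery)
# Next-problem correctness: baseline rises with mastery; tutoring adds a
# small boost that is larger for low-mastery students (heterogeneity)
p = 0.3 + 0.5 * mastery + T * (0.04 + 0.06 * (1 - mastery))
Y = rng.binomial(1, np.clip(p, 0, 1)).astype(float)

# Propensity model e(x) = P(T = 1 | x)
e = LogisticRegression().fit(X, T).predict_proba(X)[:, 1]
# Outcome models mu_t(x) = E[Y | T = t, x], fit separately per arm
mu1 = LogisticRegression().fit(X[T == 1], Y[T == 1]).predict_proba(X)[:, 1]
mu0 = LogisticRegression().fit(X[T == 0], Y[T == 0]).predict_proba(X)[:, 1]

# AIPW pseudo-outcomes: consistent if either the propensity model
# or the outcome models are correctly specified ("doubly robust")
psi = (mu1 - mu0
       + T * (Y - mu1) / e
       - (1 - T) * (Y - mu0) / (1 - e))
ate = psi.mean()
print(f"Doubly robust ATE estimate: {ate:+.3f}")
```

In the paper's setting, a causal forest replaces the simple averaging of `psi`, fitting a forest to these debiased signals to recover session-level (heterogeneous) effect estimates rather than a single average.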