Estimating Causal Effects in Networks with Cluster-Based Bandits

📅 2025-05-07

📈 Citations: 0

✨ Influential: 0

career value

223K/year

🤖 AI Summary

In social networks, interference—where treatments assigned to one user affect outcomes of others—induces substantial bias in causal effect estimation and high policy regret in conventional randomized controlled trials (RCTs) or A/B tests. Method: This paper proposes a community-aware online multi-armed bandit algorithm that explicitly incorporates network structural priors (i.e., community partitions) into the bandit framework. It enables efficient online decision-making while preserving causal identifiability through cluster-aware action selection and semi-synthetic interference modeling. Contribution/Results: In simulation experiments, the proposed algorithm reduces treatment effect estimation error by 37% compared to structure-agnostic baseline bandit methods, and improves the reward-to-action ratio by 21% over RCTs. By jointly optimizing for high-fidelity causal estimation and cumulative reward maximization, it effectively alleviates the bias–reward trade-off inherent in interference-prone settings.

Technology Category

Application Category

📝 Abstract

The gold standard for estimating causal effects is randomized controlled trial (RCT) or A/B testing where a random group of individuals from a population of interest are given treatment and the outcome is compared to a random group of individuals from the same population. However, A/B testing is challenging in the presence of interference, commonly occurring in social networks, where individuals can impact each others outcome. Moreover, A/B testing can incur a high performance loss when one of the treatment arms has a poor performance and the test continues to treat individuals with it. Therefore, it is important to design a strategy that can adapt over time and efficiently learn the total treatment effect in the network. We introduce two cluster-based multi-armed bandit (MAB) algorithms to gradually estimate the total treatment effect in a network while maximizing the expected reward by making a tradeoff between exploration and exploitation. We compare the performance of our MAB algorithms with a vanilla MAB algorithm that ignores clusters and the corresponding RCT methods on semi-synthetic data with simulated interference. The vanilla MAB algorithm shows higher reward-action ratio at the cost of higher treatment effect error due to undesired spillover. The cluster-based MAB algorithms show higher reward-action ratio compared to their corresponding RCT methods without sacrificing much accuracy in treatment effect estimation.

Problem

Research questions and friction points this paper is trying to address.

Estimating causal effects in networks with interference

Balancing exploration-exploitation in adaptive treatment strategies

Reducing spillover errors in cluster-based network experiments

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cluster-based MAB algorithms for network effects

Tradeoff between exploration and exploitation

Higher reward-action ratio with accuracy

🔎 Similar Papers

No similar papers found.

Amazon

Arlington, VA, USA / Bellevue, WA, USA / Boston, MA, USA

Research Engineer, Monetization AI