🤖 AI Summary
In social networks, the probability that a peer recommendation influences its recipient is highly context-dependent, varying with the sender, the receiver, their relationship, the recommended item, and the medium. This makes influence probabilities strongly heterogeneous and difficult to estimate. Static data capture only correlations, while conventional online learning methods either sacrifice efficiency through blind exploration or bias estimation toward high-influence regions, compromising accuracy.
Method: We propose an uncertainty-guided exploration algorithm grounded in linear contextual bandits that explicitly balances regret minimization against influence-probability estimation error through a single tunable trade-off parameter. The approach combines adaptive interventions with contextual modeling to progressively refine influence estimates.
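To make the mechanism concrete, here is a minimal sketch of uncertainty-guided exploration in a linear contextual bandit. This is a generic LinUCB-style construction, not the paper's exact algorithm; the class name, the `alpha` trade-off parameter, and the ridge regularizer `reg` are our own illustrative choices.

```python
import numpy as np

class UncertaintyGuidedLinearBandit:
    """Sketch of uncertainty-guided exploration in a linear contextual
    bandit (LinUCB-style; not the paper's exact algorithm).

    `alpha` plays the role of the tunable trade-off parameter: larger
    values weight uncertainty more heavily, improving estimation of the
    unknown parameter vector at the cost of higher regret.
    """

    def __init__(self, dim, alpha=1.0, reg=1.0):
        self.alpha = alpha            # exploration / trade-off weight
        self.V = reg * np.eye(dim)    # regularized design matrix
        self.b = np.zeros(dim)        # accumulated reward-weighted contexts

    @property
    def theta_hat(self):
        # Ridge-regression estimate of the influence-probability parameters.
        return np.linalg.solve(self.V, self.b)

    def select(self, contexts):
        # contexts: (n_arms, dim) array, one row per candidate peer
        # recommendation (edge) the algorithm could intervene on.
        theta = self.theta_hat
        V_inv = np.linalg.inv(self.V)
        scores = []
        for x in contexts:
            mean = x @ theta                          # estimated influence
            width = np.sqrt(x @ V_inv @ x)            # estimation uncertainty
            scores.append(mean + self.alpha * width)  # uncertainty-guided score
        return int(np.argmax(scores))

    def update(self, x, reward):
        # Observe whether the recommendation was adopted (reward in {0, 1}).
        self.V += np.outer(x, x)
        self.b += reward * x
```

With `alpha = 0` the selection rule is purely greedy (regret-focused); increasing `alpha` steers interventions toward contexts where the estimate is least certain, which is the knob the summary describes for trading regret against estimation accuracy.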
Contribution/Results: Evaluated on semi-synthetic network data, our method significantly outperforms static models and bandit baselines that ignore the error-regret trade-off. It establishes a new paradigm for information diffusion modeling and viral marketing optimization, one that jointly ensures statistical estimation accuracy and sequential decision-making efficiency.
📝 Abstract
In networked environments, users frequently share recommendations about content, products, services, and courses of action with others. The extent to which such recommendations succeed and are adopted is highly contextual, depending on the characteristics of the sender, the recipient, their relationship, the recommended item, and the medium, which makes peer influence probabilities highly heterogeneous. Accurate estimation of these probabilities is key to understanding information diffusion processes and to improving the effectiveness of viral marketing strategies. However, learning these probabilities from data is challenging: static data may capture correlations between peer recommendations and peer actions but fail to reveal influence relationships, while online learning algorithms can learn these probabilities from interventions but either waste resources on random exploration or optimize for rewards, thus favoring regions of the context space with higher influence probabilities. In this work, we study learning peer influence probabilities under a contextual linear bandit framework. We show that a fundamental trade-off can arise between regret minimization and estimation error, characterize all achievable rate pairs, and propose an uncertainty-guided exploration algorithm that, by tuning a parameter, attains any pair within this trade-off. Our experiments on semi-synthetic network datasets show the advantages of our method over static methods and contextual bandits that ignore this trade-off.
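For concreteness, the two quantities in this trade-off can be written in standard linear-bandit notation. This is a generic formulation in our own notation, not necessarily the paper's exact definitions or its achievable rate pairs, which we do not reproduce here.

```latex
% Generic linear-bandit definitions (our notation, not the paper's).
% \theta^*: unknown influence parameter; x_{t,a}: context of arm a at
% round t; a_t: chosen arm; \hat{\theta}_T: estimate after T rounds.
\begin{align}
  R_T &= \sum_{t=1}^{T} \left( \max_a \langle \theta^*, x_{t,a} \rangle
         - \langle \theta^*, x_{t,a_t} \rangle \right)
         && \text{(cumulative regret)} \\
  E_T &= \bigl\lVert \hat{\theta}_T - \theta^* \bigr\rVert_2
         && \text{(estimation error)}
\end{align}
```

Intuitively, driving $R_T$ down concentrates interventions on contexts that already look high-influence, which leaves $\theta^*$ poorly identified along other directions and slows the decay of $E_T$; the tunable parameter moves the algorithm along the achievable $(R_T, E_T)$ rate pairs the abstract refers to.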