🤖 AI Summary
This work addresses the lack of theoretical guarantees in bipartite network model selection and the tendency of existing methods to underfit one node set while overfitting the other due to inherent asymmetry. To this end, we propose a penalized cross-validation framework tailored for bipartite stochastic block models. By designing model-specific penalty terms that account for the structural asymmetry between the two node sets, our approach preserves the distinct latent block structures on each side. As the first model selection framework for bipartite networks with provable consistency guarantees, it overcomes the implicit symmetry assumptions underlying conventional strategies such as projection-based methods and modularity maximization. Extensive experiments demonstrate that our method consistently outperforms state-of-the-art alternatives across diverse synthetic scenarios and two real-world datasets.
📝 Abstract
Although network data have become increasingly popular and widely studied, the vast majority of statistical literature has focused on unipartite networks, leaving relatively few theoretical results for bipartite networks. In this paper, we study the model selection problem for bipartite stochastic block models. We propose a penalized cross-validation approach that incorporates appropriate penalty terms for different candidate models, addressing the new and challenging issue that underfitting may occur on one side while overfitting occurs on the other. To the best of our knowledge, our method provides the first consistency guarantee for model selection in bipartite networks. Through simulations under various scenarios and analysis of two real datasets, we demonstrate that our approach not only outperforms traditional modularity-based and projection-based methods, but also naturally preserves potential asymmetry between the two node sets.
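The abstract states the idea at a high level; to make the ingredients concrete, here is a minimal sketch of a bipartite SBM generator together with an entry-holdout cross-validation loss that carries a side-specific complexity penalty. This is not the paper's actual procedure: the spectral-plus-k-means fitting step, the function names, and the penalty form `lam * (K1*log n1 + K2*log n2)` are all illustrative assumptions chosen to show how a penalty can charge the two node sets separately.

```python
import numpy as np

def sample_bipartite_sbm(n1, n2, B, rng):
    """Draw an n1 x n2 bipartite adjacency matrix from a stochastic block
    model. Row/column groups are assigned uniformly at random; B[k, l] is
    the edge probability between row group k and column group l."""
    K1, K2 = B.shape
    g1 = rng.integers(K1, size=n1)
    g2 = rng.integers(K2, size=n2)
    A = (rng.random((n1, n2)) < B[np.ix_(g1, g2)]).astype(float)
    return A, g1, g2

def _kmeans(X, K, rng, iters=30):
    """Plain Lloyd's algorithm; returns a cluster label for each row of X."""
    centers = X[rng.choice(len(X), size=K, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels

def penalized_cv_loss(A, K1, K2, rng, p_hold=0.1, lam=0.02):
    """Hold out a random fraction of entries, cluster rows and columns
    separately on the remaining data (crude spectral step), fit block
    means, and score the held-out entries plus a side-specific penalty.
    The penalty below is a placeholder, not the paper's penalty."""
    n1, n2 = A.shape
    held = rng.random(A.shape) < p_hold          # entries to hold out
    A_obs = np.where(held, 0.0, A)               # zero-impute held entries
    U, s, Vt = np.linalg.svd(A_obs, full_matrices=False)
    r_lab = _kmeans(U[:, :K1] * s[:K1], K1, rng)     # row-side clustering
    c_lab = _kmeans(Vt[:K2].T * s[:K2], K2, rng)     # column-side clustering
    M = np.zeros((K1, K2))                       # estimated block means
    for k in range(K1):
        for l in range(K2):
            blk = np.ix_(r_lab == k, c_lab == l)
            n_obs = (~held)[blk].sum()
            M[k, l] = A_obs[blk].sum() / n_obs if n_obs else A_obs.mean()
    err = ((A - M[r_lab][:, c_lab])[held] ** 2).mean()
    # Each side is penalized by its own group count, so a candidate cannot
    # trade overfitting on one side against underfitting on the other.
    return err + lam * (K1 * np.log(n1) + K2 * np.log(n2)) / np.sqrt(n1 * n2)
```

In a model-selection loop one would evaluate `penalized_cv_loss` over a grid of candidate pairs `(K1, K2)` and keep the minimizer; the key point mirrored from the abstract is that the penalty is asymmetric in the two node sets rather than a single shared term.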