🤖 AI Summary
Identifying small and medium-sized enterprises (SMEs) with high growth potential remains a significant challenge. This work proposes SME-HGT, described as the first application of heterogeneous graph neural networks to this task. The model is built on a heterogeneous graph of enterprises, research topics, and government agencies, and it uses only publicly available data to predict whether Phase I awardees of the Small Business Innovation Research (SBIR) program will advance to Phase II. Key contributions include a temporally aware evaluation protocol designed to prevent data leakage, multi-relational edges that capture interactions among the three node types, and a methodology that can be reproduced from public data. On a temporal test set, SME-HGT achieves an AUPRC of 0.621, outperforming the MLP (0.590) and R-GCN (0.608) baselines. Among its 100 top-ranked firms, it attains a precision of 89.6%, a 2.14× lift over random selection.
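To illustrate what a temporally aware split might look like, the sketch below partitions awards by Phase I year so that every training example precedes every validation and test example. This is a minimal sketch, not the authors' protocol: the `Award` record, the `temporal_split` function, and the cutoff years are all hypothetical, since the summary does not state the actual split boundaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Award:
    """Hypothetical record for one SBIR Phase I award."""
    company_id: str
    phase1_year: int
    advanced_to_phase2: bool


def temporal_split(awards, train_end, val_end):
    """Split awards by Phase I year (cutoffs are illustrative assumptions).

    train: year <= train_end
    val:   train_end < year <= val_end
    test:  year > val_end
    """
    if train_end >= val_end:
        raise ValueError("train_end must be earlier than val_end")
    train = [a for a in awards if a.phase1_year <= train_end]
    val = [a for a in awards if train_end < a.phase1_year <= val_end]
    test = [a for a in awards if a.phase1_year > val_end]
    return train, val, test


# Toy data; the years and outcomes are made up for illustration only.
toy_awards = [
    Award("c1", 2014, True),
    Award("c2", 2016, False),
    Award("c3", 2018, True),
    Award("c4", 2020, False),
]
train, val, test = temporal_split(toy_awards, train_end=2016, val_end=2018)
```

Splitting on time rather than at random means the model is never scored on awards that predate its training data, which is the kind of leakage a random split would allow.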
📝 Abstract
Small and Medium Enterprises (SMEs) constitute 99.9% of U.S. businesses and generate 44% of economic activity, yet systematically identifying high-potential SMEs remains an open challenge. We introduce SME-HGT, a Heterogeneous Graph Transformer framework that predicts which SBIR Phase I awardees will advance to Phase II funding using exclusively public data. We construct a heterogeneous graph with 32,268 company nodes, 124 research topic nodes, and 13 government agency nodes connected by approximately 99,000 edges across three semantic relation types. SME-HGT achieves an AUPRC of 0.621 ± 0.003 on a temporally split test set, outperforming an MLP baseline (0.590 ± 0.002) and R-GCN (0.608 ± 0.013) across five random seeds. At a screening depth of 100 companies, SME-HGT attains 89.6% precision with a 2.14× lift over random selection. Our temporal evaluation protocol prevents information leakage, and our reliance on public data ensures reproducibility. These results demonstrate that relational structure among firms, research topics, and funding agencies provides meaningful signal for SME potential assessment, with implications for policymakers and early-stage investors.
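The screening metrics quoted above can be made concrete with a short sketch. It assumes the common definitions: precision at depth k is the fraction of true positives among the k highest-scored companies, and lift is that precision divided by the positive base rate of the whole evaluation set. The function names and toy data are hypothetical and are not taken from the paper. Under this definition of lift, the reported figures would imply a base rate of roughly 0.896 / 2.14 ≈ 0.42, which is an inference rather than a number stated in the abstract.

```python
def precision_at_k(scores, labels, k):
    """Fraction of positives among the k highest-scored items."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    # Rank items by descending score and keep the labels of the top k.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


def lift_at_k(scores, labels, k):
    """Precision at k divided by the base rate of positives."""
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positives")
    return precision_at_k(scores, labels, k) / base_rate


# Toy example: 8 companies, 3 of which advanced (base rate 0.375).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0]

p_at_4 = precision_at_k(toy_scores, toy_labels, k=4)  # 3 of the top 4 are positive
lift_4 = lift_at_k(toy_scores, toy_labels, k=4)       # 0.75 / 0.375
```

A lift above 1 means the ranked shortlist contains a higher share of advancing companies than a shortlist of the same size drawn at random.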