🤖 AI Summary
Structured deep web data, particularly restricted tables, is inherently inaccessible to conventional keyword-based retrieval. To address keyword search over such data, this paper proposes a keyword query modeling framework tailored for the deep web. The method comprises three key components: (1) schema-agnostic virtual document generation, which maps otherwise invisible database contents into indexable textual representations; (2) cross-table semantic matching integrated with query rewriting, which improves semantic alignment between keywords and the underlying data; and (3) joint optimization of result ranking via table structure inference, query expansion, and learning-to-rank techniques. In experiments on real-world deep web datasets, NDCG@10 increases by 32% on average over baseline methods, with concurrent gains in both recall and precision. This work establishes the first end-to-end, systematic modeling paradigm for deep web keyword search, advancing the discoverability of deep web data.
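
For readers unfamiliar with the headline metric, the following is a minimal sketch of how NDCG@10 is commonly computed. It is not taken from the paper: the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, the linear-gain formulation is an assumption (the summary does not say which gain function the authors use), and the ideal ordering is derived from the returned list only, which is a common simplification.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results.

    Assumption: linear gain (rel) with a log2(rank + 1) discount,
    where rank is 1-based. Some papers use (2**rel - 1) instead.
    """
    return sum(
        rel / math.log2(position + 2)  # position is 0-based, so rank + 1 == position + 2
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of its ideal reordering.

    Simplification: the ideal ranking is built only from the relevance
    labels of the returned results, not from all judged items.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical graded relevance labels for one query's ranked results.
perfect_ranking = [3, 2, 1]
print(ndcg_at_k(perfect_ranking))  # 1.0, since the list is already ideally ordered
```

A score of 1.0 means the most relevant results appear first; placing relevant results lower in the list pushes the score toward 0. The summary does not state whether the reported 32% is a relative or an absolute gain in this score.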