LLM-powered Real-time Patent Citation Recommendation for Financial Technologies

📅 2026-01-23
🤖 AI Summary
This study addresses the challenge of real-time citation recommendation posed by the rapid growth of financial patents, where traditional static or periodically updated methods struggle to support dynamic prior art discovery efficiently. To this end, we propose a real-time citation recommendation framework tailored for financial patents. The approach first leverages large language models to generate semantic embeddings of patents and then employs an HNSW graph-based incremental indexing strategy to enable real-time insertion of new patents without full index reconstruction. Top-k recommendations are efficiently retrieved through approximate nearest neighbor search combined with semantic similarity ranking. Experimental evaluation on a dataset of 428,843 Chinese financial patents demonstrates that our method outperforms conventional text-based baselines and other nearest neighbor retrieval approaches in both precision and recall, while daily incremental updates substantially reduce computational overhead.
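The summary reports gains in precision and recall without defining the metrics. The sketch below assumes the common top-k definitions (precision@k is hits divided by k, recall@k is hits divided by the number of truly cited patents); the function name, patent ids, and cited set are hypothetical and not taken from the paper.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (precision@k, recall@k) for one query patent.

    recommended: ranked list of candidate patent ids (best first)
    relevant:    set of patent ids the query patent actually cites
    """
    top_k = recommended[:k]
    hits = sum(1 for pid in top_k if pid in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranked recommendations and ground-truth citations.
recommended = ["P7", "P2", "P9", "P4", "P1"]
cited = {"P2", "P4", "P8"}
p, r = precision_recall_at_k(recommended, cited, k=5)
```

In a corpus-level evaluation these per-query values would typically be averaged over all query patents; whether the paper averages this way is not stated here.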

📝 Abstract
Rapid financial innovation has been accompanied by a sharp increase in patenting activity, making timely and comprehensive prior-art discovery more difficult. This problem is especially evident in financial technologies, where innovations develop quickly, patent collections grow continuously, and citation recommendation systems must be updated as new applications arrive. Existing patent retrieval and citation recommendation methods typically rely on static indexes or periodic retraining, which limits their ability to operate effectively in such dynamic settings. In this study, we propose a real-time patent citation recommendation framework designed for large and fast-changing financial patent corpora. Using a dataset of 428,843 financial patents granted by the China National Intellectual Property Administration (CNIPA) between 2000 and 2024, we build a three-stage recommendation pipeline. The pipeline uses large language model (LLM) embeddings to represent the semantic content of patent abstracts, applies efficient approximate nearest-neighbor search to construct a manageable candidate set, and ranks candidates by semantic similarity to produce top-k citation recommendations. In addition to improving recommendation accuracy, the proposed framework directly addresses the dynamic nature of patent systems. By using an incremental indexing strategy based on hierarchical navigable small-world (HNSW) graphs, newly issued patents can be added without rebuilding the entire index. A rolling day-by-day update experiment shows that incremental updating improves recall while substantially reducing computational cost compared with rebuild-based indexing. The proposed method also consistently outperforms traditional text-based baselines and alternative nearest-neighbor retrieval approaches.
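The pipeline described above (embed, retrieve candidates from a graph index, rank by similarity, insert new patents without a rebuild) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a single-layer navigable small-world graph, a simplification of HNSW that omits the layer hierarchy and neighbor pruning, and hand-written 3-dimensional vectors stand in for LLM embeddings of abstracts. The class name `FlatNSWIndex`, the parameters `m` and `ef`, and all patent ids are hypothetical.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class FlatNSWIndex:
    """Single-layer navigable small-world graph (a simplification of HNSW)."""

    def __init__(self, m=2, ef=8):
        self.m = m            # links created per inserted node
        self.ef = ef          # size of the candidate list kept during search
        self.vectors = {}     # patent id -> embedding
        self.neighbors = {}   # patent id -> set of linked patent ids
        self.entry = None     # fixed entry point for every search

    def _search(self, query, ef):
        """Best-first graph search; returns [(distance, id)] ascending."""
        if self.entry is None:
            return []
        d0 = 1.0 - cosine(query, self.vectors[self.entry])
        visited = {self.entry}
        frontier = [(d0, self.entry)]   # min-heap of nodes to expand
        best = [(-d0, self.entry)]      # max-heap of the ef closest so far
        while frontier:
            dist, node = heapq.heappop(frontier)
            if len(best) >= ef and dist > -best[0][0]:
                break  # closest unexpanded node is worse than the kept list
            for nb in self.neighbors[node]:
                if nb in visited:
                    continue
                visited.add(nb)
                d = 1.0 - cosine(query, self.vectors[nb])
                if len(best) < ef or d < -best[0][0]:
                    heapq.heappush(frontier, (d, nb))
                    heapq.heappush(best, (-d, nb))
                    if len(best) > ef:
                        heapq.heappop(best)
        return sorted((-neg, pid) for neg, pid in best)

    def add(self, pid, vec):
        """Insert one patent by linking it to its nearest nodes; no rebuild."""
        nearest = self._search(vec, self.ef)[: self.m]
        self.vectors[pid] = vec
        self.neighbors[pid] = set()
        for _, other in nearest:
            self.neighbors[pid].add(other)
            self.neighbors[other].add(pid)
        if self.entry is None:
            self.entry = pid

    def recommend(self, query, k):
        """Stage 2: graph search for candidates. Stage 3: rank by similarity."""
        candidates = [pid for _, pid in self._search(query, self.ef)]
        scored = [(pid, cosine(query, self.vectors[pid])) for pid in candidates]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:k]


# Stage 1 stand-in: toy vectors assumed to come from an embedding model.
index = FlatNSWIndex(m=2, ef=8)
corpus = {
    "P1": [1.0, 0.0, 0.0],
    "P2": [0.9, 0.1, 0.0],
    "P3": [0.0, 1.0, 0.0],
    "P4": [0.0, 0.0, 1.0],
}
for pid, vec in corpus.items():
    index.add(pid, vec)

# A newly issued patent is added incrementally; existing links are kept.
index.add("P5", [0.8, 0.2, 0.0])
recs = index.recommend([0.8, 0.25, 0.0], k=2)
```

Each `add` call only runs one graph search and creates a few links, which is the property that lets an incremental index avoid the cost of a full rebuild. A production system would more likely rely on an established HNSW library than on a hand-rolled graph like this one.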
Problem

Research questions and friction points this paper is trying to address.

patent citation recommendation
financial technologies
real-time updating
dynamic patent corpora
prior-art discovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM embeddings
real-time citation recommendation
incremental indexing
HNSW graphs
financial patents
Tianang Deng
School of Statistics and Mathematics, Central University of Finance and Economics, 39 South College Road, Beijing, 100081, Beijing, China
Yu Deng
Harbin Huiwen JetCreate Artificial Intelligence Technology Co., Ltd., 288 Zhigu Avenue, Harbin, 150020, Heilongjiang, China
Tianchen Gao
Peking University
Community Detection, Complex Network
Yonghong Hu
School of Statistics and Mathematics, Central University of Finance and Economics, 39 South College Road, Beijing, 100081, Beijing, China
Rui Pan
Central University of Finance and Economics
social networks