LLM-powered Real-time Patent Citation Recommendation for Financial Technologies

📅 2026-01-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses real-time citation recommendation for the rapidly growing body of financial patents, where static or periodically updated methods struggle to support dynamic prior-art discovery. It proposes a real-time citation recommendation framework tailored to financial patents. The approach first uses large language models to generate semantic embeddings of patents, then applies an incremental indexing strategy based on HNSW graphs so that new patents can be inserted in real time without rebuilding the full index. Top-k recommendations are retrieved through approximate nearest-neighbor search followed by semantic similarity ranking. Experimental evaluation on a dataset of 428,843 Chinese financial patents shows that the method outperforms conventional text-based baselines and other nearest-neighbor retrieval approaches in both precision and recall, while daily incremental updates substantially reduce computational overhead.

📝 Abstract

Rapid financial innovation has been accompanied by a sharp increase in patenting activity, making timely and comprehensive prior-art discovery more difficult. This problem is especially evident in financial technologies, where innovations develop quickly, patent collections grow continuously, and citation recommendation systems must be updated as new applications arrive. Existing patent retrieval and citation recommendation methods typically rely on static indexes or periodic retraining, which limits their ability to operate effectively in such dynamic settings. In this study, we propose a real-time patent citation recommendation framework designed for large and fast-changing financial patent corpora. Using a dataset of 428,843 financial patents granted by the China National Intellectual Property Administration (CNIPA) between 2000 and 2024, we build a three-stage recommendation pipeline. The pipeline uses large language model (LLM) embeddings to represent the semantic content of patent abstracts, applies efficient approximate nearest-neighbor search to construct a manageable candidate set, and ranks candidates by semantic similarity to produce top-k citation recommendations. In addition to improving recommendation accuracy, the proposed framework directly addresses the dynamic nature of patent systems. By using an incremental indexing strategy based on hierarchical navigable small-world (HNSW) graphs, newly issued patents can be added without rebuilding the entire index. A rolling day-by-day update experiment shows that incremental updating improves recall while substantially reducing computational cost compared with rebuild-based indexing. The proposed method also consistently outperforms traditional text-based baselines and alternative nearest-neighbor retrieval approaches.
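
The three stages described in the abstract (embed, retrieve candidates, rank by similarity) and the idea of inserting new patents without a rebuild can be illustrated with a minimal sketch. This is not the paper's implementation: `toy_embed` is a hypothetical bag-of-words stand-in for an LLM embedding, `IncrementalIndex` is a hypothetical class that uses an exact scan where the paper uses an HNSW graph, and the patent IDs and abstracts are invented for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def toy_embed(text: str) -> Dict[str, float]:
    """Hypothetical stand-in for an LLM embedding: unit-length bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


class IncrementalIndex:
    """Exact-scan stand-in for an HNSW index that accepts inserts without a rebuild."""

    def __init__(self) -> None:
        self.vectors: Dict[str, Dict[str, float]] = {}

    def add(self, patent_id: str, abstract: str) -> None:
        # Incremental update: only the new patent is embedded and stored.
        self.vectors[patent_id] = toy_embed(abstract)

    def recommend(self, abstract: str, k: int = 3) -> List[Tuple[str, float]]:
        query = toy_embed(abstract)
        # Vectors are unit-length, so the dot product equals cosine similarity.
        scored = [
            (pid, sum(w * vec.get(tok, 0.0) for tok, w in query.items()))
            for pid, vec in self.vectors.items()
        ]
        # Rank candidates by similarity and keep the top k.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


index = IncrementalIndex()
index.add("P1", "mobile payment authentication using biometric tokens")
index.add("P2", "credit risk scoring with gradient boosted trees")
# A newly granted patent arrives later and is inserted without a rebuild.
index.add("P3", "biometric authentication for mobile payment terminals")

top = index.recommend(
    "mobile payment authentication with biometric verification", k=2
)
```

In a realistic setting the exact scan would be replaced by an approximate nearest-neighbor index, and the candidate set it returns would then be re-ranked by embedding similarity, as the abstract describes. The sketch only shows that an insert touches the new item alone.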

Problem

Research questions and friction points this paper is trying to address.

patent citation recommendation

financial technologies

real-time updating

dynamic patent corpora

prior-art discovery

Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM embeddings

real-time citation recommendation

incremental indexing