Parse Trees Guided LLM Prompt Compression

📅 2024-09-23
🏛️ arXiv.org
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
To address the high computational overhead and input-length limits of long-context prompts in large language models (LLMs), as well as the hallucination and loss of global structural coherence caused by existing compression methods, this paper proposes PartPrompt, a hierarchical selective prompt compression framework grounded in dependency parse trees. Its core innovation is the first integration of linguistic priors with global tree-structured modeling: a bidirectional importance propagation mechanism (root-ward and leaf-ward) adjusts node values over a global tree, and a recursive pruning algorithm preserves semantics during dynamic compression. Evaluated across multiple datasets, LLMs, and compression ratios, PartPrompt achieves state-of-the-art performance, significantly improving the coherence and semantic fidelity of compressed prompts and remaining notably robust for ultra-long prompts that exceed standard context windows.
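The card gives no equations for these two mechanisms, so the sketch below is only an illustration of the idea: bidirectional value propagation followed by recursive, threshold-based pruning over a tree of entropy scores. The Node structure, the decay coefficients ALPHA and BETA, and the exact update rules are assumptions made for the sketch, not PartPrompt's published formulas.

```python
from dataclasses import dataclass, field

# Decay coefficients for the two propagation passes.
# Illustrative values, not taken from the paper.
ALPHA, BETA = 0.5, 0.3

@dataclass
class Node:
    text: str                              # the span this node covers
    value: float                           # local information-entropy score
    children: list["Node"] = field(default_factory=list)

def rootward(node: Node) -> float:
    """Leaf-to-root pass: fold a decayed sum of each child's value into its
    parent, so nodes dominating informative subtrees score higher."""
    for child in node.children:
        node.value += ALPHA * rootward(child)
    return node.value

def leafward(node: Node, inherited: float = 0.0) -> None:
    """Root-to-leaf pass: push a fraction of each ancestor's value downward,
    so nodes under important ancestors resist pruning."""
    node.value += BETA * inherited
    for child in node.children:
        leafward(child, node.value)

def prune(node: Node, threshold: float) -> Node | None:
    """Recursively drop subtrees whose adjusted value stays below threshold;
    a parent survives if its own value passes or any child survives, which
    keeps root-to-leaf paths (and hence global structure) intact."""
    node.children = [c for c in (prune(c, threshold) for c in node.children) if c]
    return node if node.value >= threshold or node.children else None

# Toy global tree: document -> sentences -> spans, with local entropy scores.
root = Node("document", 0.0, [
    Node("sentence-1", 0.0, [Node("key fact", 2.4), Node("filler", 0.3)]),
    Node("sentence-2", 0.0, [Node("aside", 0.2)]),
])
rootward(root)
leafward(root)
kept = prune(root, threshold=1.0)

def linearize(n: Node) -> list[str]:       # read off the surviving spans
    return [n.text] + [t for c in n.children for t in linearize(c)]

print(linearize(kept) if kept else [])     # ['document', 'sentence-1', 'key fact']
```

On the toy tree, the pass keeps the high-value path document → sentence-1 → key fact and drops the low-value spans, which matches the stated goal: remove unimportant content while preserving the global structure of the prompt.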

📝 Abstract
Offering rich contexts to Large Language Models (LLMs) has been shown to boost performance on various tasks, but the resulting longer prompts increase computational cost and may exceed the input limit of LLMs. Recently, prompt compression methods have been proposed to shorten prompts, either by using language models to generate shorter prompts or by building computational models that select important parts of the original prompt. Generative compression methods can suffer from issues such as hallucination, while existing selective compression methods do not incorporate linguistic rules and overlook the global structure of the prompt. To this end, we propose a novel selective compression method called PartPrompt. It first obtains a parse tree for each sentence based on linguistic rules and computes a local information entropy for each node in the parse tree. These local parse trees are then organized into a global tree according to the hierarchical structure of the prompt, such as the dependencies among sentences, paragraphs, and sections. After that, root-ward propagation and leaf-ward propagation are proposed to adjust node values over the global tree. Finally, a recursive algorithm prunes the global tree based on the adjusted node values. Experiments show that PartPrompt achieves state-of-the-art performance across various datasets, metrics, compression ratios, and target LLMs. In-depth ablation studies confirm the effectiveness of PartPrompt's designs, and additional experiments demonstrate its superiority in the coherence of compressed prompts and in extremely long prompt scenarios.
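As a concrete illustration of the first step described above (a parse tree per sentence plus a local score per node), here is a minimal sketch that parses a sentence with spaCy and scores each word by the mean self-information of its subword tokens under GPT-2. Both tools are stand-ins: the abstract does not name a parser or scoring model, self-information is only a proxy for the paper's local information entropy, and word_surprisal is an illustrative name.

```python
# Requires: pip install spacy transformers torch
#           python -m spacy download en_core_web_sm
import spacy
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

nlp = spacy.load("en_core_web_sm")          # dependency parser (assumed choice)
tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def word_surprisal(sentence: str) -> list:
    """Score each dependency-tree word with the mean self-information
    (-log p under the LM) of the subword tokens it spans."""
    enc = tok(sentence, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0].tolist()
    ids = enc["input_ids"][0]
    with torch.no_grad():
        logits = lm(**enc).logits[0]
    logp = torch.log_softmax(logits[:-1], dim=-1)   # predicts tokens 1..T-1
    surp = [0.0] + [-logp[i, ids[i + 1]].item() for i in range(len(ids) - 1)]
    out = []
    for word in nlp(sentence):              # spaCy tokens carry char offsets
        a, b = word.idx, word.idx + len(word.text)
        hits = [s for s, (lo, hi) in zip(surp, offsets)
                if lo < b and hi > a]       # LM tokens overlapping this word
        out.append((word, sum(hits) / len(hits) if hits else 0.0))
    return out

for w, s in word_surprisal("PartPrompt prunes a global parse tree."):
    print(f"{w.text:10s} head={w.head.text:10s} dep={w.dep_:6s} surprisal={s:.2f}")
```

Each printed row pairs a node's dependency role (its head and relation, from which the local parse tree is read off) with its score, which is exactly the per-node input that the propagation and pruning steps operate on.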
Problem

Research questions and friction points this paper is trying to address.

Reduces long LLM prompts to lower computational costs
Improves prompt compression using linguistic parse trees
Maintains prompt coherence and structure during compression
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parse trees guide prompt compression linguistically
Bidirectional propagation adjusts node values over a global tree
Recursive pruning achieves state-of-the-art compression
👥 Authors
Wenhao Mao
Ministry of Education Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing, China
Chengbin Hou
University of Birmingham
Tianyu Zhang
Ministry of Education Key Laboratory of Bioinformatics, BNRIST Bioinformatics Division, Department of Automation, Tsinghua University, Beijing, China
Xinyu Lin
National University of Singapore
Ke Tang
Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China
Hairong Lv
Tsinghua University