🤖 AI Summary
To address the quadratic computational complexity that self-attention imposes on Transformer Neural Processes (TNPs), this work proposes a pseudo-token compression framework coupled with an induced-set attention mechanism. By introducing learnable pseudo-token representations and a novel induced-set attention module, the method reduces contextual modeling complexity from quadratic to linear, enabling an explicit, tunable trade-off between accuracy and efficiency. Integrated within a variational inference and meta-learning framework, the approach performs competitively with or surpasses state-of-the-art models on diverse tasks, including 1D regression, image completion, contextual bandits, and Bayesian optimization. Crucially, it delivers substantial gains in training and inference efficiency at scale, enabling TNP-style models to be deployed under controllable computational overhead and thereby addressing their practical scalability bottleneck.
📝 Abstract
Neural Processes (NPs) have gained attention in meta-learning for their ability to quantify uncertainty, together with their rapid prediction and adaptability. However, traditional NPs are prone to underfitting. Transformer Neural Processes (TNPs) significantly outperform existing NPs, yet their applicability in real-world scenarios is hindered by their quadratic computational complexity in the number of context and target data points. To address this, pseudo-token-based TNPs (PT-TNPs) have emerged as a family of NPs that condenses the context data into latent vectors or pseudo-tokens, reducing computational demands. We introduce Induced Set Attentive Neural Processes (ISANPs), which employ Induced Set Attention and an innovative query phase to improve querying efficiency. Our evaluations show that ISANPs perform competitively with TNPs and often surpass state-of-the-art models in 1D regression, image completion, contextual bandits, and Bayesian optimization. Crucially, ISANPs offer a tunable balance between performance and computational complexity, and they scale well to larger datasets where TNPs face limitations.
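To illustrate the complexity reduction the abstract describes, below is a minimal NumPy sketch of induced-set attention over pseudo-tokens (in the spirit of the inducing-point attention of Set Transformer). It is an assumption-laden toy, not the authors' implementation: linear projections, multi-head structure, layer norms, and learned parameters are omitted, and the pseudo-tokens are random rather than trained. The point is the two-step pattern: m pseudo-tokens first attend to all n context points, then the context attends back to the compressed set, so each step costs O(n·m) instead of the O(n²) of full self-attention.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention: (nq, d), (nk, d), (nk, d) -> (nq, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def induced_set_attention(x, pseudo):
    # Step 1: m pseudo-tokens summarize the n context points  -> O(n*m)
    h = attention(pseudo, x, x)        # (m, d) compressed summary
    # Step 2: context points attend back to the summary        -> O(n*m)
    return attention(x, h, h)          # (n, d)

rng = np.random.default_rng(0)
n, m, d = 512, 16, 32                  # n context points, m << n pseudo-tokens
x = rng.standard_normal((n, d))
pseudo = rng.standard_normal((m, d))   # would be learnable in a real model
out = induced_set_attention(x, pseudo)
print(out.shape)                       # (512, 32): same shape as full self-attention output
```

Choosing m sets the accuracy/efficiency trade-off the abstract refers to: a larger pseudo-token set retains more of the context at higher cost.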