Differentiable Efficient Operator Search

📅 2026-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing token compression methods rely on handcrafted designs, which makes them hard to adapt automatically to diverse multimodal tasks and efficiency requirements. This work proposes an end-to-end differentiable framework for efficient operator search that unifies strategies such as pruning, merging, and pooling in a single learnable compression space. Under explicit computational budgets, the framework jointly optimizes where token compression is applied, how many tokens are retained, and how the reduced token information is processed, with the goal of maximizing task performance. Experiments on multiple multimodal benchmarks show that the discovered hybrid compression operators achieve competitive accuracy–efficiency trade-offs relative to manually designed approaches, especially under aggressive visual token compression.
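To make the idea of a budgeted, differentiable search over placement and quantity more concrete, here is a minimal sketch. It is not the paper's implementation. The sigmoid gates, the linear per-token cost proxy, the hinge penalty, and all names (`expected_tokens`, `budget_penalty`, `gate_logits`, `keep_logits`) are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def expected_tokens(n_tokens, gate_logits, keep_logits):
    """Expected number of tokens entering each layer under relaxed choices.

    gate_logits[l]: hypothetical logit for "reduction is active at layer l".
    keep_logits[l]: hypothetical logit for the fraction of tokens kept there.
    """
    counts = []
    n = float(n_tokens)
    for g, k in zip(gate_logits, keep_logits):
        counts.append(n)
        gate = sigmoid(g)  # soft placement decision in (0, 1)
        keep = sigmoid(k)  # soft retention ratio in (0, 1)
        # If the gate is off, all tokens pass; if on, only the kept fraction does.
        n = n * (1.0 - gate + gate * keep)
    return counts

def budget_penalty(cost, budget):
    # Assumed hinge form: zero when under budget, linear in the overshoot.
    return max(0.0, cost - budget)

# Three layers, reduction effectively active only at the middle layer,
# keeping about half of the tokens there.
counts = expected_tokens(100, gate_logits=[-20.0, 20.0, -20.0],
                         keep_logits=[0.0, 0.0, 0.0])
cost = sum(counts)  # proxy: cost grows linearly with tokens per layer
print(counts, cost, budget_penalty(cost, 300.0), budget_penalty(cost, 200.0))
```

Because every quantity is a smooth function of the logits (apart from the hinge's kink at the budget), a search procedure could in principle add such a penalty to the task loss and update the logits by gradient descent.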

📝 Abstract

Efficient multimodal foundation models often rely on manually designed token-reduction operators, such as pruning, merging, pooling, and adaptive reweighting. Although these operators appear different, we show that they can be interpreted as distinct regimes of a shared operator space. Based on this view, we introduce Efficient Operator Search, a differentiable framework that jointly searches where to reduce tokens, how many tokens to retain, and how reduced token information should be processed. The proposed search space parameterizes layer activation, retention budget, and operator behavior, while the search policy optimizes task performance under one-sided budget and cost constraints. This formulation recovers representative hand-designed baselines as special cases and further discovers hybrid operators beyond isolated manual designs. Experiments on multimodal benchmarks show that the searched operators achieve competitive accuracy-efficiency trade-offs, especially under aggressive visual-token reduction. These results suggest that efficient multimodal inference can be reframed from manual operator design to differentiable operator search.
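One way to picture pruning, merging, and pooling as regimes of a shared operator space is a soft assignment of input tokens to a smaller set of output slots. The sketch below is a toy parameterization under that assumption, not the operator space defined in the paper. The temperature `tau`, the distance-based logits, the anchor indices, and the function `reduce_tokens` are all hypothetical.

```python
import math

def softmax(logits, tau):
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    m = max(logits)
    exps = [math.exp((v - m) / tau) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reduce_tokens(tokens, anchors, tau):
    """Map n tokens to len(anchors) output tokens via soft assignment.

    tokens:  list of n feature vectors (lists of floats).
    anchors: indices of the tokens that seed each output slot.
    tau:     temperature that selects the operator regime.
    """
    dim = len(tokens[0])
    out = []
    for a in anchors:
        # Assignment logit: negative squared distance to the anchor token.
        logits = [-sum((x - y) ** 2 for x, y in zip(tokens[a], t))
                  for t in tokens]
        w = softmax(logits, tau)
        out.append([sum(w[i] * tokens[i][d] for i in range(len(tokens)))
                    for d in range(dim)])
    return out

tokens = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
anchors = [0, 2]

pruned = reduce_tokens(tokens, anchors, tau=1e-4)  # near one-hot: keeps anchors
merged = reduce_tokens(tokens, anchors, tau=0.05)  # blends nearby tokens
pooled = reduce_tokens(tokens, anchors, tau=1e4)   # near-uniform average
print(pruned, merged, pooled)
```

In this toy version a very small `tau` copies each anchor token (pruning-like), a very large `tau` averages all tokens (pooling-like), and intermediate values blend each anchor with its neighbors (merging-like). A learnable `tau` per layer would then let a search move continuously between these behaviors.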

Problem

Research questions and friction points this paper is trying to address.

multimodal foundation models

token reduction

efficient inference

operator search

differentiable architecture search

Innovation

Methods, ideas, or system contributions that make the work stand out.

differentiable architecture search

token reduction

multimodal foundation models