CompressAI-Vision: Open-source software to evaluate compression methods for computer vision tasks

📅 2025-09-25
🤖 AI Summary
Problem: The absence of a unified evaluation benchmark for video compression optimized for downstream vision tasks (e.g., detection, recognition) hinders standardization of Feature Coding for Machines (FCM). Method: This paper introduces an open-source evaluation platform that covers several dimensions at once: vision tasks (image/video understanding), the NN models and datasets associated with them, and codecs, including standard codecs under development. It analyzes bit-rate against task accuracy in two inference scenarios, "remote" and "split", so that compression efficiency and downstream task performance are evaluated together. Contribution/Results: Use cases on several datasets report the compression gain in terms of bit-rate versus task accuracy. The platform has been adopted by MPEG for the development of the FCM standard.

📝 Abstract
With the increasing use of neural network (NN)-based computer vision applications that process image and video data as input, interest has emerged in video compression technology optimized for computer vision tasks. Given the variety of vision tasks and their associated NN models and datasets, a consolidated platform is needed as a common ground to implement and evaluate compression methods optimized for downstream vision tasks. CompressAI-Vision is introduced as a comprehensive evaluation platform where new coding tools compete to efficiently compress the input of vision networks while retaining task accuracy in the context of two different inference scenarios: "remote" and "split" inferencing. Our study showcases various use cases of the evaluation platform incorporated with standard codecs (under development) by examining the compression gain on several datasets in terms of bit-rate versus task accuracy. This evaluation platform has been developed as open-source software and is adopted by the Moving Picture Experts Group (MPEG) for the development of the Feature Coding for Machines (FCM) standard. The software is publicly available at https://github.com/InterDigitalInc/CompressAI-Vision.
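
The two inference scenarios can be pictured with a toy sketch. It assumes that "remote" means the codec is applied to the input before the whole network runs, and that "split" means the first part of the network runs before the codec and the rest runs after decoding. Every name here (`toy_codec`, `front_end`, `back_end`) is a hypothetical stand-in and not part of the CompressAI-Vision API; the "codec" is plain uniform quantization with a crude bit count.

```python
from typing import List, Tuple

Vector = List[float]


def toy_codec(values: Vector, step: float) -> Tuple[Vector, int]:
    """Hypothetical codec: uniform quantization with step size `step`.

    Returns the reconstructed values and a crude bit count
    (magnitude bits of each quantization index plus one sign bit).
    """
    indices = [round(v / step) for v in values]
    bits = sum(abs(q).bit_length() + 1 for q in indices)
    recon = [q * step for q in indices]
    return recon, bits


def front_end(x: Vector) -> Vector:
    """Hypothetical first part of a vision network (affine map + ReLU)."""
    return [max(0.0, 2.0 * v - 1.0) for v in x]


def back_end(features: Vector) -> float:
    """Hypothetical second part of the network (a single summed score)."""
    return sum(features)


def remote_inference(x: Vector, step: float) -> Tuple[float, int]:
    """Assumed 'remote' flow: code the input, then run the full network."""
    decoded, bits = toy_codec(x, step)
    return back_end(front_end(decoded)), bits


def split_inference(x: Vector, step: float) -> Tuple[float, int]:
    """Assumed 'split' flow: run the front end, code the features,
    then run the back end on the decoded features."""
    features = front_end(x)
    decoded, bits = toy_codec(features, step)
    return back_end(decoded), bits
```

Both functions return a (task output, bit count) pair, which is the kind of pairing a bit-rate versus task accuracy evaluation needs for each scenario.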
Problem

Research questions and friction points this paper is trying to address.

Evaluating compression methods optimized for computer vision tasks
Providing a platform to test coding tools for vision network input compression
Assessing compression efficiency while maintaining task accuracy in the "remote" and "split" inference scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source platform for compression evaluation
Supports remote and split inference scenarios
Evaluates bit-rate versus task accuracy trade-offs
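
A bit-rate versus task accuracy comparison can be sketched as follows. This is a simplified illustration, not the metric used by the platform or by MPEG: it assumes each codec is described by a few measured (bit-rate, accuracy) points, interpolates the log bit-rate linearly over accuracy, and reports how much bit-rate a test codec saves over an anchor at one target accuracy. The function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # (bit-rate, task accuracy)


def rate_at_accuracy(curve: List[Point], target: float) -> float:
    """Interpolate the bit-rate at which a codec reaches `target` accuracy.

    The log bit-rate is interpolated linearly between the two measured
    points whose accuracies bracket the target.
    """
    pts = sorted(curve, key=lambda p: p[1])
    for (r0, a0), (r1, a1) in zip(pts, pts[1:]):
        if a0 <= target <= a1 and a1 > a0:
            t = (target - a0) / (a1 - a0)
            log_rate = math.log(r0) + t * (math.log(r1) - math.log(r0))
            return math.exp(log_rate)
    raise ValueError("target accuracy outside the measured range")


def rate_saving_percent(anchor: List[Point], test: List[Point],
                        target: float) -> float:
    """Bit-rate saving of `test` over `anchor` at equal task accuracy."""
    r_anchor = rate_at_accuracy(anchor, target)
    r_test = rate_at_accuracy(test, target)
    return 100.0 * (1.0 - r_test / r_anchor)
```

A positive result means the test codec needs fewer bits than the anchor for the same task accuracy; a target outside either measured range raises `ValueError` rather than extrapolating.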