Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation

📅 2025-03-24

📈 Citations: 1

✨ Influential: 1

🤖 AI Summary

This work addresses long-tailed distributions in temporal action segmentation of untrimmed procedural videos. It identifies a bi-level learning bias: a class-level bias that favors head classes and a transition-level bias that favors frequently observed transitions. To alleviate both, the authors pose training as a constrained optimization problem that models action classes together with their transitions. Learning states are defined for each class and each transition, and they drive a cost-sensitive loss, formulated as a weighted cross-entropy whose weights are adjusted adaptively according to those states. Experiments on three temporal segmentation benchmarks, across several segmentation frameworks, show significant improvements in both per-class frame-wise and segment-wise performance.

📝 Abstract

Temporal action segmentation in untrimmed procedural videos aims to densely label frames into action classes. These videos inherently exhibit long-tailed distributions, where actions vary widely in frequency and duration. In temporal action segmentation approaches, we identified a bi-level learning bias. This bias encompasses (1) a class-level bias, stemming from class imbalance favoring head classes, and (2) a transition-level bias arising from variations in transitions, prioritizing commonly observed transitions. As a remedy, we introduce a constrained optimization problem to alleviate both biases. We define learning states for action classes and their associated transitions and integrate them into the optimization process. We propose a novel cost-sensitive loss function formulated as a weighted cross-entropy loss, with weights adaptively adjusted based on the learning state of actions and their transitions. Experiments on three challenging temporal segmentation benchmarks and various frameworks demonstrate the effectiveness of our approach, resulting in significant improvements in both per-class frame-wise and segment-wise performance.
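
The abstract describes the loss as a weighted cross-entropy over frames. The sketch below shows only that general form, under stated assumptions: the function name `weighted_frame_cross_entropy`, the array shapes, and the normalization by the sum of weights are illustrative choices, not the paper's implementation, and the class weights are passed in as plain numbers rather than derived from learning states.

```python
import numpy as np

def weighted_frame_cross_entropy(logits, labels, class_weights):
    """Frame-wise cross-entropy where each frame is weighted by its class.

    logits:        (T, C) array of per-frame class scores (assumed layout)
    labels:        (T,) array of ground-truth class indices
    class_weights: (C,) array of non-negative per-class weights
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=int)
    class_weights = np.asarray(class_weights, dtype=float)

    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Negative log-likelihood of the true class at every frame.
    nll = -log_probs[np.arange(len(labels)), labels]

    # Weight each frame by its class weight and normalize by the total weight.
    frame_weights = class_weights[labels]
    return float((frame_weights * nll).sum() / frame_weights.sum())
```

With uniform weights this reduces to the ordinary mean cross-entropy. Raising the weight of a rare class makes its misclassified frames contribute more to the loss, which is the cost-sensitive effect the abstract refers to.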

Problem

Research questions and friction points this paper is trying to address.

Long-tailed class imbalance in action segmentation, which favors head classes

Transition-level bias, where commonly observed transitions are prioritized over rare ones

Need for a cost-sensitive loss whose weights adapt to both classes and transitions
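
Both levels of imbalance can be read directly off a frame-label sequence. The following is a minimal sketch, assuming a video is given as a list of per-frame labels; the function name `long_tail_statistics` and the label strings in the usage note are hypothetical.

```python
from collections import Counter
from itertools import groupby

def long_tail_statistics(frame_labels):
    """Count frames per class and occurrences of each class-to-class transition."""
    # Class level: how many frames carry each label.
    frame_counts = Counter(frame_labels)

    # Collapse consecutive identical labels into one segment per action instance.
    segments = [label for label, _ in groupby(frame_labels)]

    # Transition level: count each ordered pair of adjacent segments.
    transition_counts = Counter(zip(segments, segments[1:]))
    return frame_counts, transition_counts
```

For example, calling it on `["pour"] * 6 + ["stir"] * 3 + ["pour"] * 5 + ["stir"] * 4 + ["crack_egg"]` yields a head class (`pour`) with many frames, a tail class (`crack_egg`) with one, and a `("pour", "stir")` transition that occurs more often than the others.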

Innovation

Methods, ideas, or system contributions that make the work stand out.

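Cost-sensitive loss, formulated as a weighted cross-entropy, to counter class imbalance

Weights adapted to the learning state of action classes and their transitions

Constrained optimization formulation that targets both class-level and transition-level bias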
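
The summary does not say how a learning state is measured or how it maps to a weight. Purely as an illustration, the sketch below assumes the state of a class (or transition) is its current accuracy and assigns larger weights to poorly learned entries. The name `state_based_weights`, the linear rule, and the `alpha` parameter are all assumptions and should not be read as the paper's method.

```python
import numpy as np

def state_based_weights(num_correct, num_total, alpha=1.0):
    """Map per-class (or per-transition) accuracy to loss weights with mean 1.

    num_correct: count of correctly predicted items per entry
    num_total:   count of ground-truth items per entry
    alpha:       hypothetical strength of the re-weighting
    """
    num_correct = np.asarray(num_correct, dtype=float)
    num_total = np.asarray(num_total, dtype=float)

    # Assumed learning state: accuracy, with unobserved entries treated as unlearned.
    accuracy = np.divide(
        num_correct, num_total,
        out=np.zeros_like(num_correct), where=num_total > 0,
    )

    # Poorly learned entries receive larger raw weights.
    raw = 1.0 + alpha * (1.0 - accuracy)

    # Normalize so the average weight stays at 1.
    return raw / raw.mean()
```

Recomputing these weights at intervals during training, once over classes and once over transitions, gives one simple way to make a weighted cross-entropy adaptive. Other state definitions and update rules are equally plausible.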
