U-CESE: Unified Clip-based Event Search Engine for AI Challenge HCMC 2025

📅 2026-05-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenges of spatiotemporal complexity and multimodal fusion in event retrieval from large-scale video collections. The authors propose a unified clip-based multimodal event retrieval framework that integrates three key innovations: a unified video clipping algorithm; DAKE, a training-free, lightweight keyframe extraction method that leverages JPEG file size variations; and ReCap, a temporally coherent captioning model inspired by recurrent neural networks. The framework supports diverse query modalities and demonstrates robust, efficient, and semantically consistent event retrieval performance, as evidenced by its results in the AI Challenge HCMC 2025.

📝 Abstract

Retrieving events from large-scale video datasets is challenging due to complex temporal, spatial, and multimodal information. This paper presents U-CESE, our solution for the AI Challenge HCMC 2025, a Unified Clip-based Event Search Engine for multimodal event retrieval across diverse video sources. Building on CESE, U-CESE integrates its three modules into a single cohesive framework, ensuring consistent processing and retrieval across query types. A core component is the Unified Clipping Algorithm, which merges separate clipping algorithms into one efficient pipeline. To handle large-scale data, we propose DAKE, a lightweight, training-free keyframe extraction method that uses JPEG file size variations to identify significant scene changes. Finally, we introduce ReCap, a temporally consistent captioning framework inspired by recurrent neural networks (RNNs), which generates detailed and context-aware textual descriptions. Experiments show that U-CESE delivers robust, consistent, and efficient performance in large-scale multimodal event retrieval.
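
The abstract describes DAKE only at a high level, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes each frame has already been encoded as a JPEG and that its byte size is available (for example via `os.path.getsize`). The function name `select_keyframes`, the relative-change rule, and the `rel_threshold` default are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def select_keyframes(jpeg_sizes: Sequence[int], rel_threshold: float = 0.2) -> List[int]:
    """Pick frame indices whose JPEG size departs from the last keyframe's size.

    Assumption (not from the paper): a frame is treated as a scene change when
    its encoded size differs from the most recent keyframe's size by more than
    rel_threshold, expressed as a fraction of that keyframe's size.
    """
    if not jpeg_sizes:
        return []

    keyframes = [0]              # always keep the first frame
    reference = jpeg_sizes[0]    # size of the most recent keyframe

    for index in range(1, len(jpeg_sizes)):
        size = jpeg_sizes[index]
        # Relative size change; max(..., 1) guards against a zero-byte reference.
        change = abs(size - reference) / max(reference, 1)
        if change > rel_threshold:
            keyframes.append(index)
            reference = size

    return keyframes


# Toy sizes in bytes: two jumps in encoded size, at index 3 and index 6.
sizes = [1000, 1010, 990, 1500, 1510, 1490, 900]
print(select_keyframes(sizes))
```

The appeal of this kind of signal is that it needs no model or training: JPEG size depends on image content, so a sharp change in size between consecutive frames is a cheap hint that the content has changed.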

Problem

Research questions and friction points this paper is trying to address.

event retrieval

large-scale video

multimodal information

temporal complexity

spatial complexity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified Clipping Algorithm

DAKE

ReCap
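
ReCap is described as a captioning framework inspired by recurrent neural networks. One way to read that description is that each clip's caption is generated with access to state carried over from earlier clips. The sketch below illustrates only that recurrence pattern; it is not the authors' model. The `Captioner` signature, the function `recurrent_captions`, and the choice to carry the previous caption as the state are assumptions made for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: (clip identifier, carried context) -> caption text.
Captioner = Callable[[str, str], str]


def recurrent_captions(
    clips: Sequence[str],
    caption_fn: Captioner,
    initial_context: str = "",
) -> List[str]:
    """Caption clips in order, feeding each caption forward as context.

    Assumption (not from the paper): the carried state is simply the previous
    caption, playing a role loosely analogous to an RNN hidden state.
    """
    captions: List[str] = []
    context = initial_context
    for clip in clips:
        caption = caption_fn(clip, context)
        captions.append(caption)
        context = caption  # state passed on to the next step
    return captions


def toy_captioner(clip: str, context: str) -> str:
    """Stand-in for a real captioning model; only shows how context is used."""
    return f"{clip} (after: {context})" if context else clip


print(recurrent_captions(["door opens", "man enters"], toy_captioner))
```

In a real system `caption_fn` would wrap a vision-language model, and the carried state could be a summary or an embedding rather than the raw previous caption.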