CLIP-Optimized Multimodal Image Enhancement via ISP-CNN Fusion for Coal Mine IoVT under Uneven Illumination

📅 2025-02-26
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Image quality in underground coal mine Internet of Video Things (IoVT) systems degrades severely under low and non-uniform illumination, and existing enhancement methods rely on paired ground-truth images, which are scarce and impractical to collect in such environments. Method: We propose an unsupervised, real-time image enhancement method tailored for edge deployment. It uses a CLIP-driven multimodal semantic constraint for cross-modal perceptual guidance and an ISP-CNN hybrid two-stage architecture: the first stage performs global brightness correction, and the second enhances local details with adaptive illumination modeling. No paired supervision is required. Contribution/Results: The method runs in real time on resource-constrained edge devices. Compared with seven state-of-the-art methods, it improves PSNR by 2.9–4.9%, SSIM by 4.3–11.4%, and VIF by 4.9–17.8%, indicating better visual fidelity and structural preservation under challenging illumination conditions.
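The summary does not spell out how the CLIP-driven constraint is computed. One common way to turn CLIP into an unsupervised training signal is to score the enhanced image against a "well-lit" text prompt and a "poorly lit" text prompt and penalize the image when it sits closer to the latter. The sketch below illustrates that idea only; the function names, the temperature value, and the toy three-dimensional embeddings are assumptions, and a real system would obtain the embeddings from CLIP's image and text encoders. The paper's actual loss may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def prompt_guided_loss(image_emb, positive_emb, negative_emb, temperature=0.07):
    """Cross-entropy of the image against a (positive, negative) prompt pair.

    Hypothetical helper: image_emb stands in for a CLIP image embedding of the
    enhanced frame, positive_emb / negative_emb for CLIP text embeddings of a
    "well-lit" and a "poorly lit" prompt. The temperature is an assumed value.
    """
    logits = [
        cosine_similarity(image_emb, positive_emb) / temperature,
        cosine_similarity(image_emb, negative_emb) / temperature,
    ]
    # Numerically stable log-sum-exp over the two logits.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    # Negative log-probability assigned to the positive prompt.
    return log_sum - logits[0]


# Toy embeddings (assumed, not real CLIP outputs).
well_lit_prompt = [1.0, 0.0, 0.0]
dark_prompt = [0.0, 1.0, 0.0]
loss_good = prompt_guided_loss([0.9, 0.1, 0.0], well_lit_prompt, dark_prompt)
loss_bad = prompt_guided_loss([0.1, 0.9, 0.0], well_lit_prompt, dark_prompt)
```

Because the loss depends only on the enhanced image and fixed text prompts, no paired reference image enters the computation, which is the property the summary emphasizes.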

📝 Abstract
Clear monitoring images are crucial for the safe operation of coal mine Internet of Video Things (IoVT) systems. However, low illumination and uneven brightness in underground environments significantly degrade image quality, posing challenges for enhancement methods that often rely on difficult-to-obtain paired reference images. Additionally, there is a trade-off between enhancement performance and computational efficiency on edge devices within IoVT systems. To address these issues, we propose a multimodal image enhancement method tailored for coal mine IoVT, utilizing an ISP-CNN fusion architecture optimized for uneven illumination. This two-stage strategy combines global enhancement with detail optimization, effectively improving image quality, especially in poorly lit areas. A CLIP-based multimodal iterative optimization allows for unsupervised training of the enhancement algorithm. By integrating traditional image signal processing (ISP) with convolutional neural networks (CNN), our approach reduces computational complexity while maintaining high performance, making it suitable for real-time deployment on edge devices. Experimental results demonstrate that our method effectively mitigates uneven brightness and enhances key image quality metrics, improving PSNR by 2.9%-4.9%, SSIM by 4.3%-11.4%, and VIF by 4.9%-17.8% compared to seven state-of-the-art algorithms. Simulated coal mine monitoring scenarios validate our method's ability to balance performance and computational demands, facilitating real-time enhancement and supporting safer mining operations.
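For readers unfamiliar with the reported metrics, the following minimal sketch shows how PSNR is conventionally computed and how a percentage gain over a baseline can be derived. It assumes 8-bit grayscale images stored as nested lists and assumes the abstract's percentages are relative gains over each baseline's score; the abstract does not state the exact computation, and the helper names are illustrative.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized 2D images."""
    flat_ref = [p for row in reference for p in row]
    flat_enh = [p for row in enhanced for p in row]
    mse = sum((r - e) ** 2 for r, e in zip(flat_ref, flat_enh)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_gain_percent(ours, baseline):
    """Relative improvement of `ours` over `baseline`, in percent (assumed definition)."""
    return 100.0 * (ours - baseline) / baseline


# Toy 2x2 images whose pixels differ by exactly 1 everywhere (MSE = 1).
reference = [[10, 20], [30, 40]]
enhanced = [[11, 19], [31, 41]]
score = psnr(reference, enhanced)
```

Under this definition, a method scoring 21.0 dB against a baseline at 20.0 dB would be reported as a 5% PSNR improvement.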
Problem

Research questions and friction points this paper is trying to address.

Enhances coal mine IoVT images under uneven illumination
Reduces computational complexity for edge devices
Improves image quality without paired reference images
Innovation

Methods, ideas, or system contributions that make the work stand out.

ISP-CNN fusion architecture
CLIP-based multimodal optimization
Real-time edge device deployment
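The two-stage idea (global brightness correction, then local detail enhancement that adapts to illumination) can be illustrated with a toy pipeline. This is a sketch under stated assumptions, not the authors' architecture: stage one uses a plain gamma curve as a stand-in for the ISP step, and stage two replaces the CNN with a fixed unsharp-mask filter whose gain is larger in darker regions. All function names and parameters are hypothetical, and images are grayscale nested lists with values in [0, 1].

```python
import math


def global_gamma_correction(image, target_mean=0.5):
    """Stage 1 (ISP stand-in): gamma curve chosen so the mean intensity maps to target_mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    mean = min(max(mean, 1e-6), 1.0 - 1e-6)  # keep log(mean) finite and nonzero
    gamma = math.log(target_mean) / math.log(mean)
    return [[p ** gamma for p in row] for row in image]


def box_blur(image):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def local_detail_boost(image, strength=1.0):
    """Stage 2 (CNN stand-in): unsharp mask with a stronger gain where the local mean is darker."""
    blurred = box_blur(image)
    out = []
    for row, blur_row in zip(image, blurred):
        new_row = []
        for p, b in zip(row, blur_row):
            weight = strength * (1.0 - b)  # darker neighbourhood -> larger gain
            new_row.append(min(1.0, max(0.0, p + weight * (p - b))))
        out.append(new_row)
    return out


def enhance(image):
    """Run both stages in sequence."""
    return local_detail_boost(global_gamma_correction(image))


# Toy dark frame with a slightly brighter centre.
dark = [[0.05, 0.10, 0.05],
        [0.10, 0.20, 0.10],
        [0.05, 0.10, 0.05]]
result = enhance(dark)
```

Keeping the global step as a closed-form curve is what makes this kind of split cheap: only the second stage needs learned parameters in a real system, which fits the stated goal of reducing computation on edge devices.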
Shuai Wang
School of Artificial Intelligence, China University of Mining and Technology (Beijing), Beijing 100083, China
Shihao Zhang
University of California, San Diego
Applied Mathematics
Jiaqi Wu
School of Artificial Intelligence, China University of Mining and Technology (Beijing), Beijing 100083, China; Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Zijian Tian
School of Artificial Intelligence, China University of Mining and Technology (Beijing), Beijing 100083, China
Wei Chen
Ministry of Emergency Management Big Data Center, Beijing 100013, China; School of Mechanical Electronic & Information Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China
Tongzhu Jin
School of Artificial Intelligence, China University of Mining and Technology (Beijing), Beijing 100083, China
Miaomiao Xue
School of Artificial Intelligence, China University of Mining and Technology (Beijing), Beijing 100083, China
Zehua Wang
Prof. of Blockchain at UBC
blockchain systems, cybersecurity, mechanism design, communication systems
Fei Richard Yu
Department of Systems and Computer Engineering, Carleton University, Ottawa, ON K1S 5B6, Canada
Victor C. M. Leung
SMBU / Shenzhen University / The University of British Columbia
communication systems, wireless networks, mobile systems