Post-mastoidectomy Surface Multi-View Synthesis from a Single Microscopy Image

2024-08-31
AI Summary
To support augmented reality (AR) navigation in cochlear implant surgery, this paper proposes a pipeline that synthesizes multi-view videos from a single intraoperative microscope image and the patient's preoperative CT scan. The pipeline predicts the post-mastoidectomy surface from the CT, manually aligns the resulting mesh with a selected microscope frame, transfers the frame's colors to the surface texture by UV projection, and renders novel views with PyTorch3D and PyVista. Because each novel view is rendered from a known camera pose, every synthetic frame comes with a ground-truth pose label. Synthetic views from both renderers reach an average structural similarity index (SSIM) of about 0.86 against ground truth. The resulting dataset is intended for training methods that automatically estimate microscope pose for 2D-to-3D registration with the preoperative CT.
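As a rough illustration of the SSIM score quoted above, the sketch below computes a simplified, single-window SSIM over two grayscale images given as flat lists of pixel values. This is an assumption-laden simplification: the standard metric averages SSIM over local sliding windows, and the paper's exact evaluation settings are not stated here. The function name `global_ssim` and the `data_range` parameter are illustrative, not taken from the paper.

```python
from typing import Sequence


def global_ssim(x: Sequence[float], y: Sequence[float], data_range: float = 255.0) -> float:
    """Single-window SSIM over two equally sized grayscale images (flat pixel lists).

    Simplification: one global window instead of the usual local sliding windows.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants from the standard SSIM definition (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical images score 1.0 under this formula, and the score drops as luminance, contrast, or structure diverge, so an average near 0.86 indicates close but not perfect agreement between rendered and reference frames.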

Abstract
Cochlear Implant (CI) procedures involve performing an invasive mastoidectomy to insert an electrode array into the cochlea. In this paper, we introduce a novel pipeline that is capable of generating synthetic multi-view videos from a single CI microscope image. In our approach, we use a patient's pre-operative CT scan to predict the post-mastoidectomy surface using a method designed for this purpose. We manually align the surface with a selected microscope frame to obtain an accurate initial pose of the reconstructed CT mesh relative to the microscope. We then perform UV projection to transfer the colors from the frame to surface textures. Novel views of the textured surface can be used to generate a large dataset of synthetic frames with ground truth poses. We evaluated the quality of synthetic views rendered using PyTorch3D and PyVista. We found that both rendering engines lead to similarly high-quality synthetic novel-view frames compared to ground truth, with a structural similarity index for both methods averaging about 0.86. A large dataset of novel views with known poses is critical for ongoing training of a method to automatically estimate microscope pose for 2D-to-3D registration with the pre-operative CT to facilitate augmented reality surgery. This dataset will empower various downstream tasks, such as integrating Augmented Reality (AR) in the OR, tracking surgical tools, and supporting other video analysis studies.
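The color-transfer step described in the abstract can be sketched as a projective lookup: once the mesh pose relative to the microscope is known, each surface point is projected into the frame and picks up the pixel color it lands on. The code below is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a world-to-camera pose `(R, t)`, per-vertex colors rather than a full UV texture atlas, and no occlusion handling. All function and parameter names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Color = Tuple[int, int, int]


def project_point(p: Sequence[float], R: Sequence[Sequence[float]], t: Sequence[float],
                  fx: float, fy: float, cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection of world point p to pixel (u, v); None if behind the camera."""
    # Camera-frame coordinates: X_cam = R @ p + t
    xc, yc, zc = (sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
    if zc <= 0:
        return None
    return (fx * xc / zc + cx, fy * yc / zc + cy)


def sample_vertex_colors(vertices: Sequence[Sequence[float]],
                         image: Sequence[Sequence[Color]],
                         R: Sequence[Sequence[float]], t: Sequence[float],
                         fx: float, fy: float, cx: float, cy: float) -> List[Optional[Color]]:
    """Nearest-pixel color per vertex; None where the vertex projects outside the frame."""
    height, width = len(image), len(image[0])
    colors: List[Optional[Color]] = []
    for p in vertices:
        uv = project_point(p, R, t, fx, fy, cx, cy)
        if uv is None:
            colors.append(None)
            continue
        col, row = int(round(uv[0])), int(round(uv[1]))  # u indexes columns, v indexes rows
        colors.append(image[row][col] if 0 <= row < height and 0 <= col < width else None)
    return colors
```

A real pipeline would also need a visibility test so that surface points hidden from the microscope do not inherit colors from the geometry in front of them.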
Problem

Research questions and friction points this paper is trying to address.

Generates synthetic multi-view videos from a single microscope image.
Predicts the post-mastoidectomy surface using a pre-operative CT scan.
Facilitates augmented reality surgery with accurate 2D-to-3D registration.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates multi-view videos from a single microscope image
Uses a pre-operative CT scan for post-mastoidectomy surface prediction
Creates synthetic frames with ground truth poses for AR
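
One way to see why synthetic frames come with ground truth poses for free: each novel view is rendered from a pose that is constructed explicitly, so the label is known before the image exists. The sketch below derives novel world-to-camera poses by rotating the scene about a pivot point, which is equivalent to orbiting the camera around that point. It is an illustration under assumptions, not the paper's sampling scheme: rotation is restricted to the y-axis, angles are evenly spaced, and the names `orbit_pose` and `sample_poses` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Mat3 = List[List[float]]
Vec3 = List[float]


def matmul3(A: Sequence[Sequence[float]], B: Sequence[Sequence[float]]) -> Mat3:
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def matvec3(A: Sequence[Sequence[float]], v: Sequence[float]) -> Vec3:
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def rot_y(angle_rad: float) -> Mat3:
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def orbit_pose(R0: Sequence[Sequence[float]], t0: Sequence[float],
               pivot: Sequence[float], angle_rad: float) -> Tuple[Mat3, Vec3]:
    """World-to-camera pose after rotating the scene about `pivot` by `angle_rad`.

    Derivation: X_cam = R0 @ (Rd @ (X - pivot) + pivot) + t0
                      = (R0 @ Rd) @ X + R0 @ (pivot - Rd @ pivot) + t0
    """
    Rd = rot_y(angle_rad)
    R = matmul3(R0, Rd)
    rotated_pivot = matvec3(Rd, pivot)
    offset = [pivot[i] - rotated_pivot[i] for i in range(3)]
    t = [a + b for a, b in zip(matvec3(R0, offset), t0)]
    return R, t


def sample_poses(R0: Sequence[Sequence[float]], t0: Sequence[float], pivot: Sequence[float],
                 max_angle_deg: float, n_views: int) -> List[Tuple[Mat3, Vec3]]:
    """Evenly spaced orbit poses in [-max_angle_deg, +max_angle_deg]; each is its own label."""
    if n_views < 1:
        raise ValueError("n_views must be at least 1")
    if n_views == 1:
        angles = [0.0]
    else:
        angles = [-max_angle_deg + 2.0 * max_angle_deg * k / (n_views - 1) for k in range(n_views)]
    return [orbit_pose(R0, t0, pivot, math.radians(a)) for a in angles]
```

Each returned `(R, t)` pair can be handed to a renderer and stored alongside the resulting frame as its pose annotation. The pivot stays at the same camera-frame position in every view, which keeps the surface centered while the viewing direction changes.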