Post-mastoidectomy Surface Multi-View Synthesis from a Single Microscopy Image

📅 2024-08-31

📈 Citations: 1

✨ Influential: 0


🤖 AI Summary

To support augmented reality (AR) navigation in cochlear implant surgery, this paper proposes a pipeline that synthesizes multi-view videos from a single intraoperative microscope image and the patient's preoperative CT scan. The pipeline predicts the post-mastoidectomy surface from the CT, manually aligns the resulting mesh with a selected microscope frame to obtain an initial pose, and transfers the frame's colors to the surface texture by UV projection. Novel views of the textured surface are then rendered with PyTorch3D and PyVista, yielding synthetic frames with known ground-truth camera poses. Both renderers produce novel views with a structural similarity index (SSIM) averaging about 0.86 against ground truth. The resulting dataset of pose-labeled views is intended for training a method that automatically estimates microscope pose for 2D-to-3D registration with the preoperative CT.
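The SSIM figure quoted above compares a rendered view with a reference frame. As a rough illustration of what that metric measures, here is a minimal sketch of mean SSIM over sliding uniform windows on grayscale images stored as nested lists. The function names, the 7x7 window, and the 0-255 intensity range are assumptions made for this example; the source does not say which SSIM implementation or settings were used.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]  # grayscale image as rows of intensities


def window_ssim(xs: List[float], ys: List[float], data_range: float = 255.0) -> float:
    """SSIM of two equally sized pixel windows (standard two-constant form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in xs) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n
    # Stabilising constants with the commonly used K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_ssim(img_a: Image, img_b: Image, win: int = 7, data_range: float = 255.0) -> float:
    """Average window_ssim over every win x win window (stride 1)."""
    height, width = len(img_a), len(img_a[0])
    if (height, width) != (len(img_b), len(img_b[0])):
        raise ValueError("images must have the same shape")
    if height < win or width < win:
        raise ValueError("images must be at least as large as the window")
    scores = []
    for r in range(height - win + 1):
        for c in range(width - win + 1):
            xs = [img_a[r + i][c + j] for i in range(win) for j in range(win)]
            ys = [img_b[r + i][c + j] for i in range(win) for j in range(win)]
            scores.append(window_ssim(xs, ys, data_range))
    return sum(scores) / len(scores)
```

Identical images score 1.0, and the score drops as local means, contrast, or structure diverge. Library implementations typically add options such as Gaussian weighting, so values from this sketch need not match theirs exactly.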


📝 Abstract

Cochlear Implant (CI) procedures involve performing an invasive mastoidectomy to insert an electrode array into the cochlea. In this paper, we introduce a novel pipeline that is capable of generating synthetic multi-view videos from a single CI microscope image. In our approach, we use a patient's pre-operative CT scan to predict the post-mastoidectomy surface using a method designed for this purpose. We manually align the surface with a selected microscope frame to obtain an accurate initial pose of the reconstructed CT mesh relative to the microscope. We then perform UV projection to transfer the colors from the frame to surface textures. Novel views of the textured surface can be used to generate a large dataset of synthetic frames with ground-truth poses. We evaluated the quality of synthetic views rendered using PyTorch3D and PyVista. We found that both rendering engines produce similarly high-quality synthetic novel-view frames compared to ground truth, with a structural similarity index averaging about 0.86 for both methods. A large dataset of novel views with known poses is critical for ongoing training of a method to automatically estimate microscope pose for 2D-to-3D registration with the pre-operative CT to facilitate augmented reality surgery. This dataset will empower various downstream tasks, such as integrating Augmented Reality (AR) in the OR, tracking surgical tools, and supporting other video analysis studies.
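The color-transfer step can be pictured as projecting the aligned mesh into the selected frame and reading off the pixel under each surface point. The sketch below is a simplified per-vertex version under stated assumptions: an ideal pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, vertices already expressed in the camera frame, no lens distortion, and no occlusion handling. It is not the paper's UV projection, which writes colors into a texture map rather than onto vertices.

```python
from typing import List, Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Color = Tuple[int, int, int]


def project_point(p_cam: Vec3, fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Project a camera-frame point to pixel coordinates (u, v).

    Returns None for points at or behind the camera plane (z <= 0).
    """
    x, y, z = p_cam
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def sample_vertex_colors(vertices_cam: Sequence[Vec3],
                         image: Sequence[Sequence[Color]],
                         fx: float, fy: float,
                         cx: float, cy: float) -> List[Optional[Color]]:
    """Nearest-pixel color for each vertex, or None if it falls outside the frame."""
    height, width = len(image), len(image[0])
    colors: List[Optional[Color]] = []
    for vertex in vertices_cam:
        uv = project_point(vertex, fx, fy, cx, cy)
        if uv is None:
            colors.append(None)
            continue
        col, row = int(round(uv[0])), int(round(uv[1]))
        if 0 <= row < height and 0 <= col < width:
            colors.append(image[row][col])  # image is indexed [row][column]
        else:
            colors.append(None)
    return colors
```

Vertices that receive `None` are those the single frame cannot color. A practical pipeline would also need a visibility test so that surfaces hidden behind others do not inherit the wrong pixels.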

Problem

Research questions and friction points this paper is trying to address.

Generates synthetic multi-view videos from a single microscope image.

Predicts the post-mastoidectomy surface from a pre-operative CT scan.

Supplies pose-labeled training data for 2D-to-3D registration, a prerequisite for augmented reality surgery.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates multi-view videos from a single microscope image.

Uses a pre-operative CT scan to predict the post-mastoidectomy surface.

Creates synthetic frames with ground-truth poses for AR.

🔎 Similar Papers

Self-supervised Mamba-based Mastoidectomy Shape Prediction for Cochlear Implant Surgery

📅 2024-07-22

📈 Citations: 1
