🤖 AI Summary
Modeling the association between intraoperative surgical behaviors and clinical outcomes in robot-assisted radical prostatectomy (RARP) remains challenging. Method: We propose F2O, an end-to-end AI system that jointly models frame-level gesture recognition and prediction of postoperative erectile function recovery, reportedly the first of its kind. F2O combines transformer-based spatial and temporal modeling, frame-wise video classification, and interpretable feature extraction (gesture frequency, duration, and transition patterns). Contribution/Results: In the nerve-sparing step, gesture recognition achieves AUCs of 0.80 (frame-level) and 0.81 (video-level). F2O-derived features predict postoperative outcomes with 0.79 accuracy, comparable to 0.75 for features from human annotations, and effect sizes across 25 shared features correlate strongly between the two (r = 0.96, p < 1×10⁻¹⁴). Specific tissue dissection patterns, including prolonged tissue peeling and reduced energy use, are linked to erectile function recovery, supporting a data-driven approach to objective surgical quality assessment and personalized prognostication.
📝 Abstract
Fine-grained analysis of intraoperative behavior and its impact on patient outcomes remains a longstanding challenge. We present Frame-to-Outcome (F2O), an end-to-end system that translates tissue dissection videos into gesture sequences and uncovers patterns associated with postoperative outcomes. Leveraging transformer-based spatial and temporal modeling and frame-wise classification, F2O robustly detects consecutive short (~2 seconds) gestures in the nerve-sparing step of robot-assisted radical prostatectomy (AUC: 0.80 frame-level; 0.81 video-level). F2O-derived features (gesture frequency, duration, and transitions) predicted postoperative outcomes with accuracy comparable to that of features from human annotations (0.79 vs. 0.75; overlapping 95% CI). Across 25 shared features, effect size directions were concordant, with small differences (~0.07) and a strong correlation (r = 0.96, p < 1e-14). F2O also captured key patterns linked to erectile function recovery, including prolonged tissue peeling and reduced energy use. By enabling automatic, interpretable assessment, F2O establishes a foundation for data-driven surgical feedback and prospective clinical decision support.
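To make the three feature families named above concrete, here is a minimal Python sketch of how frame-wise gesture labels could be collapsed into segments and summarized as frequency, duration, and transition counts. It is an illustration under stated assumptions, not the authors' implementation: the function names, the gesture labels ("peel", "cut"), and the frame rate are all hypothetical.

```python
from collections import Counter
from itertools import groupby


def frames_to_segments(frame_labels, fps):
    """Collapse runs of identical frame labels into (gesture, duration_s) segments."""
    segments = []
    for gesture, run in groupby(frame_labels):
        n_frames = sum(1 for _ in run)
        segments.append((gesture, n_frames / fps))
    return segments


def gesture_features(segments):
    """Summarize a gesture sequence as frequency, total duration, and transitions."""
    names = [gesture for gesture, _ in segments]
    # Frequency: how many separate times each gesture occurs.
    frequency = Counter(names)
    # Duration: total seconds spent in each gesture.
    duration = {}
    for gesture, seconds in segments:
        duration[gesture] = duration.get(gesture, 0.0) + seconds
    # Transitions: counts of each ordered pair of consecutive gestures.
    transitions = Counter(zip(names, names[1:]))
    return frequency, duration, transitions


# Hypothetical frame-level predictions at an assumed 2 frames per second.
labels = ["peel"] * 4 + ["cut"] * 2 + ["peel"] * 6
segments = frames_to_segments(labels, fps=2)
frequency, duration, transitions = gesture_features(segments)
```

Features of this kind stay interpretable because each one maps directly to an observable behavior, such as how long a surgeon spends on a given gesture or how often one gesture follows another.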