AIRHILT: A Human-in-the-Loop Testbed for Multimodal Conflict Detection in Aviation

📅 2025-11-23
🤖 AI Summary
To address two challenges in aviation conflict detection, namely the difficulty of fusing speech, visual, and ADS-B data and the lack of human-in-the-loop (HITL) evaluation, this work introduces the first open-source, modular, and lightweight HITL simulation platform. Built on the Godot engine, it integrates fine-tuned Whisper ASR, YOLOv8-based visual detection, ADS-B message parsing, and GPT-OSS-20B for structured reasoning, all interconnected via standardized JSON APIs that enable plug-and-play model integration and reproducible multi-scenario testing. The platform supports representative conflict scenarios in terminal maneuvering areas and along airways, including runway incursions and en-route conflicts. Empirical evaluation yields an average time-to-first-alert of 7.7 seconds, with ASR and vision processing latencies of roughly 5.9 s and 0.4 s, respectively, indicating that the modalities can be fused at interactive timescales. Key contributions include (1) a unified benchmarking framework, (2) a reproducible scenario suite, and (3) end-to-end HITL evaluation capability, which together improve the development efficiency and trustworthiness of flight-safety assistance systems.
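The standardized JSON APIs described above can be sketched as a minimal message contract shared by all modules. The field names (`source`, `timestamp`, `payload`) and the example values below are illustrative assumptions, not taken from the AIRHILT codebase.

```python
import json

# Hypothetical example of the kind of standardized JSON message a
# plug-and-play module might emit; the schema here is an assumption
# for illustration, not AIRHILT's actual wire format.
ASR_EVENT = {
    "source": "asr",                      # producing module (asr / vision / adsb / reasoner)
    "timestamp": 12.4,                    # simulation time in seconds
    "payload": {
        "transcript": "cleared for takeoff runway two seven",
        "confidence": 0.91,
    },
}

def validate_event(event: dict) -> bool:
    """Check that a module message carries the minimal shared fields."""
    required = {"source", "timestamp", "payload"}
    return required.issubset(event) and isinstance(event["payload"], dict)

encoded = json.dumps(ASR_EVENT)           # what would travel over the API
decoded = json.loads(encoded)
assert validate_event(decoded)
```

A shared envelope like this is what makes model swapping cheap: a downstream reasoner only needs `source`, `timestamp`, and a dict `payload`, regardless of which ASR or vision model produced the message.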

📝 Abstract
We introduce AIRHILT (Aviation Integrated Reasoning, Human-in-the-Loop Testbed), a modular and lightweight simulation environment designed to evaluate multimodal pilot and air traffic control (ATC) assistance systems for aviation conflict detection. Built on the open-source Godot engine, AIRHILT synchronizes pilot and ATC radio communications, visual scene understanding from camera streams, and ADS-B surveillance data within a unified, scalable platform. The environment supports pilot- and controller-in-the-loop interactions, providing a comprehensive scenario suite covering both terminal area and en route operational conflicts, including communication errors and procedural mistakes. AIRHILT offers standardized JSON-based interfaces that enable researchers to easily integrate, swap, and evaluate automatic speech recognition (ASR), visual detection, decision-making, and text-to-speech (TTS) models. We demonstrate AIRHILT through a reference pipeline incorporating fine-tuned Whisper ASR, YOLO-based visual detection, ADS-B-based conflict logic, and GPT-OSS-20B structured reasoning, and present preliminary results from representative runway-overlap scenarios, where the assistant achieves an average time-to-first-warning of approximately 7.7 s, with average ASR and vision latencies of approximately 5.9 s and 0.4 s, respectively. The AIRHILT environment and scenario suite are openly available, supporting reproducible research on multimodal situational awareness and conflict detection in aviation; code and scenarios are available at https://github.com/ogarib3/airhilt.
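The abstract's "ADS-B-based conflict logic" is not specified here, but a common baseline for such logic is closest-point-of-approach (CPA) prediction over a short look-ahead horizon. The sketch below is an assumption for illustration only: it uses flat-earth positions in metres and a roughly 5 NM horizontal separation threshold, not AIRHILT's actual implementation.

```python
import math

def time_to_cpa(p1, v1, p2, v2):
    """Time (s) of the closest point of approach for two aircraft,
    given planar positions (m) and velocities (m/s); 0 if diverging."""
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]   # relative position
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]   # relative velocity
    vv = vx * vx + vy * vy
    if vv == 0.0:
        return 0.0                           # no relative motion
    return max(0.0, -(rx * vx + ry * vy) / vv)

def conflict(p1, v1, p2, v2, sep_m=9260.0, horizon_s=120.0):
    """Flag a conflict if the predicted miss distance falls below a
    separation threshold (default ~5 NM) within the look-ahead horizon."""
    t = min(time_to_cpa(p1, v1, p2, v2), horizon_s)
    dx = (p2[0] + v2[0] * t) - (p1[0] + v1[0] * t)
    dy = (p2[1] + v2[1] * t) - (p1[1] + v1[1] * t)
    return math.hypot(dx, dy) < sep_m

# Head-on aircraft 20 km apart, each closing at 200 m/s -> conflict.
assert conflict((0, 0), (200, 0), (20000, 0), (-200, 0))
```

In a pipeline like the one described, each new ADS-B message would update the state vectors and re-run a check of this shape against all nearby traffic.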
Problem

Research questions and friction points this paper is trying to address.

Evaluating multimodal pilot and ATC assistance systems for aviation conflict detection
Synchronizing radio communications, visual scene understanding, and ADS-B surveillance data
Supporting pilot- and controller-in-the-loop interactions for operational conflict scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular simulation environment for aviation conflict detection
Integrates multimodal data using open-source Godot engine
Standardized JSON interfaces for flexible model integration
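The plug-and-play integration highlighted in the bullets above can be illustrated with a shared module interface: any model that emits a JSON-compatible dict can be dropped into the pipeline. The class and method names here are hypothetical, not AIRHILT's actual API.

```python
from abc import ABC, abstractmethod

class PerceptionModule(ABC):
    """Illustrative common interface for interchangeable pipeline modules."""

    @abstractmethod
    def process(self, raw_input) -> dict:
        """Return a message dict with 'source', 'timestamp', and 'payload'."""

class DummyASR(PerceptionModule):
    def process(self, raw_input) -> dict:
        # Stand-in for a fine-tuned Whisper model.
        return {"source": "asr", "timestamp": 0.0,
                "payload": {"transcript": str(raw_input)}}

class DummyVision(PerceptionModule):
    def process(self, raw_input) -> dict:
        # Stand-in for a YOLO detector.
        return {"source": "vision", "timestamp": 0.0,
                "payload": {"detections": list(raw_input)}}

def run_pipeline(modules, inputs):
    """Fan inputs through interchangeable modules; swapping a model
    means replacing one entry in `modules`, nothing downstream changes."""
    return [m.process(x) for m, x in zip(modules, inputs)]
```

Because every module returns the same envelope, benchmarking a new ASR or detector reduces to substituting one object in the module list.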
Omar Garib
Daniel Guggenheim School of Aerospace Engineering, Georgia Institute of Technology
Jayaprakash D. Kambhampaty
Daniel Guggenheim School of Aerospace Engineering, Georgia Institute of Technology
Olivia J. Pinon Fischer
Daniel Guggenheim School of Aerospace Engineering, Georgia Institute of Technology
Dimitri N. Mavris
Georgia Institute of Technology
Engineering · Technology · Aerospace