🤖 AI Summary
This review paper addresses the limited generalizability and physical inconsistency of existing deep learning models for room acoustics, which stem from their neglect of the inherent space-time structure of acoustic wave propagation. It surveys geometry- and wave-aware deep acoustic models, covering tasks such as sound field reconstruction, and places them in a unified framework for comparing physics-based and data-driven approaches. The resulting taxonomy identifies three core bottlenecks: limited generalizability, low interpretability, and weak physical consistency. Methodologically, the review highlights models that incorporate wave equation constraints and physics-informed neural network (PINN) principles, alongside mechanisms adapted from the speech and image domains, and notes that this integration has shown promise for improving physical consistency and robustness. Finally, it distills six key open challenges, outlining theoretical foundations and technical pathways toward interpretable, generalizable, and physics-driven acoustic modeling.
📝 Abstract
Our everyday auditory experience is shaped by the acoustics of the indoor environments in which we live. Room acoustics modeling aims to establish mathematical representations of acoustic wave propagation in such environments. These representations are relevant to a variety of problems, ranging from echo-aided auditory indoor navigation to restoring speech understanding in cocktail party scenarios. Many disciplines in science and engineering have recently witnessed a paradigm shift powered by deep learning (DL), and room acoustics research is no exception. The majority of deep, data-driven room acoustics models are inspired by DL-based speech and image processing, and hence lack the intrinsic space-time structure of acoustic wave propagation. More recently, DL-based models for room acoustics that include either geometric or wave-based information have delivered promising results, primarily for the problem of sound field reconstruction. In this review paper, we provide an extensive and structured literature review on deep, data-driven modeling in room acoustics. Moreover, we position these models in a framework that allows for a conceptual comparison with traditional physical and data-driven models. Finally, we identify strengths and shortcomings of deep, data-driven room acoustics models and outline the main challenges for further research.