Hybrid machine learning data assimilation for marine biogeochemistry

📅 2025-04-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Traditional data assimilation methods struggle to update unobserved variables, such as specific nutrients and phytoplankton functional groups, in marine biogeochemical models because of strong nonlinearities and sparse, uncertain observations, while ensemble-based approaches are computationally too expensive for high-complexity models. Method: This study proposes a hybrid assimilation framework that integrates machine learning (ML)-driven balancing schemes into the assimilation system. ML is used to predict (i) state-dependent correlations learned from free-running ensembles and (ii) analysis increments from an ensemble Kalman filter (EnKF) in an end-to-end fashion, combining deep neural networks with the EnKF. Contribution/Results: Tested in a 1D prototype of a forecasting system for the North-West European Shelf seas, the framework significantly improves updates of variables that univariate schemes, akin to those used operationally, leave unchanged, and it offers a pathway around the computational cost of conventional ensemble methods. The ML models show moderate transferability to new locations, a step toward operational 3D marine biogeochemical forecasting.

📝 Abstract
Marine biogeochemistry models are critical for forecasting and for estimating ecosystem responses to climate change and human activities. Data assimilation (DA) improves these models by aligning them with real-world observations, but marine biogeochemistry DA faces challenges due to model complexity, strong nonlinearity, and sparse, uncertain observations. Existing DA methods applied to marine biogeochemistry struggle to update unobserved variables effectively, while ensemble-based methods are computationally too expensive for high-complexity marine biogeochemistry models. This study demonstrates how machine learning (ML) can improve marine biogeochemistry DA by learning statistical relationships between observed and unobserved variables. We integrate ML-driven balancing schemes into a 1D prototype of a system used to forecast marine biogeochemistry in the North-West European Shelf seas. ML is applied to predict (i) state-dependent correlations from free-run ensembles and (ii) analysis increments from an Ensemble Kalman Filter in an "end-to-end" fashion. Our results show that ML significantly enhances updates for variables that were previously not updated, when compared to univariate schemes akin to those used operationally. Furthermore, ML models exhibit moderate transferability to new locations, a crucial step toward scaling these methods to 3D operational systems. We conclude that ML offers a clear pathway to overcome current computational bottlenecks in marine biogeochemistry DA, and that refining transferability, optimizing training data sampling, and evaluating scalability for large-scale marine forecasting should be future research priorities.
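To make the "end-to-end" target concrete, the following is a minimal sketch of how analysis increments arise in a stochastic (perturbed-observation) Ensemble Kalman Filter when only one variable is observed. It is an illustration under stated assumptions, not the authors' system: the function name `enkf_increments`, the two-variable state, and all numerical values are hypothetical.

```python
import numpy as np

def enkf_increments(ens, y_obs, obs_err_var, rng):
    """Stochastic EnKF analysis increments for a scalar observation.

    ens has shape (n_members, n_vars); by assumption, variable 0 is the
    observed one and the remaining variables are unobserved.
    """
    n = ens.shape[0]
    anom = ens - ens.mean(axis=0)
    # Ensemble covariance between each state variable and the observed one
    cov_xo = anom.T @ anom[:, 0] / (n - 1)           # shape (n_vars,)
    # Kalman gain for a single direct observation of variable 0
    gain = cov_xo / (cov_xo[0] + obs_err_var)
    # Perturbed observations, one per ensemble member
    y_pert = y_obs + rng.normal(0.0, np.sqrt(obs_err_var), size=n)
    innov = y_pert - ens[:, 0]
    # Increment for every member and every variable
    return np.outer(innov, gain)                     # shape (n_members, n_vars)

# Hypothetical toy ensemble: the unobserved variable is anti-correlated
# with the observed one.
rng = np.random.default_rng(1)
observed = rng.normal(1.0, 0.2, size=100)
unobserved = 5.0 - 2.0 * observed + rng.normal(0.0, 0.05, size=100)
ens = np.column_stack([observed, unobserved])

increments = enkf_increments(ens, y_obs=1.3, obs_err_var=0.01, rng=rng)
mean_inc = increments.mean(axis=0)
```

In this toy setup the observation lies above the ensemble mean, so the observed variable is nudged upward and the anti-correlated unobserved variable is nudged downward. An end-to-end ML model in the sense of the abstract would be trained to predict increments of this kind, so that the ensemble would not need to be run at assimilation time.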
Problem

Research questions and friction points this paper is trying to address.

Improving marine biogeochemistry models with machine learning
Overcoming computational bottlenecks in data assimilation
Enhancing updates for unobserved variables in models
Innovation

Methods, ideas, or system contributions that make the work stand out.

ML learns observed-unobserved variable relationships
ML predicts state-dependent correlations from ensembles
ML enhances updates for unobserved variables
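As a rough illustration of what a correlation-based balancing step does, the sketch below estimates the correlation between an observed and an unobserved variable from an ensemble and uses it to turn an increment in the observed variable into an increment for the unobserved one. This is a generic linear-regression form, not the paper's scheme; the function name `balance_increment`, the variables `chl` and `nit`, and all numbers are hypothetical. In the ML setting described above, the correlation would be predicted from the model state rather than computed from an ensemble.

```python
import numpy as np

def balance_increment(obs_ens, unobs_ens, obs_increment):
    """Map an increment in an observed variable onto an unobserved one.

    Uses delta_u = corr * (sigma_u / sigma_o) * delta_o, with the
    correlation and standard deviations estimated from the ensemble.
    """
    n = obs_ens.size
    obs_anom = obs_ens - obs_ens.mean()
    unobs_anom = unobs_ens - unobs_ens.mean()
    std_obs = np.sqrt(obs_anom @ obs_anom / (n - 1))
    std_unobs = np.sqrt(unobs_anom @ unobs_anom / (n - 1))
    cov = obs_anom @ unobs_anom / (n - 1)
    corr = cov / (std_obs * std_unobs)
    return corr, corr * (std_unobs / std_obs) * obs_increment

# Hypothetical free-run ensemble: chlorophyll-like observed variable and a
# nutrient-like unobserved variable that decreases as chlorophyll increases.
rng = np.random.default_rng(0)
chl = rng.normal(1.0, 0.2, size=50)
nit = 5.0 - 2.0 * chl + rng.normal(0.0, 0.05, size=50)

corr, d_nit = balance_increment(chl, nit, obs_increment=0.1)
```

With this synthetic ensemble the estimated correlation is strongly negative, so a positive increment in the observed variable produces a negative increment in the unobserved one. A univariate scheme would leave the unobserved variable unchanged.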
Ieuan Higgs
Department of Meteorology, University of Reading, UK; National Centre for Earth Observation, UK
Ross Bannister
Department of Meteorology, University of Reading, UK; National Centre for Earth Observation, UK
Jozef Skákala
National Centre for Earth Observation, UK; Plymouth Marine Laboratory, UK
Alberto Carrassi
University of Bologna (IT) and University of Reading (UK)
Data assimilation, Dynamical Systems, Machine Learning
Stefano Ciavatta
Mercator Ocean International, FR