🤖 AI Summary
To address the insufficient robustness of investment decisions in financial markets—characterized by high dimensionality, non-stationarity, and low signal-to-noise ratios—this paper proposes a reinforcement learning (RL) portfolio optimization framework integrating dynamic state embedding and online meta-learning. The framework employs a generative autoencoder to compress the temporal state space and disentangles time-series features to enhance representation stability. Crucially, it pioneers the coupling of online meta-learning with RL policy updates, enabling automatic risk timing and dynamic position adjustment under market volatility. Empirical evaluation on S&P 500 constituents shows that the model achieves significantly higher Sharpe ratios compared to equal-weighted (EW), mean-variance (MV), and baseline RL portfolios, with gains up to 37% during stress periods. Ablation studies confirm both cross-algorithm robustness and the necessity of each core module.
📝 Abstract
We develop a portfolio allocation framework that leverages deep learning techniques to address challenges arising from high-dimensional, non-stationary, and low-signal-to-noise market information. Our approach includes a dynamic embedding method that reduces the non-stationary, high-dimensional state space into a lower-dimensional representation. We design a reinforcement learning (RL) framework that integrates generative autoencoders and online meta-learning to dynamically embed market information, enabling the RL agent to focus on the most impactful parts of the state space for portfolio allocation decisions. Empirical analysis based on the top 500 U.S. stocks demonstrates that our framework outperforms common portfolio benchmarks and the predict-then-optimize (PTO) approach using machine learning, particularly during periods of market stress. Traditional factor models do not fully explain this superior performance. The framework's ability to time volatility reduces its market exposure during turbulent times. Ablation studies confirm the robustness of this performance across various reinforcement learning algorithms. Additionally, the embedding and meta-learning techniques effectively manage the complexities of high-dimensional, noisy, and non-stationary financial data, enhancing both portfolio performance and risk management.