🤖 AI Summary
Automated stock trading under highly uncertain market and economic conditions remains challenging.
Method: This paper proposes an adaptive intelligent trading agent based on deep reinforcement learning (DRL). It introduces a customized stock simulation environment and conducts the first comparative evaluation—on real-world, cross-period data (pre- and post-2021)—of three DRL paradigms: Deep Q-Networks (DQN), Deep SARSA, and policy gradient methods, assessing their robustness and generalization capability. Financial time-series modeling is integrated to enhance state representation.
Contribution/Results: The agent achieves 70%–90% annualized returns pre-2021 and sustains positive returns of 2%–7% during the extreme volatility of 2021–2022—significantly outperforming benchmark strategies. Its core contribution lies in empirically validating a multi-algorithm collaborative evaluation framework for dynamic financial environments, establishing a reproducible methodology and empirical foundation for robust, RL-driven quantitative trading.
📝 Abstract
Stock trading is a popular approach to financial management. However, the market and the broader economic environment are unstable and usually unpredictable. Furthermore, stock trading requires time and effort to analyze data, devise strategies, and make decisions. It would be convenient and effective if an agent could assist with, or even take over, the task of analyzing and modeling past data and then generating a strategy for autonomous trading. Recently, reinforcement learning has been shown to be robust in various tasks that involve achieving a goal with a decision-making strategy based on time-series data. In this project, we developed a pipeline that simulates the stock trading environment and trained an agent to automate the stock trading process with deep reinforcement learning methods, including deep Q-learning, deep SARSA, and the policy gradient method. We evaluate our platform during relatively good (before 2021) and bad (2021-2022) market conditions. The stocks we evaluated include Google, Apple, Tesla, Meta, Microsoft, and IBM. These stocks are among the popular ones, and their trend changes are representative of both good and bad conditions. We show that before 2021, the three reinforcement learning methods we tried always provide promising profit returns, with total annual rates of around 70% to 90%, while maintaining positive profit returns after 2021, with total annual rates of around 2% to 7%.
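
To make the distinction between two of the methods named above concrete, here is a minimal Python sketch. It is not the authors' pipeline: the `ToyTradingEnv` class, the three-action encoding (`HOLD`, `BUY`, `SELL`), the single-share position, and the realized-profit reward are all assumptions made for illustration, since the abstract does not specify the state, action, or reward design. The two target functions show the standard difference between Q-learning (bootstraps from the greedy next action) and SARSA (bootstraps from the action actually taken next); the policy gradient method is omitted.

```python
from dataclasses import dataclass

# Hypothetical action encoding; the abstract does not give the real action space.
HOLD, BUY, SELL = 0, 1, 2


@dataclass
class ToyTradingEnv:
    """Single-stock, single-share toy environment over a fixed price series."""
    prices: list
    t: int = 0
    holding: bool = False
    entry_price: float = 0.0

    def step(self, action):
        """Apply an action at time t and return (next_t, reward, done)."""
        price = self.prices[self.t]
        reward = 0.0
        if action == BUY and not self.holding:
            self.holding, self.entry_price = True, price
        elif action == SELL and self.holding:
            # Assumed reward: realized profit when the position is closed.
            reward = price - self.entry_price
            self.holding = False
        self.t += 1
        done = self.t >= len(self.prices) - 1
        return self.t, reward, done


def q_learning_target(reward, next_q_values, gamma, done):
    """Off-policy target: bootstrap from the greedy (max) next action value."""
    return reward if done else reward + gamma * max(next_q_values)


def sarsa_target(reward, next_q_values, next_action, gamma, done):
    """On-policy target: bootstrap from the action actually taken next."""
    return reward if done else reward + gamma * next_q_values[next_action]
```

In the deep variants, `next_q_values` would come from a neural network evaluated on the next state rather than from a fixed list; the target computation itself has the same form.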