🤖 AI Summary
Large language models (LLMs) suffer from context mismatch and semantic drift when generating financial trading signals. To address this, we propose Reinforcement Learning from Market Feedback (RLMF), a novel prompt fine-tuning framework that integrates temporal market features and real-world trading rewards directly into LLM instruction tuning. RLMF combines domain-specific financial prompt engineering with joint encoding of historical market conditions and short-horizon price dynamics, enabling end-to-end optimization of LLaMA-3.2-3B-Instruct. As the first RL-oriented prompt paradigm tailored to financial decision-making, RLMF achieves state-of-the-art performance on FinRL Contest 2024 Task II: it improves trading-signal consistency by 37% and reduces return volatility by 29%, significantly outperforming conventional sentiment-analysis and baseline LLM approaches.
📝 Abstract
In response to Task II of the FinRL Challenge at ACM ICAIF 2024, this study proposes a novel prompt framework for fine-tuning large language models (LLMs) with Reinforcement Learning from Market Feedback (RLMF). Our framework incorporates market-specific features and short-term price dynamics to generate more precise trading signals. Traditional LLMs, while competent at sentiment analysis, lack contextual alignment for financial market applications. To bridge this gap, we fine-tune the LLaMA-3.2-3B-Instruct model using a custom RLMF prompt design that integrates historical market data and reward-based feedback. Our evaluation shows that the RLMF-tuned framework outperforms baseline methods in signal consistency and achieves tighter trading outcomes; it was awarded first place in Task II. You can find the code for this project on GitHub.
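To make the two core ideas concrete, the sketch below shows one plausible way to (a) embed recent market features alongside news text in the prompt and (b) score a generated signal against the realized short-horizon return as market feedback. All names, the prompt wording, and the reward form are illustrative assumptions for exposition, not the authors' implementation:

```python
# Illustrative sketch of an RLMF-style prompt and market-feedback reward.
# This is NOT the authors' code; names and the reward form are assumptions.

def build_prompt(ticker: str, recent_returns: list[float], news: str) -> str:
    """Encode recent market conditions next to the news item, so the model
    sees price context rather than sentiment text alone."""
    returns_str = ", ".join(f"{r:+.2%}" for r in recent_returns)
    return (
        f"Recent daily returns for {ticker}: {returns_str}.\n"
        f"News: {news}\n"
        "Respond with exactly one trading signal: BUY, SELL, or HOLD."
    )

def market_reward(signal: str, next_return: float) -> float:
    """Reward the signal by the realized short-horizon price move:
    a BUY earns the next return, a SELL earns its negative, HOLD earns 0."""
    direction = {"BUY": 1.0, "SELL": -1.0, "HOLD": 0.0}[signal]
    return direction * next_return

# Example: a BUY followed by a +1.5% move yields a positive reward,
# which an RL fine-tuning loop would use to reinforce that signal.
prompt = build_prompt("AAPL", [0.012, -0.004, 0.007],
                      "Strong quarterly earnings beat.")
reward = market_reward("BUY", 0.015)
```

In a full pipeline, rewards of this kind would feed a policy-gradient update (e.g. PPO-style RLHF tooling) over the instruction-tuned model; the snippet only illustrates the prompt/reward interface.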