🤖 AI Summary
This study addresses the challenge of fine-grained sentiment analysis over heterogeneous financial texts—characterized by multi-source origins, diverse formats, and multilingual content—under computationally constrained settings. We propose an efficient fine-tuning and zero-/few-shot learning framework that leverages only 5% of the labeled data. Evaluated across five public financial benchmarks (including FinancialPhraseBank), models such as FinBERT, DeepSeek-7B, Llama3-8B-Instruct, and Qwen3-8B demonstrate that lightweight open-weight LLMs—particularly Qwen3-8B and Llama3-8B—significantly outperform conventional financial NLP models under low-resource conditions, matching state-of-the-art performance. Our key contribution is the first empirical validation that lightweight open-source LLMs achieve high generalizability and robustness in financial sentiment analysis without domain-specific pretraining data or high-end computational resources. This establishes a new paradigm for cost-effective, reproducible, and accessible financial NLP.
📝 Abstract
Large language models (LLMs) play an increasingly important role in financial market analysis by capturing signals from complex and heterogeneous textual data sources, such as tweets, news articles, reports, and microblogs. However, their performance depends on large computational resources and proprietary datasets, which are costly, restricted, and therefore inaccessible to many researchers and practitioners. To reflect realistic situations, we investigate the ability of lightweight open-source LLMs (smaller, publicly available models designed to operate with limited computational resources) to generalize sentiment understanding across financial datasets of varying sizes, sources, formats, and languages. We compare the benchmark finance natural language processing (NLP) model, FinBERT, and three open-source lightweight LLMs, DeepSeek-LLM 7B, Llama3 8B Instruct, and Qwen3 8B, on five publicly available datasets: FinancialPhraseBank, Financial Question Answering, Gold News Sentiment, Twitter Sentiment, and Chinese Finance Sentiment. We find that the LLMs, especially Qwen3 8B and Llama3 8B, perform best in most scenarios, even when using only 5% of the available training data. These results hold in zero-shot and few-shot learning scenarios. Our findings indicate that lightweight, open-source LLMs constitute a cost-effective option, as they can achieve competitive performance on heterogeneous textual data even when trained on only a limited subset of the extensive annotated corpora typically deemed necessary.