A Unified BERT-CNN-BiLSTM Framework for Simultaneous Headline Classification and Sentiment Analysis of Bangla News

📅 2025-11-23
🤖 AI Summary
This study addresses two low-resource NLP tasks, Bangla news headline classification and sentiment analysis, and models them jointly on the BAN-ABSA dataset. Our architecture integrates BERT for deep semantic encoding, a CNN for local n-gram feature extraction, and a BiLSTM for capturing long-range contextual dependencies. To mitigate class imbalance, we combine SMOTE-based oversampling with Tomek Links-based undersampling. The best reported scores are 81.37% on headline classification and 73.43% on sentiment analysis; the two figures come from different resampling setups, and the model outperforms all baseline models. To our knowledge, this is the first work to classify the category and the sentiment of Bangla news headlines simultaneously. It provides a reproducible baseline for fine-grained text analysis in low-resource languages.

📝 Abstract
Newspapers are an essential source of information in daily life and shape how the public discusses current issues. However, navigating the vast amount of content from different newspapers and online news portals can be challenging. Classifying headlines together with sentiment analysis tells us what the news is about (e.g., politics, sports) and how it makes us feel (positive, negative, neutral), which helps readers quickly grasp the emotional tone of the news. This research presents a state-of-the-art approach to Bangla news headline classification combined with sentiment analysis, applying Natural Language Processing (NLP) techniques, in particular the hybrid transfer learning model BERT-CNN-BiLSTM. We explored BAN-ABSA, a dataset of 9014 news headlines; this is the first time it has been used for simultaneous headline and sentiment categorization of Bengali newspapers. On this imbalanced dataset, we applied two experimental strategies: technique-1, where undersampling and oversampling are applied before splitting, and technique-2, where undersampling and oversampling are applied after splitting. In technique-1, oversampling gave the strongest performance: 78.57% for headline classification and 73.43% for sentiment analysis. In technique-2, training directly on the original imbalanced dataset gave the highest results: 81.37% for headline classification and 64.46% for sentiment analysis. The proposed BERT-CNN-BiLSTM model significantly outperforms all baseline models on both classification tasks and achieves new state-of-the-art results for Bangla news headline classification and sentiment analysis. These results demonstrate the importance of leveraging both the headline and sentiment datasets, and provide a strong baseline for Bangla text classification in low-resource settings.
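The two strategies in the abstract differ only in whether resampling happens before or after the train/test split. The sketch below illustrates that ordering and is not the authors' pipeline: it substitutes plain random oversampling for the resampling methods used in the paper, uses invented toy headlines, and assumes that resampling after the split touches the training portion only. All function names are hypothetical.

```python
import random
from collections import Counter

def oversample(rows, rng):
    """Randomly duplicate minority-class rows until every class matches the largest one."""
    by_label = {}
    for text, label in rows:
        by_label.setdefault(label, []).append((text, label))
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced

def split(rows, test_fraction, rng):
    """Shuffle a copy of rows and cut it into (train, test)."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def leaked(train, test):
    """Count test rows whose text also appears in the training split."""
    train_texts = {text for text, _ in train}
    return sum(1 for text, _ in test if text in train_texts)

rng = random.Random(0)
# Toy imbalanced data (invented): 90 "neutral" and 10 "negative" headlines.
data = [(f"headline-{i}", "neutral") for i in range(90)]
data += [(f"headline-{i}", "negative") for i in range(90, 100)]

# Technique-1 ordering: resample first, then split.
train_1, test_1 = split(oversample(data, rng), 0.2, rng)

# Technique-2 ordering (assumed reading): split first, resample the training part only.
train_raw, test_2 = split(data, 0.2, rng)
train_2 = oversample(train_raw, rng)

print("technique-1 test rows also seen in training:", leaked(train_1, test_1))
print("technique-2 test rows also seen in training:", leaked(train_2, test_2))
print("technique-2 test label distribution:", Counter(label for _, label in test_2))
```

Because every toy headline is unique, splitting first can never put a copy of a test headline into the training set, while resampling first usually does. This is a general caveat for comparing scores across the two orderings, not a finding reported in the source.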
Problem

Research questions and friction points this paper is trying to address.

Classifying Bangla news headlines into categories like politics or sports
Analyzing sentiment of Bangla news as positive, negative or neutral
Addressing data imbalance challenges in low-resource Bengali text classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

BERT-CNN-BiLSTM hybrid model for Bangla NLP
Applied undersampling and oversampling strategies
Simultaneously classifies headlines and sentiment
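One possible reading of the hybrid model is a shared CNN and BiLSTM stack on top of BERT token embeddings, with one output layer per task. The PyTorch sketch below follows that reading under stated assumptions: the layer sizes, the class name `CnnBiLstmTwoHeads`, and the count of four headline categories are placeholders not taken from the source, and a random tensor stands in for real BERT output so the example runs without downloading a model.

```python
import torch
import torch.nn as nn

class CnnBiLstmTwoHeads(nn.Module):
    """CNN + BiLSTM over encoder token embeddings, with one classifier per task."""

    def __init__(self, hidden=768, channels=128, lstm_units=64,
                 n_topics=4, n_sentiments=3):  # sizes are illustrative assumptions
        super().__init__()
        # A width-3 convolution acts as a trigram feature detector over tokens.
        self.conv = nn.Conv1d(hidden, channels, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(channels, lstm_units, batch_first=True,
                            bidirectional=True)
        self.topic_head = nn.Linear(2 * lstm_units, n_topics)
        self.sentiment_head = nn.Linear(2 * lstm_units, n_sentiments)

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, hidden), e.g. BERT's last hidden state.
        x = token_embeddings.transpose(1, 2)          # (batch, hidden, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)  # (batch, seq_len, channels)
        _, (h_n, _) = self.lstm(x)                    # h_n: (2, batch, lstm_units)
        pooled = torch.cat([h_n[0], h_n[1]], dim=1)   # final forward + backward states
        return self.topic_head(pooled), self.sentiment_head(pooled)

model = CnnBiLstmTwoHeads()
fake_bert_output = torch.randn(2, 16, 768)  # stand-in for real encoder output
topic_logits, sentiment_logits = model(fake_bert_output)
print(topic_logits.shape, sentiment_logits.shape)
```

In a full pipeline the random tensor would be replaced by the hidden states of a pretrained Bangla-capable BERT encoder, and the two sets of logits would each feed a cross-entropy loss. How the paper combines the two losses is not stated here, so that step is left out.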
Mirza Raquib
Department of CCE, International Islamic University Chittagong, Chattogram, Bangladesh; Department of ICE, Noakhali Science and Technology University, Noakhali, Bangladesh
Munazer Montasir Akash
Department of CSE, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh; Dept. of CSE, IIUC, Chattogram, Bangladesh
Tawhid Ahmed
Department of CSE, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
Saydul Akbar Murad
PhD Student, University of Southern Mississippi
Machine Learning, BCI, Neuroscience, EEG, P2P Communication
Farida Siddiqi Prity
Lecturer, Department of CSE, Netrokona University
Artificial Intelligence, Image processing, Deep Learning, Machine Learning, Cloud Computing
Mohammad Amzad Hossain
Department of ICE, NSTU, Noakhali, Bangladesh
Asif Pervez Polok
mPower Social Enterprise
Nick Rahimi
Associate Professor, University of Southern Mississippi
Cybersecurity, Trustworthy AI, Distributed Systems, P2P Network