Clickbait detection: quick inference with maximum impact

📅 2026-04-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the trade-off between inference efficiency and accuracy in clickbait headline detection. It proposes a lightweight hybrid architecture that combines OpenAI semantic embeddings with six handcrafted heuristic features. The embeddings are reduced with PCA, and the resulting representation is fed into XGBoost, GraphSAGE, and GCN classifiers. The approach substantially reduces inference time at the cost of a slightly lower F1 score, while high ROC-AUC values indicate that it discriminates well across classification thresholds. The key contribution is the combination of semantic embeddings, compact feature engineering, and efficient graph-based models for accurate, computationally inexpensive clickbait identification.
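To make the idea of a hybrid representation concrete, here is a minimal sketch of how six compact heuristic features could be computed from a headline and appended to an already-reduced embedding vector. The summary does not say which six features the paper uses, so every feature below, the `CUE_WORDS` vocabulary, and the function names `heuristic_features` and `combine` are illustrative assumptions, not the authors' design. The embedding itself is passed in as a plain list of floats; no embedding API call or PCA step is shown.

```python
import re

# Hypothetical cue-word vocabulary; the paper does not list one.
CUE_WORDS = {"you", "this", "these", "why", "how", "what", "secret", "shocking"}


def heuristic_features(headline):
    """Return six illustrative stylistic features for one headline."""
    words = re.findall(r"[A-Za-z0-9']+", headline)
    n_words = len(words)
    denom = max(n_words, 1)  # avoids division by zero on empty headlines
    lowered = [w.lower() for w in words]
    return [
        float(n_words),                                      # length in words
        float(bool(words) and words[0].isdigit()),           # starts with a number
        float(headline.count("!") + headline.count("?")),    # emphatic punctuation
        sum(w in CUE_WORDS for w in lowered) / denom,        # cue-word ratio
        sum(w.isupper() and len(w) > 1 for w in words) / denom,  # all-caps ratio
        sum(len(w) for w in words) / denom,                  # mean word length
    ]


def combine(reduced_embedding, headline):
    """Concatenate a reduced embedding with the heuristic features."""
    return list(reduced_embedding) + heuristic_features(headline)


if __name__ == "__main__":
    # The two-element "embedding" is a made-up placeholder vector.
    print(combine([0.12, -0.40], "10 SHOCKING Secrets You Won't Believe!"))
```

The concatenated vector is what a downstream classifier would consume. Keeping the handcrafted part to a handful of cheap string operations is consistent with the efficiency goal described above, although the actual feature set may differ.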

📝 Abstract

We propose a lightweight hybrid approach to clickbait detection that combines OpenAI semantic embeddings with six compact heuristic features capturing stylistic and informational cues. To improve efficiency, embeddings are reduced using PCA and evaluated with XGBoost, GraphSAGE, and GCN classifiers. While the simplified feature design yields slightly lower F1-scores, graph-based models achieve competitive performance with substantially reduced inference time. High ROC-AUC values further indicate strong discrimination capability, supporting reliable detection of clickbait headlines under varying decision thresholds.
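The contrast between F1 and ROC-AUC in the abstract can be illustrated with a small sketch. F1 depends on the decision threshold chosen, whereas ROC-AUC equals the probability that a randomly chosen clickbait headline receives a higher score than a randomly chosen non-clickbait one, so it summarizes ranking quality over all thresholds. The labels and scores below are made-up toy values, not results from the paper, and the helper names `roc_auc` and `f1_at_threshold` are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_at_threshold(labels, scores, threshold):
    """F1 score when every score >= threshold is predicted as clickbait."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy data: 1 = clickbait, 0 = not clickbait.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
    print("ROC-AUC:", roc_auc(labels, scores))
    for t in (0.3, 0.5, 0.7):
        print("F1 at threshold", t, ":", f1_at_threshold(labels, scores, t))
```

On this toy data the F1 score changes as the threshold moves while the ROC-AUC is a single threshold-free number, which is why a model can report a slightly lower F1 at one operating point and still separate the two classes well overall.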

Problem

Research questions and friction points this paper is trying to address.

clickbait detection

semantic embeddings

heuristic features

inference efficiency

headline classification

Innovation

Methods, ideas, or system contributions that make the work stand out.

clickbait detection

semantic embeddings

graph neural networks