Clickbait, characterized by diverse linguistic structures and enticing yet sometimes misleading titles, is difficult to detect automatically. This study addresses clickbait detection in social media headlines using machine learning and deep learning techniques. We used the Naive Bayes algorithm as the machine learning model and Long Short-Term Memory (LSTM) as the deep learning model, with a focus on dynamic language methods. The framework classified headlines accurately, aided by interpretability tools such as LIME, which highlighted the variables most relevant to the predictions. Data pre-processing involved text normalization, and word frequency tables helped identify common phrasing patterns. We found that LSTM outperforms Naive Bayes in clickbait detection on accuracy, precision, recall, and F1-score. We attribute this advantage to the LSTM's ability to capture subtleties of language and sequential dependencies between words. However, classical machine learning methods, paired with tools such as LIME, remain valuable because they are easier to interpret. Future clickbait detection strategies may combine machine learning for interpretability with deep learning for accuracy. Further work should explore how to make deep learning models more comprehensible and how well they apply across diverse fields and datasets.
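As an illustration of the classical pipeline summarized above (text normalization, word frequency tables, and a Naive Bayes classifier), the following is a minimal sketch and not the study's actual implementation. The headlines, the labels, and the function names `normalize`, `train_naive_bayes`, and `predict` are hypothetical. It also assumes a multinomial Naive Bayes with add-one (Laplace) smoothing, a variant the abstract does not specify. The LSTM and LIME steps are not shown.

```python
import math
import re
from collections import Counter


def normalize(headline):
    """Lowercase the headline and split it into word tokens (punctuation dropped)."""
    return re.findall(r"[a-z0-9']+", headline.lower())


def train_naive_bayes(headlines, labels):
    """Build one word frequency table per class, plus class counts and the vocabulary."""
    word_counts = {}
    class_counts = Counter(labels)
    for text, label in zip(headlines, labels):
        word_counts.setdefault(label, Counter()).update(normalize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict(model, headline):
    """Return the class with the highest log-posterior under add-one smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(n_docs / total_docs)  # log prior
        for token in normalize(headline):
            if token in vocab:  # words never seen in training are ignored
                score += math.log((counts[token] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, for illustration only (not from the study's dataset).
headlines = [
    "You Won't Believe What Happened Next",
    "10 Tricks That Will Change Your Life",
    "This Is Why You Should Never Skip Breakfast",
    "Government Announces New Budget for 2024",
    "Central Bank Raises Interest Rates",
    "Scientists Publish Study on Ocean Temperatures",
]
labels = ["clickbait"] * 3 + ["news"] * 3

model = train_naive_bayes(headlines, labels)
frequency_table = model[0]["clickbait"]  # word frequency table for one class
```

Inspecting `frequency_table.most_common(5)` gives the kind of per-class phrasing summary the abstract mentions, and `predict(model, "...")` labels a new headline. A real system would need a much larger labeled corpus and a held-out test set to report accuracy, precision, recall, and F1-score.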
Parita Jain, Swati Sharma, Mônica Mônica, Puneet Aggarwal
Ying-Lung Lin, Siyuan Lu, Liang-Chih Yu
Nurshaheeda Shazleen Yuslee, Nur Atiqah Sia Abdullah