Stance Detection of Controversial Articles Using TF-IDF and BERT

Eka Parima Saragih; Anggraini Dyah Ayu Sekarlangit; Faqih Al Suman

doi:10.18196/jet.v9i1.26965

Authors

Eka Parima Saragih, Master Program of Science in Information Technology (MSIT), Faculty of Computer Science, President University
Anggraini Dyah Ayu Sekarlangit, Master Program of Science in Information Technology (MSIT), Faculty of Computer Science, President University
Faqih Al Suman, Master Program of Science in Information Technology (MSIT), Faculty of Computer Science, President University

DOI:

https://doi.org/10.18196/jet.v9i1.26965

Keywords:

BERT, Fake News Challenge, Hybrid Model, Stance Detection, TF-IDF

Abstract

Online misinformation and polarized discussion call for better methods of automatically detecting the stance of a text. As the volume of digital content grows, identifying whether a news article supports, opposes, or is neutral toward its headline is crucial for countering the spread of false information. This study presents a hybrid model for this task. We combine lexical features from Term Frequency-Inverse Document Frequency (TF-IDF), which capture word-level patterns, with contextual semantic information from a pretrained BERT model (bert-base-uncased). The TF-IDF features and BERT's [CLS] token embedding are concatenated and used to train a logistic regression classifier. The model was trained and tested on a filtered version of the Fake News Challenge (FNC-1) dataset, with "unrelated" pairs removed to focus on more nuanced stance classification. In the final evaluation, the model achieved 83% accuracy and a macro F1-score of 0.68. It performs best on the Neutral class (F1-score 0.91) but has difficulty with the Oppositional class (F1-score 0.39). These results indicate that combining surface-level lexical features with deep contextual understanding can improve stance detection performance.
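The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`bert_cls_embeddings`, `drop_unrelated`, `build_features`, `train_hybrid`), the default hyperparameters, and the representation of each headline-article pair as a single string are all assumptions made for illustration. The abstract does not state how the pair is joined, how the TF-IDF vectorizer is configured, or how the classifier is tuned.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score


def bert_cls_embeddings(texts, model_name="bert-base-uncased", max_length=512):
    """Return one [CLS] vector per text. Needs torch and transformers.

    Hypothetical helper: encodes all texts in one batch, which is only
    practical for small inputs.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).eval()
    with torch.no_grad():
        batch = tokenizer(list(texts), padding=True, truncation=True,
                          max_length=max_length, return_tensors="pt")
        hidden = model(**batch).last_hidden_state  # (n, seq_len, hidden)
    return hidden[:, 0, :].numpy()  # position 0 holds the [CLS] token


def drop_unrelated(texts, labels, unrelated_label="unrelated"):
    """Filter out pairs labeled as unrelated, as the abstract describes."""
    keep = [i for i, y in enumerate(labels) if y != unrelated_label]
    return [texts[i] for i in keep], [labels[i] for i in keep]


def build_features(texts, vectorizer, embed_fn, fit=False):
    """Concatenate sparse TF-IDF features with dense embedding features."""
    tfidf = vectorizer.fit_transform(texts) if fit else vectorizer.transform(texts)
    dense = csr_matrix(np.asarray(embed_fn(texts), dtype=np.float64))
    return hstack([tfidf, dense]).tocsr()


def train_hybrid(texts, labels, embed_fn=bert_cls_embeddings):
    """Fit TF-IDF on the training texts, then logistic regression on the
    concatenated features. Settings are scikit-learn defaults (assumed)."""
    vectorizer = TfidfVectorizer()
    X = build_features(texts, vectorizer, embed_fn, fit=True)
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    return vectorizer, clf
```

Given held-out texts, features would be built with `build_features(test_texts, vectorizer, embed_fn)` and scored with `f1_score(y_true, clf.predict(X_test), average="macro")`, which averages the per-class F1 values without weighting by class size. That unweighted averaging is why a weak class such as Oppositional pulls the macro F1-score well below the accuracy.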

