Stance Detection of Controversial Articles Using TF-IDF and BERT
DOI:
https://doi.org/10.18196/jet.v9i1.26965
Keywords:
BERT, Fake News Challenge, Hybrid Model, Stance Detection, TF-IDF
Abstract
Online misinformation and polarized discussion call for better methods of automatically detecting the stance of a text. As the volume of digital content grows, identifying whether a news article supports, opposes, or is neutral towards its headline is crucial for limiting the spread of false information. This study presents a hybrid model for this task. We combine lexical features from Term Frequency-Inverse Document Frequency (TF-IDF), which capture word-level patterns, with contextual semantic information from a pretrained BERT model (bert-base-uncased). The TF-IDF features and the embedding of BERT's [CLS] token were concatenated and used to train a logistic regression classifier. The model was trained and tested on a filtered version of the Fake News Challenge (FNC-1) dataset, with "unrelated" pairs removed to focus on more nuanced stance classification. In the final evaluation, the model achieved 83% accuracy and a macro F1-score of 0.68. It performed best on the Neutral class (F1-score 0.91) but had difficulty with the Oppositional class (F1-score 0.39). These results indicate that combining surface-level lexical features with deep contextual representations can improve stance detection performance.
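The pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how headline and body are fed to TF-IDF or BERT, so joining them into one string for TF-IDF and encoding them as a sentence pair for BERT are assumptions, as are the function names, the toy data, the label names, and the hyperparameters. To keep the sketch runnable offline, the demo uses a hypothetical `stub_embeddings` stand-in; `bert_cls_embeddings` shows how the real [CLS] vectors could be obtained with the Hugging Face `transformers` library.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score


def bert_cls_embeddings(headlines, bodies, model_name="bert-base-uncased"):
    """One [CLS] vector per headline/body pair (requires torch and transformers).

    Assumption: the pair is encoded jointly; everything is run as one batch,
    which is only practical for small inputs.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).eval()
    enc = tokenizer(list(headlines), list(bodies), padding=True,
                    truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    return out.last_hidden_state[:, 0, :].numpy()  # position 0 holds [CLS]


def stub_embeddings(headlines, bodies, dim=8):
    """Hypothetical offline stand-in for BERT: a fixed random vector per pair."""
    rows = []
    for h, b in zip(headlines, bodies):
        seed = sum(ord(c) for c in h + b)
        rows.append(np.random.default_rng(seed).normal(size=dim))
    return np.vstack(rows)


def build_features(headlines, bodies, vectorizer, embed_fn, fit=False):
    """Concatenate sparse TF-IDF features with dense embedding features."""
    texts = [h + " " + b for h, b in zip(headlines, bodies)]
    tfidf = vectorizer.fit_transform(texts) if fit else vectorizer.transform(texts)
    dense = csr_matrix(embed_fn(headlines, bodies))
    return hstack([tfidf, dense]).tocsr()


# Invented toy pairs (headline, body, stance); not taken from FNC-1.
pairs = [
    ("City bans cars downtown", "Officials confirmed the ban today.", "support"),
    ("City bans cars downtown", "The mayor denied any such ban.", "oppose"),
    ("City bans cars downtown", "Reports of a ban are circulating.", "neutral"),
    ("New vaccine approved", "Regulators approved the vaccine.", "support"),
    ("New vaccine approved", "The agency says no approval was given.", "oppose"),
    ("New vaccine approved", "Sources discuss a possible approval.", "neutral"),
    ("New vaccine approved", "The team won the cup final.", "unrelated"),
]

# Drop "unrelated" pairs, mirroring the filtering step in the abstract.
pairs = [p for p in pairs if p[2] != "unrelated"]
headlines, bodies, labels = (list(col) for col in zip(*pairs))

vectorizer = TfidfVectorizer()
X = build_features(headlines, bodies, vectorizer, stub_embeddings, fit=True)

clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)
pred = clf.predict(X)

# Macro F1 is the unweighted mean of per-class F1, so a weak class lowers it
# even when overall accuracy is high. Scored on training data here only to
# show the call; a real evaluation needs a held-out split.
macro_f1 = f1_score(labels, pred, average="macro")
print(X.shape, round(float(macro_f1), 2))
```

To use real BERT features, pass `bert_cls_embeddings` instead of `stub_embeddings` to `build_features`. Because TF-IDF values lie in [0, 1] while BERT activations are not bounded in the same way, scaling the dense block before concatenation may be worth trying; the abstract does not say whether the authors did so.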
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright
Authors submitting a manuscript do so on the understanding that, if it is accepted for publication, copyright of the article shall be assigned to Journal of Electrical Technology UMY. Copyright encompasses the rights to reproduce and deliver the article in all forms and media, including reprints, photographs, microfilms, and any other similar reproductions, as well as translations.
Authors should sign a Copyright Transfer Agreement once they have approved the final proofs sent by the journal prior to publication. JET UMY strives to ensure that published articles contain no errors, whether in the data or in the statements made.
JET UMY keeps the rights to articles that have been published. Authors are permitted to disseminate published articles by sharing the link to the JET UMY website. Authors are allowed to use their works for any purposes deemed necessary without written permission from JET UMY, with an acknowledgement of initial publication in this journal.