Abstract
The internet has a vast user base that relies on it for many purposes. Social media platforms allow any user to post or share news without verification, and some users exploit this to spread fake news as propaganda against individuals, organizations, or political parties. Because humans cannot manually detect all fake news, machine learning classifiers are needed to automate the process.
Keywords: Online fake news, Machine learning, Text classification, Social media
1. Introduction
While the digital world offers many advantages, it also makes it easy to spread fake news that harms reputations or advances propaganda. Online platforms such as Facebook and Twitter are common channels for this. Machine learning (ML), a branch of artificial intelligence, enables systems to learn tasks such as prediction and detection through supervised, unsupervised, or reinforcement learning algorithms trained on datasets.
Detecting fake news is a significant challenge because users often believe and spread it without verification. This has real-world consequences, such as affecting opinions and decisions in the 2016 US election. Researchers are increasingly using ML algorithms to automate detection as the volume of fake news grows over time.
2. Methodology
This paper uses a Systematic Literature
Review (SLR) methodology to answer specific research questions by collecting
and citing papers from various databases.
Exclusion and Inclusion Criteria: To ensure relevance, the following criteria were applied:
Table 1: Exclusion and Inclusion Criteria.
| Exclusion Criteria | Inclusion Criteria |
|---|---|
| Language is not English | Written in the English language |
| Complete paper is not accessible | Paper can be accessed completely |
| Not related to ML and fake news | Related to ML and fake news detection |
Quality Assessment: Papers were assessed on whether they discuss the use of machine learning for detecting fake or false news. Of the 73 papers initially collected, 26 were selected for this review.
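The inclusion criteria in Table 1 amount to three checks that a paper must pass together. The following is a minimal sketch of that screening step, not the procedure actually used in this review. The `Paper` record, its field names, and the candidate entries are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical record; the field names are assumptions, not from the review.
    title: str
    language: str
    full_text_accessible: bool
    covers_ml_fake_news: bool  # in practice this is a reviewer's judgment


def passes_screening(paper: Paper) -> bool:
    """Keep a paper only if it meets all three inclusion criteria of Table 1."""
    return (
        paper.language == "English"
        and paper.full_text_accessible
        and paper.covers_ml_fake_news
    )


# Invented candidate records, for illustration only.
candidates = [
    Paper("Paper A", "English", True, True),
    Paper("Paper B", "German", True, True),    # excluded: not in English
    Paper("Paper C", "English", False, True),  # excluded: full text not accessible
    Paper("Paper D", "English", True, False),  # excluded: off topic
]

selected = [p for p in candidates if passes_screening(p)]
print([p.title for p in selected])
```

Each exclusion criterion is the negation of the matching inclusion criterion, so a single predicate covers both columns of the table.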
3. Research Questions
The SLR aims to answer:
• Why is machine learning required to detect fake news?
• Which supervised machine learning classifiers can be used to detect fake news?
• How are machine learning classifiers trained to detect fake news?
4. Search Process
The search involved databases including
Clarivate Analytics (WoS), ACM Digital Library, IEEE Xplore, and Elsevier
(Scopus). The process followed these steps:
1. Search Keywords
2. Title & Abstract Exclusion
3. Conclusion & Full Text Exclusion
4. Selection of Primary Studies
5. Results and Discussion
Why is machine learning required?
Controlling fake news is necessary because it can harm businesses, individuals, and political parties. Manual detection is difficult because most people do not know the full story behind a news item. ML enables automatic detection by checking the content of a post against a trained model.
Which supervised classifiers can be used?
Research indicates several effective
classifiers:
• Support Vector Machine (SVM): A supervised algorithm often cited for high classification accuracy.
• Naïve Bayes: Used for checking authenticity; reported accuracies reach as high as 96.08%.
• Logistic Regression: Useful for categorical predictions (True/Fake).
• Random Forest: Classifies by taking a "majority vote" across many randomized decision trees.
• Neural Networks & RNNs: Used for deep learning-based classification.
• K-Nearest Neighbor (KNN): Classifies news based on similarity to stored cases.
• Decision Tree: Breaks the dataset down into smaller subsets to reach a decision.
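To make one of these classifiers concrete, here is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing, written with only the Python standard library. It is an illustration, not the implementation used by any reviewed paper. The toy headlines, the labels, and the function names `train_naive_bayes` and `predict` are invented for this example, and the whitespace tokenizer is deliberately crude.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(samples):
    """samples: list of (text, label). Returns the counts needed for prediction."""
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # per-label word frequencies
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior for the given text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing keeps unseen (word, label) pairs from zeroing the product.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy data, for illustration only.
TOY_DATA = [
    ("shocking miracle cure doctors hate", "fake"),
    ("celebrity secretly replaced by clone", "fake"),
    ("you will not believe this shocking secret", "fake"),
    ("city council approves new budget", "real"),
    ("central bank holds interest rates steady", "real"),
    ("council publishes annual budget report", "real"),
]

model = train_naive_bayes(TOY_DATA)
print(predict(model, "shocking secret cure revealed"))
print(predict(model, "council approves budget report"))
```

A real study would train on a much larger labeled corpus and would normally use an established library rather than hand-written counting.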
How are classifiers trained?
Training is crucial for accuracy. The
process generally follows:
1. Dataset Collection: Using labeled news data.
2. Preprocessing: Removing “stop words” and performing “stemming” (reducing words to their root form).
3. Feature Extraction: Using models like TF-IDF, N-Gram, or Bag of Words to extract informative features.
4. Splitting Data: Dividing into Training and Test datasets.
5. Experimental Results: Evaluating the classifier's performance.
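The training steps above can be sketched end to end in plain Python. This is a simplified illustration under stated assumptions, not a reference implementation. The stop-word list is a tiny sample, `stem` is a crude suffix stripper standing in for a real stemmer, the classifier is a 1-nearest-neighbor rule over TF-IDF vectors, and the dataset is invented. All function names are hypothetical.

```python
import math
import random
from collections import Counter

STOP_WORDS = {"the", "a", "an", "is", "of", "to", "in", "and", "this"}  # tiny sample list


def stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Step 2: lowercase, drop stop words, stem what remains."""
    return [stem(w) for w in text.lower().split() if w not in STOP_WORDS]


def fit_idf(token_lists):
    """Step 3a: inverse document frequency, learned from training documents only."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}


def tfidf(tokens, idf):
    """Step 3b: sparse TF-IDF vector; terms unseen during fitting are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_1nn(vector, train_vectors, train_labels):
    """Label of the most similar training document (KNN with k = 1)."""
    sims = [cosine(vector, tv) for tv in train_vectors]
    return train_labels[sims.index(max(sims))]


def run_pipeline(dataset, test_fraction=0.25, seed=0):
    """Steps 4 and 5: split the data, train, and return accuracy on the test set."""
    rows = dataset[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    test, train = rows[:n_test], rows[n_test:]
    train_tokens = [preprocess(text) for text, _ in train]
    idf = fit_idf(train_tokens)
    train_vecs = [tfidf(tokens, idf) for tokens in train_tokens]
    train_labels = [label for _, label in train]
    correct = sum(
        predict_1nn(tfidf(preprocess(text), idf), train_vecs, train_labels) == label
        for text, label in test
    )
    return correct / len(test)


# Step 1: an invented labeled dataset, for illustration only.
DATASET = [
    ("shocking miracle cure doctors hate", "fake"),
    ("celebrity secretly replaced by a clone", "fake"),
    ("you will not believe this shocking secret", "fake"),
    ("aliens secretly control the shocking election", "fake"),
    ("city council approves new budget", "real"),
    ("central bank holds interest rates steady", "real"),
    ("council publishes annual budget report", "real"),
    ("bank reports steady interest in new bonds", "real"),
]

print("test accuracy:", run_pipeline(DATASET))
```

The IDF weights are fitted on the training split only, so the test documents do not influence the features. With a dataset this small the reported accuracy is not meaningful; the sketch only shows how the steps connect.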
6. Conclusion
Fake news on social media can destroy
reputations and manipulate political opinions. Because platforms do not
restrict or verify posters, ML classifiers are necessary to detect these posts
automatically after being trained on labeled datasets. Future research may
focus on unsupervised machine learning since labeled data is often
difficult to obtain.