Issue | E3S Web Conf., Volume 430, 2023: 15th International Conference on Materials Processing and Characterization (ICMPC 2023) | |
---|---|---|
Article Number | 01081 | |
Number of page(s) | 15 | |
DOI | https://doi.org/10.1051/e3sconf/202343001081 | |
Published online | 06 October 2023 | |
Clickbait Post Detection using NLP for Sustainable Content
1 Department of Information Technology, Gokaraju Rangaraju Institute of Engineering and Technology, India
2 School of Applied and Life Sciences, Uttaranchal University, Dehradun, 248007, India
3 KG Reddy College of Engineering & Technology, India
* Corresponding author: nvgraju@griet.ac.in
Clickbait is a significant problem on online media platforms. It misleads users and manipulates their engagement. A user who clicks on a clickbait link may be taken to a website that is full of ads or that requires payment. The goal of this project is to create a system that can recognize clickbait posts so that users can access only sustainable content. The system will analyze headlines using natural language processing (NLP) and machine learning techniques. NLP pre-processing techniques, such as tokenization, lemmatization, and stemming, will be used to extract essential features from the headlines. These features will then be used to train a supervised machine learning classifier to distinguish between clickbait and non-clickbait news headlines. The project will explore a range of algorithms and techniques, including popular text representation models such as TF-IDF or word embeddings, as well as classifier models such as logistic regression or random forests. The model will be evaluated using a variety of metrics, such as accuracy, precision, recall, and F1 score. By making it easier for users to identify clickbait, the system can help to reduce the amount of time and money wasted on this type of content.
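The pipeline outlined in the abstract (tokenization, a TF-IDF representation, a logistic regression classifier, and evaluation by accuracy, precision, recall, and F1) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the toy headlines, the function names, the smoothed IDF formula, and the training hyperparameters (`epochs`, `lr`) are all assumptions made for illustration. Lemmatization and stemming are omitted here because they would typically rely on an NLP library.

```python
import math
import re
from collections import Counter


def tokenize(headline):
    """Lowercase a headline and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", headline.lower())


def fit_idf(token_lists):
    """Smoothed inverse document frequency for every term in the training set."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector (dict) for one headline; unseen terms are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def predict_proba(vector, weights, bias):
    """Logistic-regression probability that a headline is clickbait."""
    z = bias + sum(weights.get(t, 0.0) * v for t, v in vector.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(vectors, labels, epochs=200, lr=0.5):
    """Fit logistic-regression weights with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            err = predict_proba(vec, weights, bias) - y  # gradient of log loss w.r.t. z
            bias -= lr * err
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias


def evaluate(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive (clickbait) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data, invented for this sketch: 1 = clickbait, 0 = not clickbait.
headlines = [
    "You won't believe what happened next",
    "10 tricks doctors don't want you to know",
    "This one weird trick will shock you",
    "Government announces new budget for education",
    "Central bank raises interest rates by a quarter point",
    "City council approves new transit plan",
]
labels = [1, 1, 1, 0, 0, 0]

token_lists = [tokenize(h) for h in headlines]
idf = fit_idf(token_lists)
vectors = [tfidf(tokens, idf) for tokens in token_lists]
weights, bias = train(vectors, labels)

# Scoring the training headlines is only a smoke test, not a real evaluation.
predictions = [int(predict_proba(v, weights, bias) >= 0.5) for v in vectors]
print(evaluate(labels, predictions))
```

A meaningful evaluation would score headlines held out from training. In practice, a library implementation of TF-IDF and logistic regression would likely replace these hand-written helpers.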
© The Authors, published by EDP Sciences, 2023
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.