Enhancing Text-based Spam Detection using Machine Learning
DOI:
https://doi.org/10.65091/icicset.v2i1.17
Abstract
This research develops an intelligent and accurate system for text-based spam detection using machine learning models. With the exponential growth of digital communication, spam messages have become a major issue, often carrying misleading, fraudulent, or irrelevant content that disrupts user experience and compromises security. The methodology involves systematic data preprocessing followed by feature extraction using TF-IDF vectorization. Five traditional models, namely Naive Bayes, Logistic Regression, Decision Tree, Random Forest, and Linear SVM (Calibrated), along with a Long Short-Term Memory (LSTM) model, were trained and evaluated. The comparative analysis showed that the Linear SVM (Calibrated) model achieved the best overall performance among all tested algorithms, with the highest accuracy, balanced precision and recall, and the lowest error rates. This outcome supports the effectiveness of combining advanced preprocessing, TF-IDF feature extraction, and hybrid machine learning techniques for spam detection. It also bridges the gap between traditional machine learning and deep learning approaches, providing a scalable foundation for real-time spam filtering. Furthermore, the study contributes to digital communication security by offering a reliable system capable of detecting and reducing unwanted or malicious text messages efficiently.
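To make the feature extraction step concrete, the following is a minimal sketch of TF-IDF weighting in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which TF-IDF variant, tokenizer, or library was used, so the smoothed IDF formula, the L2 normalization, the regex tokenizer, the function names `tokenize` and `tfidf_vectors`, and the three sample messages are all hypothetical choices made here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphanumeric runs (a simple stand-in for preprocessing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return (sorted vocabulary, list of L2-normalized TF-IDF dicts) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of messages that contain each term at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    # Assumed variant: smoothed IDF, ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Raw term count times IDF, then scale the vector to unit length.
        weights = {t: c * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return sorted(df), vectors


# Hypothetical sample messages, not taken from the study's dataset.
messages = [
    "Win a free prize now",
    "Are we still meeting for lunch",
    "Free entry win cash now",
]
vocab, vectors = tfidf_vectors(messages)
```

In this toy corpus, a term that appears in only one message (such as "prize") receives a higher weight than a term shared across messages (such as "free"), which is the property that lets a linear classifier separate distinctive spam vocabulary from common words. The resulting sparse vectors would then be passed to a classifier such as the calibrated linear SVM named in the abstract.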