Improving spam email classification accuracy using ensemble techniques: a stacking approach

Adnan, Muhammad; Imam, Muhammad Osama; Javed, Muhammad Furqan; Murtza, Iqbal

dc.contributor.author	Adnan, Muhammad
dc.contributor.author	Imam, Muhammad Osama
dc.contributor.author	Javed, Muhammad Furqan
dc.contributor.author	Murtza, Iqbal
dc.date.accessioned	2023-11-22T13:29:03Z
dc.date.available	2023-11-22T13:29:03Z
dc.date.issued	2023-09-20
dc.description.abstract	Spam emails pose a substantial cybersecurity danger, necessitating accurate classification to reduce unwanted messages and mitigate risks. This study focuses on enhancing spam email classification accuracy using stacking ensemble machine learning techniques. We trained and tested five classifiers: logistic regression, decision tree, K-nearest neighbors (KNN), Gaussian naive Bayes and AdaBoost. To address overfitting, two distinct datasets of spam emails were aggregated and balanced. Evaluating individual classifiers based on recall, precision and F1 score metrics revealed AdaBoost as the top performer. Considering evolving spam technology and new message types challenging traditional approaches, we propose a stacking method. By combining predictions from multiple base models, the stacking method aims to improve classification accuracy. The results demonstrate superior performance of the stacking method with the highest accuracy (98.8%), recall (98.8%) and F1 score (98.9%) among tested methods. Additional experiments validated our approach by varying dataset sizes and testing different classifier combinations. Our study presents an innovative combination of classifiers that significantly improves accuracy, contributing to the growing body of research on stacking techniques. Moreover, we compare classifier performances using a unique combination of two datasets, highlighting the potential of ensemble techniques, specifically stacking, in enhancing spam email classification accuracy. The implications extend beyond spam classification systems, offering insights applicable to other classification tasks. Continued research on emerging spam techniques is vital to ensure long-term effectiveness.	en_US
dc.identifier.citation	Adnan, Imam, Javed, Murtza. Improving spam email classification accuracy using ensemble techniques: a stacking approach. International Journal of Information Security. 2023	en_US
dc.identifier.cristinID	FRIDAID 2189233
dc.identifier.doi	10.1007/s10207-023-00756-1
dc.identifier.issn	1615-5262
dc.identifier.issn	1615-5270
dc.identifier.uri	https://hdl.handle.net/10037/31848
dc.language.iso	eng	en_US
dc.publisher	Springer Nature	en_US
dc.relation.journal	International Journal of Information Security
dc.rights.accessRights	openAccess	en_US
dc.rights.holder	Copyright 2023 The Author(s)	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0	en_US
dc.rights	Attribution 4.0 International (CC BY 4.0)	en_US
dc.title	Improving spam email classification accuracy using ensemble techniques: a stacking approach	en_US
dc.type.version	publishedVersion	en_US
dc.type	Journal article	en_US
dc.type	Peer reviewed	en_US

Associated file(s)

Name: article.pdf
Size: 1.507Mb
Format: PDF

This item appears in the following collection(s)

Articles, reports and other (technology and safety) [363]

Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)