Institutional Repository

Application of improved density-based algorithms to data stream and performance evaluation

dc.contributor.advisor Zenghui, Wang en
dc.contributor.advisor Tabane, Elias en
dc.contributor.author Akinosho, Tajudeen Akanbi
dc.date.accessioned 2024-05-13T09:37:22Z
dc.date.available 2024-05-13T09:37:22Z
dc.date.issued 2023-10
dc.identifier.uri https://hdl.handle.net/10500/31196
dc.description.abstract Density-based algorithms are effective at detecting clusters of arbitrary shape, as well as outliers, even when the number of clusters is not known in advance. Parameter specification in data stream clustering remains a challenge, and selecting suitable parameter values is essential to good clustering quality. The density-based algorithm DenStream is one example of a data stream clustering algorithm that requires several parameters to be specified. In this dissertation, an improved DenStream with a modified distance measure is proposed and demonstrated with parameter tuning in Massive Online Analysis (MOA) using synthetic and real-world datasets. The modified DenStream algorithm was compared against CluStream, ClusTree and DenStream at noise levels of 0%, 10%, and 30% and with manually selected epsilon values of 0.02, 0.03, and 0.05, respectively. The epsilon parameter range [0.02 – 0.05] was not used because some algorithms did not work on the real-world datasets. The effects on clustering quality were evaluated using the performance metrics CMM, Purity, Silhouette Coefficient, and Rand index on the synthetic and real-world datasets. The results show that the effectiveness of the algorithms depends on parameter tuning and that no single algorithm performs best across all the performance metrics. en
dc.language.iso en en
dc.subject Data stream clustering en
dc.subject Stream clustering en
dc.subject Data stream en
dc.subject Clustering en
dc.subject MOA en
dc.subject Clusters en
dc.subject CluStream en
dc.subject DenStream en
dc.subject ClusTree en
dc.subject Modified DenStream en
dc.subject Arbitrary shape en
dc.subject SDG 9 Industry, Innovation and Infrastructure en
dc.title Application of improved density-based algorithms to data stream and performance evaluation en
dc.type Dissertation en
dc.description.department School of Computing en
dc.description.degree M. Sc. (Computing)
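
The abstract names Purity and the Rand index among its evaluation metrics. As a rough illustration only, the sketch below computes the textbook batch definitions of both on a small made-up labelling. The function names and the toy data are assumptions for this example, not material from the dissertation, and MOA's own stream-based evaluation may compute these measures differently.

```python
from collections import Counter
from itertools import combinations


def purity(true_labels, cluster_labels):
    """Fraction of points that carry the majority class of their cluster."""
    members_by_cluster = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        members_by_cluster.setdefault(cluster, []).append(truth)
    # For each cluster, count how many points share its most common true label.
    majority_total = sum(
        Counter(members).most_common(1)[0][1]
        for members in members_by_cluster.values()
    )
    return majority_total / len(true_labels)


def rand_index(true_labels, cluster_labels):
    """Fraction of point pairs on which the two labellings agree."""
    pairs = list(combinations(range(len(true_labels)), 2))
    # A pair agrees if it is grouped together in both labellings
    # or separated in both labellings.
    agreements = sum(
        (true_labels[i] == true_labels[j]) == (cluster_labels[i] == cluster_labels[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical toy data: six points, two true classes, two found clusters.
truth = ["a", "a", "a", "b", "b", "b"]
found = [0, 0, 1, 1, 1, 1]

print(f"purity     = {purity(truth, found):.3f}")
print(f"rand index = {rand_index(truth, found):.3f}")
```

Both measures lie between 0 and 1, with higher values indicating closer agreement with the ground-truth labels. They can rank the same clusterings differently, which is consistent with the abstract's observation that no single algorithm performs best on every metric.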

