Canopy with k-means clustering algorithm for big data analytics

Sagheer, Noor S.; Yousif, Suhad A.


Date

2021

Authors

Sagheer, Noor S.

Yousif, Suhad A.

Publisher

Maltepe Üniversitesi

Access Rights

CC0 1.0 Universal
info:eu-repo/semantics/openAccess

Abstract

Big Data is now gathered from various sources and in different types, and it is not easy to analyze with traditional methods. Apache Hadoop is a robust solution to the problems of storing and processing large datasets, providing HDFS (Hadoop Distributed File System) for storage and MapReduce for processing. Clustering algorithms are among the essential methods for analyzing big data to discover new patterns. In this paper, we use the canopy clustering algorithm provided by Apache Mahout, a distributed machine learning library, as a preprocessing step for the k-means clustering algorithm. The results show that using canopy as a preprocessing step speeds up the handling of the massive healthcare insurance dataset and also reduces the execution time of k-means by providing initial centroids for the given dataset.
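
The idea in the abstract, running canopy first so that its centers seed k-means, can be illustrated with a minimal single-machine sketch. This is not the Apache Mahout or MapReduce implementation used in the paper: the function names, the thresholds `t1` (loose) and `t2` (tight), and the toy points are all assumptions made for illustration only.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length points."""
    return math.dist(a, b)


def canopy(points, t1, t2):
    """Group points into canopies; returns a list of (center, members).

    t1 is the loose threshold (canopy membership) and t2 the tight one
    (points this close to a center cannot start a new canopy).
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    if not 0 < t2 < t1:
        raise ValueError("expected 0 < t2 < t1")
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining[0]
        members = [p for p in remaining if distance(center, p) <= t1]
        canopies.append((center, members))
        # Drop the center and every point within the tight threshold.
        remaining = [p for p in remaining if distance(center, p) > t2]
    return canopies


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, initial_centroids, max_iter=100):
    """Plain Lloyd-style k-means starting from the given centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]

canopies = canopy(points, t1=3.0, t2=1.5)
seeds = [center for center, _ in canopies]  # canopy centers seed k-means
centroids, clusters = kmeans(points, seeds)
```

In this sketch the number of canopies also fixes k, so k-means needs no separate guess for the cluster count and starts from centroids that already lie inside dense regions, which is the mechanism the abstract credits for the shorter k-means execution time.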

Keywords

Big Data, k-means, canopy, Mahout, Health Care, confusion matrix, HDFS

Source

Fourth International Conference of Mathematical Sciences

Citation

Sagheer, N.S. and Yousif, S.A. (2021). Canopy with k-means clustering algorithm for big data analytics. Fourth International Conference of Mathematical Sciences, Maltepe Üniversitesi. pp. 1-4.

Link

https://aip.scitation.org/doi/10.1063/5.0042398
https://hdl.handle.net/20.500.12415/1941

Collection

Faculty of Humanities and Social Sciences Collection
