Cost-effective Cloud-Based Big Data Mining with K-means Clustering: An Analysis of Gaussian Data
Keywords:
Big Data Mining, Cloud Computing, K-means Clustering, Gaussian Data, Cost Optimization, Lloyd’s Algorithm.
Abstract
Extracting meaningful insights from massive datasets is essential in today's data-driven society. Cloud computing has
made big data mining more cost-effective and scalable. This work investigates the analysis of Gaussian data in a
cloud computing environment using K-means clustering. We implemented Lloyd's K-means algorithm and measured the
impact of different numbers of clusters (k) on computation time and accuracy. Our results indicate that the algorithm
can be stopped early at high (albeit not perfect) accuracy levels, resulting in significant cost savings. The study
highlights how important it is to choose good initial centers and to manage resources intelligently in order to improve
clustering performance and cost-efficiency. With these practical strategies, companies can leverage complex analytics
without incurring prohibitive costs.
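
The early-stopping idea summarized above can be illustrated with a minimal sketch of Lloyd's algorithm on synthetic Gaussian data. This is not the implementation used in the study. The function names (`make_gaussian_blobs`, `lloyd_kmeans`), the `rel_tol` stopping rule, the random choice of initial centers, and the use of the sum of squared distances as a stand-in for accuracy are all assumptions made for illustration.

```python
import numpy as np


def make_gaussian_blobs(rng, means, n_per_cluster, sigma):
    """Draw n_per_cluster points around each mean from an isotropic Gaussian."""
    means = np.asarray(means, dtype=float)
    return np.vstack([
        rng.normal(loc=m, scale=sigma, size=(n_per_cluster, means.shape[1]))
        for m in means
    ])


def lloyd_kmeans(points, init_centers, max_iter=100, rel_tol=0.0):
    """Lloyd's K-means with an optional early stop (hypothetical rule).

    Stops once the relative drop in cost between two consecutive assignment
    steps is <= rel_tol. With rel_tol=0.0 it runs until the cost stops
    improving. Returns (centers, labels, costs); labels and costs[-1] come
    from the last assignment step performed.
    """
    centers = np.array(init_centers, dtype=float)  # copy, caller's array is untouched
    costs = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        sq_dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq_dist.argmin(axis=1)
        costs.append(sq_dist[np.arange(len(points)), labels].sum())

        # Early stop: the last iteration improved the cost by too little.
        if len(costs) > 1 and costs[-2] - costs[-1] <= rel_tol * costs[-2]:
            break

        # Update step: move each center to the mean of its assigned points.
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                centers[j] = members.mean(axis=0)
    return centers, labels, costs


# Illustrative settings only; these are not the paper's experimental values.
rng = np.random.default_rng(0)
points = make_gaussian_blobs(rng, [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)], 300, 0.5)
k = 3
init = points[rng.choice(len(points), size=k, replace=False)]

_, _, full_costs = lloyd_kmeans(points, init, rel_tol=0.0)
_, _, early_costs = lloyd_kmeans(points, init, rel_tol=0.01)

print("iterations until the cost stops improving:", len(full_costs))
print("iterations with a 1% early-stop rule:", len(early_costs))
print("final cost ratio (early / full):", early_costs[-1] / full_costs[-1])
```

Because both runs start from the same centers, the early-stopped run follows the same trajectory and simply ends sooner, so its final cost is never lower than that of the full run. In a pay-per-use cloud setting, the iterations saved translate into compute time saved, which is the trade-off between cost and accuracy that the abstract describes.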