Date of Award


Publication Type

Master Thesis

Degree Name



Computer Science

First Advisor

Chen, Jessica


big data; set covering algorithms




Set covering is a well-studied classical problem with many applications across different fields. More recent work on this problem has taken into account the parallel computing architecture, the datasets at scale, the properties of the datasets, etc. Within the context of web crawling where the data follow the lognormal distribution, a weighted greedy algorithm has been proposed in the literature and demonstrated to outperform the traditional one. In the present work, we evaluate the performance of the weighted greedy algorithm using an open-source dataset in the context of resource management. The data are sampled from a given roadmap with 1.9 millions of nodes. Our research includes three different cost definitions i.e. location cost, driving cost and infrastructure cost. We also consider the different coverage radius to model possible parameters in the application. Our experiment results show that weighted greedy algorithm outperforms the greedy algorithm by 8% in average for all three different cost definitions.