Date of Award
Kent, Robert D.
Applied sciences, Performance, Prediction, Runtime, Scalability, Adaptive scheduling
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.
A job scheduler determines the order and duration of the allocation of resources, e.g. CPU, to the tasks waiting to run on a computer. Round-Robin and First-Come-First-Serve are examples of algorithms for making such resource allocation decisions. Parallel job schedulers make resource allocation decisions for applications that need multiple CPU cores, on computers consisting of many CPU cores connected by different interconnects. An adaptive parallel scheduler is a parallel scheduler that is capable of adjusting its resource allocation decisions based on the current resource usage and demand. Adaptive parallel schedulers that decide the numbers of CPU cores to allocate to a parallel job provide more flexibility and potentially improve performance significantly for both local and grid job scheduling compared to non-adaptive schedulers. A major reason why adaptive schedulers are not yet used practically is due to lack of knowledge of the scalability curves of the applications, and high cost of existing white-box approaches for scalability prediction. We show that a runtime and scalability prediction tool can be developed with 3 requirements: accuracy comparable to white-box methods, applicability, and robustness. Applicability depends only on knowledge feasible to gain in a production environment. Robustness addresses anomalous behaviour and unreliable predictions. We present ADEPT, a speedup and runtime prediction tool that satisfies all criteria for both single problem size and across different problem sizes of a parallel application. ADEPT is also capable of handling anomalies and judging reliability of its predictions. We demonstrate these using experiments with MPI and OpenMP implementations of NAS benchmarks and seven real applications.
Deshmeh, Gholamhossein, "ADEPT Runtime/Scalability Predictor in support of Adaptive Scheduling" (2013). Electronic Theses and Dissertations. 4943.