Date of Award
1998
Publication Type
Master Thesis
Degree Name
M.Sc.
Department
Computer Science
Keywords
Computer Science.
Supervisor
Morrissey, J. M.,
Rights
info:eu-repo/semantics/openAccess
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Abstract
Query processing in a distributed system requires the transmissions of data between computers in a network. For a given query, there exist several strategies that a distributed DBMS may choose to execute in order to generate results. All these strategies produce the same result but incur different query execution costs. The problem of finding an optimal query execution strategy in distributed database systems has been shown to be NP-hard [WC96]. Due to this, heuristic algorithms are necessary in solving the query processing problem but no matter what heuristics are used, the efficiency of executing a query depends heavily on the type of reducer used. Employing the AHY Algorithm [AHY83] to decompose general queries into schedules of simple queries and using the hypothesis in the paper by Tseng and Chen [TC92], this thesis proposes two replacement algorithms; the first replaces all the semijoins in the schedules with hash-semijoins and the second replaces some of the semijoins in the schedule with hash-semijoins depending on which is cost-effective. Also, two heuristic algorithms are proposed based on the same heuristic as the AHY Algorithm (total Time); Algorithm AHY-H which generates schedules using hash-semijoin operator and Algorithm AHY-HS which generates schedules using either semijoin or hash-semijoin operator depending on which is cost-effective. This work also evaluates the performance of the proposed algorithms of the proposed algorithms in comparison to the AHY Algorithm (total time). From our comparison study, we show that both Algorithm AHY-HS and AHY-H, efficiently reduce the cost of transmitting data in comparison to the AHY Algorithm and the hypothesis suggested in [TC92]. Also, there is no significant difference m performance between the two heuristic algorithm, Algorithm AHY-H and AHY-HS, using our experimental framework. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis1998 .O38. Source: Masters Abstracts International, Volume: 39-02, page: 0530. Adviser: J. M. Morrissey. Thesis (M.Sc.)--University of Windsor (Canada), 1998.
Recommended Citation
Ogunbadejo, Olumuyiwa., "Experiments with hash-semijoins in distributed query processing." (1998). Electronic Theses and Dissertations. 2032.
https://scholar.uwindsor.ca/etd/2032