Title

Semi-join strategies for total cost minimization in distributed query processing.

Date of Award

1996

Degree Type

Thesis

Degree Name

M.Sc.

Department

Computer Science

First Advisor

Morrissey, J. M.,

Keywords

Computer Science.

Rights

CC BY-NC-ND 4.0

Abstract

A new static heuristic, called Algorithm W is presented as an efficient method for reducing the total volume of data transmitted over the network during distributed query processing. It uses the concepts of profit, marginal profit and gain to construct small, highly selective reducers using cost-effective semi-join sequences. In most cases the heuristic has a complexity of O(nm). A limitation of static strategies, such as Algorithm W is that they rely on accurate estimates to perform properly. The presence of estimation errors may lead to sub-optimal solutions. A solution to this problem is the use of a dynamic strategy (Boderick, 1985; Boderick, Pyra et al., 1989) in which the schedule of operations is monitored and corrected if the performance deteriorates. A purely dynamic heuristic, Algorithm DW is proposed which uses up to date information eliminating the need for schedule monitoring. It is shown that the overheads incurred by using exact information are minimal with respect to the overall total cost. A benchmark database is proposed upon which the empirical performance of the heuristics can be measured. Algorithm W is evaluated against the AHY General (total time) algorithm (Apers, Hevner, Yao, 1983) to investigate whether improvements are possible. The performance of the proposed heuristics are evaluated to test the hypothesis that a dynamic strategy using better estimates will produce improved schedules. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis1995 .B42. Source: Masters Abstracts International, Volume: 34-06, page: 2394. Adviser: J. M. Morrissey. Thesis (M.Sc.)--University of Windsor (Canada), 1996.