Date of Award
2009
Publication Type
Doctoral Thesis
Degree Name
Ph.D.
Department
Mathematics and Statistics
Keywords
Pure sciences, Count data, Ratio estimators, Regression estimators, Variance function
Supervisor
Sudhir Paul
Rights
info:eu-repo/semantics/openAccess
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Abstract
Clustered binary data arise in many fields such as epidemiology, toxicology, econometrics and pharmacokinetic modelling. For instance, in many epidemiological studies the purpose of the investigation is to compare the risk experienced by two groups, where each group has clustered observations. Several methods have been developed in the literature for interval estimation of epidemiological indices such as the risk difference, the risk ratio and the relative difference. In this dissertation we introduce two very simple methods. One of these is based on an estimator of the variance of a ratio estimator and the other is based on a sandwich estimator of the variance of the regression estimator using the generalized estimating equations (GEE) approach. We then compare these two methods, by simulation, in terms of maintaining nominal coverage probability and average coverage length, with four methods discussed earlier in the literature. It is shown that the method based on an estimator of the variance of a ratio estimator performs better in terms of coverage probability, symmetry and bias. The proposed methods are then applied to analyze toxicological and educational intervention program datasets.
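As a rough illustration of the ratio-estimator idea, the sketch below computes a Wald-type interval for the risk difference between two groups of clusters. It assumes the standard cluster-level variance estimator for a ratio of totals; this is not necessarily the exact estimator developed in the dissertation. The function names and the data are hypothetical and invented for illustration only.

```python
import math

def ratio_estimate(successes, sizes):
    """Pooled proportion and its ratio-estimator variance for clustered binary data.

    successes[i] is the number of events in cluster i, sizes[i] its size.
    Assumption: the usual variance of a ratio of cluster totals,
    v = m/(m-1) * sum((x_i - n_i*p)^2) / (sum n_i)^2.
    """
    m = len(sizes)
    total_n = sum(sizes)
    p_hat = sum(successes) / total_n
    resid_ss = sum((x - n * p_hat) ** 2 for x, n in zip(successes, sizes))
    var = m / (m - 1) * resid_ss / total_n ** 2
    return p_hat, var

def risk_difference_ci(group1, group2, z=1.959964):
    """Wald-type interval for p1 - p2, treating the two groups as independent."""
    p1, v1 = ratio_estimate(*group1)
    p2, v2 = ratio_estimate(*group2)
    diff = p1 - p2
    half_width = z * math.sqrt(v1 + v2)
    return diff, diff - half_width, diff + half_width

# Hypothetical data: (events per cluster, cluster sizes)
treated = ([3, 5, 2, 6], [10, 12, 8, 10])
control = ([1, 2, 1, 3], [10, 11, 9, 10])

diff, lower, upper = risk_difference_ci(treated, control)
print(f"risk difference = {diff:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the variance is built from cluster-level residuals rather than individual observations, it reflects within-cluster correlation without requiring a parametric model for it.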
The phenomenon of overdispersion is also quite common in count data. Overdispersion is suspected when the variance is larger than the mean. In semi-parametric analysis of overdispersed count data, one often needs to determine an appropriate variance function (mean-variance relationship). For example, in chemical and biological assay problems, to control the quality of techniques, one has to adjust the levels of experimental factors to bring the mean response to a target value while minimizing variance. The emphasis is on problems involving simultaneous consideration of both mean and variance where the latter may be a function of the former. In this dissertation, by using a hypothesis testing approach through a broader class of models and a data analytic approach, we propose an appropriate mean-variance relationship which can be used in the semi-parametric analysis of count data.
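The informal check mentioned above (variance larger than the mean) can be sketched as a simple dispersion index. This is only a descriptive diagnostic, not the hypothesis-testing procedure proposed in the dissertation. The function name and the counts are hypothetical.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean.

    Under a Poisson model the two are equal, so a value well above 1
    suggests overdispersion. This is a descriptive check only.
    """
    return variance(counts) / mean(counts)

# Hypothetical count data, invented for illustration
counts = [0, 1, 1, 2, 3, 5, 8, 13]

index = dispersion_index(counts)
print(f"mean = {mean(counts):.3f}, variance = {variance(counts):.3f}, index = {index:.2f}")
```

A value near 1 is consistent with the Poisson variance function V(mu) = mu, while larger values point toward alternatives in which the variance grows faster than the mean.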
Recommended Citation
Zaihra, Tasneem, "Inference on some epidemiological indices and variance function in semi-parametric analysis of count data" (2009). Electronic Theses and Dissertations. 7888.
https://scholar.uwindsor.ca/etd/7888