Generalized Estimating Equations and Gaussian Estimation in Longitudinal Data Analysis

Xuemao Zhang, University of Windsor


In this dissertation, we first develop a Gaussian estimation procedure for the regression parameters in correlated (longitudinal) binary response data using a working correlation matrix, and compare it with the generalized estimating equations (GEE) method and the weighted GEE method. A Newton-Raphson algorithm is derived for estimating the regression parameters from the Gaussian likelihood estimating equations when the correlation parameters are known. The correlation parameters of the working correlation matrix are estimated by the method of moments. Consistency properties of the estimators are discussed. A simulation comparison of the efficiency of the Gaussian estimates and the GEE estimates of the regression parameters shows that the Gaussian estimates using the unstructured correlation matrix of the responses for a subject are, in general, more efficient than those from the other methods compared. The next best are the Gaussian estimates using the general autocorrelation structure. Two data sets are analyzed and a discussion is given.

The main advantage of GEE is that it yields asymptotically unbiased estimates of the marginal regression coefficients even if the correlation structure is misspecified. However, the technique requires a large sample size. We therefore propose two bias-corrected GEE estimators of the regression parameters in longitudinal data for use when the sample size is small. Simulations show that the proposed methods do well in reducing bias and have, in general, higher efficiency than the GEE estimates. Two examples are analyzed and a discussion is given.

The current GEE method focuses on modeling the working correlation matrix, assuming a known variance function. However, Wang and Lin (2005) showed that if the variance function is misspecified, the correct choice of the correlation structure may not necessarily improve estimation efficiency for the regression parameters. We therefore propose a GEE approach to estimating the variance parameters when the form of the variance function is known. This approach borrows the idea of Davidian and Carroll (1987): it solves a nonlinear regression problem in which the residuals are regarded as the responses and the variance function is regarded as the regression function. Simulations show that the proposed method performs as well as the modified pseudolikelihood approach developed by Wang and Zhao (2007).
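
The residual-regression idea can be illustrated with a minimal Python sketch. It is not the estimator developed in the dissertation. It assumes a power variance function Var(y) = phi * mu**theta, treats observations as independent (a working-independence simplification), and uses squared residuals as the responses. The function name `estimate_variance_params` and the parameters `phi` and `theta` are hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_variance_params(y, mu, n_iter=50, tol=1e-8):
    """Estimate (phi, theta) in the assumed model E[(y - mu)^2] = phi * mu**theta.

    Squared residuals play the role of responses and the variance function
    plays the role of the regression function. The weighted estimating
    equation  sum_i x_i * (r2_i - v_i) / v_i = 0,  with x_i = (1, log mu_i)
    and v_i = phi * mu_i**theta, is solved by Fisher scoring.
    All fitted means `mu` must be strictly positive.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    r2 = (y - mu) ** 2                                   # squared residuals
    X = np.column_stack([np.ones_like(mu), np.log(mu)])  # design on the log scale
    beta = np.array([np.log(r2.mean()), 0.0])            # start at constant variance
    for _ in range(n_iter):
        v = np.exp(X @ beta)                             # current variance function values
        # Fisher scoring step: weights 1/v^2 make the expected information X'X
        step = np.linalg.solve(X.T @ X, X.T @ ((r2 - v) / v))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return float(np.exp(beta[0])), float(beta[1])


# Illustrative use on simulated data (true values are assumptions of the demo).
rng = np.random.default_rng(0)
mu_demo = rng.uniform(1.0, 10.0, size=5000)
y_demo = mu_demo + rng.normal(size=5000) * np.sqrt(0.5 * mu_demo ** 1.5)
phi_hat, theta_hat = estimate_variance_params(y_demo, mu_demo)
print(phi_hat, theta_hat)
```

In practice the fitted means would come from the current regression estimates, and the two steps would be alternated until both sets of parameters stabilize. The sketch uses the simulated means directly to keep the example short.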