JOURNAL ARTICLE

POWER DETERMINATION FOR GEOGRAPHICALLY CLUSTERED DATA USING GENERALIZED ESTIMATING EQUATIONS

Abstract

Study designs in public health research often require estimating the effects of interventions that are applied to a cluster of subjects in a common geographic area, rather than randomly assigned to individual subjects, and for which the outcome is dichotomous. Statistical methods that account for the intracluster correlation of measurements must be used, or the standard errors of regression coefficients will be underestimated. Generalized estimating equations (GEE) can be used to account for this correlation, although there are no straightforward methods to determine sample-size requirements for adequate power. A simulation study was performed to calculate power in a GEE model for a proposed study of the effect of an intervention designed to reduce lower-back injuries among nursing personnel employed in nursing homes. Nursing homes will be randomly assigned to either an intervention or a control group, and all employees within a nursing home will be treated alike. Historical injury data indicate that the baseline injury risk for each home can be reasonably modelled using a beta distribution. It is assumed that the risk for any individual nurse within a nursing home follows a Bernoulli probability distribution expressed as a logit function of two components: fixed covariates, which represent characteristics of the study population and whose odds ratios are taken from previous studies, and a random-intercept term that is specific to each home. Results indicate that failure to account for intracluster correlation can lead to overestimates of power as well as inflation of type I error by as much as 20 per cent. Although the GEE method accounted for the intracluster correlation when present, estimates of the intracluster correlation were negatively biased when no intracluster correlation was present. In addition, and possibly related to the negatively biased estimates of intracluster correlation, we also found inflated type I error estimates from the GEE method.
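The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and every numeric setting (Beta(2, 18) baseline risk, 20 homes per arm, 50 nurses per home, 300 replicates, an intervention odds ratio of 0.5) is an assumption chosen for illustration. Individual-level covariates are omitted so that a GEE with an independence working correlation and a single treatment indicator has a closed form: the coefficient is the log odds ratio of the pooled arm risks, the naive variance treats nurses as independent, and the robust (sandwich) variance sums squared home-level residuals.

```python
import math
import random


def simulate_arm(n_homes, n_nurses, a, b, log_or, rng):
    """Return per-home injury counts for one study arm."""
    counts = []
    for _ in range(n_homes):
        # Home-specific baseline risk drawn from a beta distribution (assumed shape).
        base = min(max(rng.betavariate(a, b), 1e-9), 1.0 - 1e-9)
        # Random intercept on the logit scale plus the treatment effect.
        logit = math.log(base / (1.0 - base)) + log_or
        risk = 1.0 / (1.0 + math.exp(-logit))
        # Each nurse in the home is a Bernoulli trial with this risk.
        counts.append(sum(rng.random() < risk for _ in range(n_nurses)))
    return counts


def arm_summary(counts, n_nurses):
    """Return (logit of pooled risk, naive variance, robust variance) or None."""
    n_total = len(counts) * n_nurses
    p = sum(counts) / n_total
    if p <= 0.0 or p >= 1.0:
        return None  # logit undefined; the replicate is counted as a non-rejection
    bread = n_total * p * (1.0 - p)
    # Sandwich "meat": squared sums of residuals within each home.
    meat = sum((c - n_nurses * p) ** 2 for c in counts)
    return math.log(p / (1.0 - p)), 1.0 / bread, meat / bread ** 2


def rejection_rates(odds_ratio, n_sims=300, n_homes=20, n_nurses=50,
                    a=2.0, b=18.0, seed=1):
    """Fraction of replicates rejecting H0 at |z| > 1.96: (naive, robust)."""
    rng = random.Random(seed)
    naive = robust = 0
    for _ in range(n_sims):
        ctrl = arm_summary(simulate_arm(n_homes, n_nurses, a, b, 0.0, rng), n_nurses)
        trt = arm_summary(
            simulate_arm(n_homes, n_nurses, a, b, math.log(odds_ratio), rng), n_nurses)
        if ctrl is None or trt is None:
            continue
        diff = trt[0] - ctrl[0]  # estimated log odds ratio
        naive_var = trt[1] + ctrl[1]
        robust_var = trt[2] + ctrl[2]
        naive += abs(diff) / math.sqrt(naive_var) > 1.96
        if robust_var > 0.0:
            robust += abs(diff) / math.sqrt(robust_var) > 1.96
    return naive / n_sims, robust / n_sims


if __name__ == "__main__":
    print("type I error (naive, robust):", rejection_rates(odds_ratio=1.0))
    print("power        (naive, robust):", rejection_rates(odds_ratio=0.5))
```

Calling `rejection_rates(odds_ratio=1.0)` estimates type I error, and any other odds ratio estimates power. The sandwich variance here carries no small-sample correction, so with few homes per arm its rejection rate under the null may still sit somewhat above the nominal 5 per cent.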

Keywords:
GEE; Generalized estimating equation; Statistics; Sample size determination; Covariate; Estimating equations; Odds ratio; Population; Mathematics; Logistic regression; Medicine; Maximum likelihood

Metrics

Cited By: 21
FWCI (Field Weighted Citation Impact): 2.45
Refs: 0
Citation Normalized Percentile: 0.90

Topics

Health Systems, Economic Evaluations, Quality of Life
Social Sciences →  Economics, Econometrics and Finance →  Economics and Econometrics
Healthcare Policy and Management
Social Sciences →  Economics, Econometrics and Finance →  Economics and Econometrics
Health disparities and outcomes
Social Sciences →  Social Sciences →  Health

Related Documents

JOURNAL ARTICLE

Analyzing Cross-Sectionally Clustered Data Using Generalized Estimating Equations

Francis L. Huang

Journal: Journal of Educational and Behavioral Statistics, Year: 2021, Vol: 47 (1), Pages: 101-125

JOURNAL ARTICLE

Semiparametric Regression for Clustered Data Using Generalized Estimating Equations

Xihong Lin, Raymond J. Carroll

Journal: Journal of the American Statistical Association, Year: 2001, Vol: 96 (455), Pages: 1045-1056

JOURNAL ARTICLE

Extended Generalized Estimating Equations for Clustered Data

Daniel B. Hall, Thomas A. Severini

Journal: Journal of the American Statistical Association, Year: 1998, Vol: 93 (444), Pages: 1365-1375