Confounding and measurement error correction in epidemiological research
In biomedical research, data collected primarily for research are increasingly being combined with additional sources of data, such as electronic healthcare records, registries, and biobanks, into large research databases. Many are convinced that this will bring ample opportunities for epidemiologic research, for example to study and predict the effects of medical interventions (or treatments). However, much of this secondary data arises from daily practice and is not collected specifically for research purposes. In daily practice, for example, treatments are not allocated at random. As a result, obtaining valid and reliable estimates of treatment effects from these additional data sources will crucially depend on proper consideration of methodological issues such as measurement error and confounding. Essential unanswered questions regarding both of these topics need to be addressed to reduce the risk of bias when conducting research using these secondary databases. The studies presented in this thesis aimed to provide further insight into how to minimize the risk of bias when evaluating the effects of (multiple) treatments using large routine care databases. In particular, methods for confounding adjustment and measurement error adjustment were evaluated both separately and in combination.
In Chapters 2 and 3, simulation studies showed that propensity score (PS) methods can be considered viable alternatives to regression-based methods when adjusting for confounding in multi-treatment settings. Considering the inherent benefits of using PS methods for confounding adjustment, the findings of these two chapters suggest that these methods, and PS adjustment in particular, are appropriate alternatives to traditional logistic regression analysis when profiling many providers or comparing the effectiveness of multiple treatment options.
Chapters 4 through 6 discussed the attention given to measurement error and its impact on estimated causal associations or on the validation of clinical prediction models. These three chapters demonstrate the need for increased awareness of the potentially important, yet often unpredictable, impact of measurement error in covariates or predictors. Together with additional guidance on the use of measurement error correction methods, this awareness may stimulate researchers to account for potential measurement error in medical research. In addition, researchers should be aware that routine healthcare data may contain variables with substantial measurement error.
In Chapter 7, methods were compared for adjusting for a confounder (specifically, disease severity) that is measured differently across centers in multicenter studies of medical treatments. In a simulation study, multiple scenarios were investigated in which the availability of different disease severity measures varied across centers depending on center-level characteristics. A method based on multiple imputation of missing confounder information was most accurate in estimating the treatment effect.
In Chapter 8, an existing Bayesian sample size estimation method was adapted to facilitate interim sample size re-estimations based on an estimate of the variance of the outcome. Simulations showed that this method accurately estimated the required sample size while controlling frequentist performance by employing power priors.