Incorporating prior beliefs about selection bias into the analysis of randomized trials with missing outcomes

Biostatistics (2003), 4, 4, pp. 495–512
Printed in Great Britain

DANIEL O. SCHARFSTEIN†
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA

MICHAEL J. DANIELS
Department of Statistics, University of Florida, Gainesville, FL 32611, USA

JAMES M. ROBINS
Department of Epidemiology and Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA

SUMMARY

In randomized studies with missing outcomes, non-identifiable assumptions are required to hold for valid data analysis. As a result, statisticians have been advocating the use of sensitivity analysis to evaluate the effect of varying assumptions on study conclusions. While this approach may be useful in assessing the sensitivity of treatment comparisons to missing data assumptions, it may be dissatisfying to some researchers/decision makers because a single summary is not provided. In this paper, we present a fully Bayesian methodology that allows the investigator to draw a ‘single’ conclusion by formally incorporating prior beliefs about non-identifiable, yet interpretable, selection bias parameters. Our Bayesian model provides robustness to prior specification of the distributional form of the continuous outcomes.

Keywords: Dirichlet process prior; Identifiability; MCMC; Non-parametric Bayes; Selection model; Sensitivity analysis.

1. INTRODUCTION

In randomized studies with missing outcomes, it is well known that non-identifiable assumptions (e.g. missing at random; Rubin, 1976) are required to hold for valid data analysis. The degree to which these untestable assumptions are believed can have a substantial impact on study conclusions. With this in mind, statisticians have been advocating the use of sensitivity analysis to evaluate the effect of varying assumptions on study conclusions. For example, Rotnitzky et al. (1998, 2001), Scharfstein et al. (1999) and Robins et al.
(2000) adopted a selection modeling approach, while Rubin (1977), Little (1994) and Daniels and Hogan (2000) used a pattern-mixture formulation. These approaches rely heavily on expert opinions about plausible ranges for non-identifiable, yet interpretable, sensitivity analysis parameters.

† To whom correspondence should be addressed.
© Oxford University Press; all rights reserved.

2. ACTG 175

ACTG 175 was a randomized, double-blind trial designed to evaluate nucleoside monotherapy versus combination therapy in HIV-infected individuals with CD4 counts between 200 and 500 mm⁻³. A total of 2467 subjects were randomized to one of four treatment arms: 619 to AZT (600 mg a day) alone, 613 to AZT (600 mg a day) + ddI (400 mg a day), 615 to AZT (600 mg a day) + ddC (2.25 mg a day), and 620 to ddI (400 mg a day) alone (Hammer et al., 1996). CD4 counts were scheduled to be collected at baseline, week 8, and then every 12 weeks thereafter. Additional baseline characteristics were also collected.

In the interest of space, we focus attention on the AZT+ddI and ddI treatment arms. Also, we ignore all recorded information except the CD4 count to be measured at week 56. One goal of the investigators was to compare the treatment-specific distributions of CD4 cell count at week 56 had all subjects remained on their assigned treatment through that week. Thus, it is useful to define a completer as a subject who stays on therapy and is measured at week 56; otherwise, we define the subject as a drop-out. In this paper, we do not distinguish between the multiple causes of drop-out.

The percentage of drop-outs in the AZT+ddI and ddI arms is 33.6% and 26.5%, respectively. To address the above objective, a completers-only analysis is usually performed. The mean CD4 count at week 56 for completers (standard error) is 384.96 (8.53) and 359.59 (7.67) in the AZT+ddI and ddI arms, respectively. The difference in means is 25.36 and the associated 95% confidence interval is (2.87, 47.85); a test of the null hypothesis of no treatment difference has an associated p-value of 0.027, taken to be evidence of the superiority of AZT+ddI over ddI. The above estimates of the means, under full completion, are only valid if the completers and drop-outs are similar on measured (ignored) and unmeasured characteristics (i.e. missing at random). This latter, non-identifiable assumption is unlikely to hold, as it is well known from other studies that drop-outs tend to be very different from completers.

Our goal is to present two alternative and complementary analysis strategies for the ACTG 175 data. The first approach is frequentist, while the second is Bayesian.

While the above methodological developments are useful in assessing the sensitivity of treatment comparisons to missing data assumptions, it may be dissatisfying to some researchers/decision makers because a single summary is not provided. A fully Bayesian analysis allows the investigator to draw a ‘single’ conclusion by formally incorporating prior beliefs about model parameters. For categorical outcomes, Robins et al. (1999) and Raab and Donnelly (1999) developed fully Bayesian selection modeling approaches, while Forster and Smith (1998) developed a pattern-mixture approach. For continuous outcomes, Lee and Berger (2001), building on the work of Bayarri and Degroot (1987) and Bayarri and Berger (1998), developed a semiparametric Bayesian selection modeling approach, which places strong distributional assumptions on the outcome and weak distributional assumptions on the selection mechanism. In this paper, we consider the continuous outcome setting but take an opposite tack from Lee and Berger (2001).
That is, we place strong prior restrictions on the selection mechanism, but relax the distributional restrictions on the outcome. Our tack is motivated by the fact that, in the clinical trial setting, investigators may have firmer beliefs about the selection mechanism as opposed to the distributional form of the outcome. The flexibility we seek makes the problem challenging. As a result, we restrict ourselves to the setting in which additional covariate information is ignored. By closely examining this scenario, we will gain insight into the more difficult and realistic setting, in which covariate information is utilized. This latter setting will be addressed in a sequel. The paper is organized as follows. In Section 2, we describe an AIDS clinical trial which will provide context for the methods discussed throughout. In Section 3, we formalize the data structure of the AIDS study. In Section 4, we review the frequentist, non-parametric sensitivity analysis approach of Rotnitzky and colleagues. This review provides a backdrop for our flexible Bayesian approach, developed in Section 5. In Section 6, we analyze the AIDS data from both the frequentist and Bayesian perspective and compare results. Section 7 is devoted to a disc (...truncated)
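
As a rough check on the completers-only comparison reported in Section 2, the difference in means, its 95% confidence interval and the p-value can be recovered from the two arm-specific means and standard errors alone. The sketch below assumes a large-sample normal (Wald) comparison of two independent means, which the text does not state explicitly; the function name `compare_means` is hypothetical and introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def compare_means(mean_a, se_a, mean_b, se_b, level=0.95):
    """Wald comparison of two independent means from summary statistics.

    Assumption (not stated in the paper): the two arms are independent and
    the normal approximation is adequate, so the standard error of the
    difference is the root sum of squares of the arm-specific standard errors.
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    diff = mean_a - mean_b
    se_diff = sqrt(se_a ** 2 + se_b ** 2)
    # Critical value for a two-sided interval at the requested level.
    z_crit = std_normal.inv_cdf(0.5 + level / 2)
    z = diff / se_diff
    # Two-sided p-value for the null hypothesis of no difference.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return diff, diff - z_crit * se_diff, diff + z_crit * se_diff, p_value


# Completers' mean CD4 count at week 56 (standard error), as reported:
# AZT+ddI: 384.96 (8.53); ddI: 359.59 (7.67).
diff, lower, upper, p_value = compare_means(384.96, 8.53, 359.59, 7.67)
print(f"difference = {diff:.2f}")
print(f"95% CI     = ({lower:.2f}, {upper:.2f})")
print(f"p-value    = {p_value:.3f}")
```

Because the inputs are the rounded published values, the computed difference is 25.37 rather than the reported 25.36, and the interval endpoints agree with the reported (2.87, 47.85) only to within about 0.02. The check says nothing about whether the completers-only estimate is valid; that depends on the untestable missing at random assumption discussed in Section 2.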