Interaction as Departure from Additivity in Case-Control Studies: A Cautionary Note
Anders Skrondal
From the Division of Epidemiology, Norwegian Institute of Public Health, Oslo, Norway.
It has been argued that assessment of interaction should be based on departures from additive rates or risks. The corresponding fundamental interaction parameter cannot generally be estimated from case-control studies. Thus, surrogate measures of interaction based on relative risks from logistic models have been proposed, such as the relative excess risk due to interaction (RERI), the attributable proportion due to interaction (AP), and the synergy index (S). In practice, it is usually necessary to include covariates such as age and gender to control for confounding. The author uncovers two problems associated with surrogate interaction measures in this case: First, RERI and AP vary across strata defined by the covariates, whereas the fundamental interaction parameter is unvarying. S does not vary across strata, which suggests that it is the measure of choice. Second, a misspecification problem implies that measures based on logistic regression only approximate the true measures. This problem can be rectified by using a linear odds model, which also enables investigators to test whether the fundamental interaction parameter is zero. A simulation study reveals that coverage is much improved by using the linear odds model, but bias may be a concern regardless of whether logistic regression or the linear odds model is used.

additivity; case-control studies; epidemiologic methods; interaction

Abbreviations: AP, attributable proportion due to interaction; RERI, relative excess risk due to interaction; S, synergy index.
Logistic regression analysis is the workhorse of
contemporary epidemiology. Consequently, assessment of interaction
is often performed by simply introducing product terms into
logistic risk models. This practice has been vehemently
criticized by some epidemiologists, who argue that assessment
of interaction should mainly be based on additive rate or risk
models (1–7). For rare outcomes, this notion of interaction
follows from probabilistic independence, as embodied in the
classical toxicologic notion of simple independent action
discussed by Finney (8). The purpose of this article is not to
engage in the debate on how interaction should be
conceptualized in epidemiology. Rather, I confine my investigation to
the performance of suggested measures of interaction as
departure from additivity.
In cohort studies, the desired interaction assessment can
easily be accomplished by fitting linear rate or risk models.
However, the parameters of linear models cannot be validly
estimated for case-control studies unless the sampling
fractions for cases and controls are known or can be estimated.
On the other hand, it is well known that odds ratios can be
estimated in case-control studies. Furthermore, relative risks
are often well approximated by odds ratios in case-control
studies.
On the basis of these observations, Rothman (1, 2)
suggested a synergy index (S) which can be used in
case-control studies to measure interaction as departure from
additive risks. Moreover, Rothman considered statistical
inference for the index, deriving confidence intervals using
the delta method. Rothman presented several additional
measures of interaction (3), including the relative excess risk
due to interaction (RERI), renamed the ICR by Rothman and
Greenland (6), and the attributable proportion due to
interaction (AP), which is the focus in Rothman's latest book (7).
Rothman furthermore pointed out (3, p. 324) that estimates
of RERI, AP, and S are easily obtained from logistic
regression analysis, as are Wald tests and confidence intervals (9).
Alternatively, a likelihood ratio test of additive risks could
be performed in the logistic regression model. Although this
test would be expected to have better properties than the
Wald test, it would be much harder to implement.
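To make the logistic-regression route concrete, the following minimal sketch computes the three measures from the coefficients of a logistic model with a product term, using the standard definitions with odds ratios standing in for relative risks. The function name and the coefficient values are hypothetical and chosen only for illustration; no confidence intervals are computed.

```python
import math

def interaction_measures(beta1, beta2, beta3):
    """Return (RERI, AP, S) from logistic coefficients.

    beta1, beta2: log odds ratios for exposures x1 and x2 alone.
    beta3: coefficient of the product term x1*x2.
    Odds ratios are used as approximations to relative risks.
    """
    or10 = math.exp(beta1)                   # x1 only
    or01 = math.exp(beta2)                   # x2 only
    or11 = math.exp(beta1 + beta2 + beta3)   # both exposures
    reri = or11 - or10 - or01 + 1.0
    ap = reri / or11
    s = (or11 - 1.0) / ((or10 - 1.0) + (or01 - 1.0))
    return reri, ap, s

# Hypothetical coefficients: OR10 = 2, OR01 = 3, and no product term,
# so OR11 = 6 (exact multiplicativity on the odds scale).
reri, ap, s = interaction_measures(math.log(2), math.log(3), 0.0)
print(reri, ap, s)
```

In this hypothetical case the product term is zero, yet RERI is about 2, AP about 1/3, and S about 5/3, which illustrates that absence of interaction on the logistic scale does not imply additivity of risks.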
Discussion of the measures advocated by Rothman is
typically confined to the somewhat unrealistic situation in which
there are two exposures but no additional covariates to
control for confounding. An exception is Flanders and
Rothman (10), who suggested a likelihood approach to
estimating S from stratified case-control data. As Rothman
acknowledged (3), their approach only handles one or
possibly two additional covariates, because otherwise data in
each stratum become too sparse. Hence, Rothman suggests
invoking multivariate methods in estimating RERI, AP,
and S when there are additional covariates. Specifically,
Rothman states, "Confounding factors can be controlled by
including terms for those factors in the multiple logistic
model" (3, p. 324). This suggestion has been adhered to by
epidemiologists (for instance, see Olsen et al. (11)).
There has been a paucity of studies investigating the
performance of RERI, AP, and S. The only paper I am
aware of is that of Assmann et al. (12), where the
investigation was limited to coverage of confidence intervals for
RERI and AP in models without additional covariates. The
primary concern in this article is the extent to which RERI,
AP, and S are useful summary measures of interaction as
departure from additive risks. In addition to the
conventional approach based on logistic regression, I also suggest
an alternative approach based on linear odds models.
Attention is focused on the more realistic setting in which
there are additional covariates. However, the concepts are
best introduced in a setting with two exposures and no
additional covariates.
MODELS FOR TWO EXPOSURES
Let $Y$ be a dichotomous outcome variable with outcomes 1
and 0. Consider the case of two dichotomous exposure
variables $x_1$ and $x_2$ with levels $j = 0, 1$ and $k = 0, 1$,
respectively. Let $R_{jk} = P(Y = 1 \mid x_1, x_2)$ be the
conditional risk or probability that the outcome variable $Y$
takes the value 1 given the values of the exposures. For all
$j$ and $k$, define risk differences as $RD_{jk} = R_{jk} - R_{00}$,
relative risks as $RR_{jk} = R_{jk}/R_{00}$, odds as
$O_{jk} = R_{jk}/(1 - R_{jk})$, and odds ratios as
$OR_{jk} = O_{jk}/O_{00}$.
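The definitions above can be sketched in a few lines of Python. The risks used here are hypothetical values for a rare outcome, and the function name and dictionary layout are assumptions made only for this illustration.

```python
def effect_measures(risks):
    """Compute RD, RR, odds, and OR for each exposure combination.

    risks: dict mapping (j, k) to the risk R_jk, where j and k are
    the levels (0 or 1) of exposures x1 and x2.
    """
    r00 = risks[(0, 0)]
    o00 = r00 / (1.0 - r00)
    measures = {}
    for (j, k), r in risks.items():
        o = r / (1.0 - r)
        measures[(j, k)] = {
            "RD": r - r00,    # risk difference relative to no exposure
            "RR": r / r00,    # relative risk
            "O": o,           # odds
            "OR": o / o00,    # odds ratio
        }
    return measures

# Hypothetical risks for a rare outcome.
risks = {(0, 0): 0.01, (1, 0): 0.03, (0, 1): 0.02, (1, 1): 0.06}
measures = effect_measures(risks)
for cell in sorted(measures):
    print(cell, measures[cell])
```

With risks this small, each odds ratio lies close to the corresponding relative risk (for the doubly exposed group, roughly 6.3 against 6), which is the rare-outcome approximation relied on in case-control analysis.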
The linear risk model
A linear risk model is now specified as
$$R_{jk} = a + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2,$$
where it is assumed that $a > 0$, $b_1 > 0$, and $b_2 > 0$. It
follows that $a = R_{00}$, $b_1 = R_{10} - R_{00} = RD_{10}$, and
$b_2 = R_{01} - R_{00} = RD_{01}$. Hence, $a$ is interpreted as the
risk when there is no exposure, $b_1$ as the excess risk under
exposure $x_1$ (compared with no exposure whatsoever), and
$b_2$ as the excess risk under exposure $x_2$. The parameter
$b_3$ can be expressed as
$$b_3 = R_{11} - R_{10} - R_{01} + R_{00} = RD_{11} - RD_{10} - RD_{01},$$
representing the excess risk due to interaction of the
exposures. If $b_3 = 0$, $RD_{11} = RD_{01} + RD_{10}$, which is
risk-difference additivity. According to Rothman (3, p. 320),
$b_3$ is the most fundamental epidemiologic measure of
interaction.
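As a worked illustration with hypothetical risks (chosen only for arithmetic convenience, not taken from any study), suppose the four risks are 0.01 with no exposure, 0.03 under the first exposure only, 0.02 under the second exposure only, and 0.06 under both. The parameters of the linear risk model then follow directly from the definitions:

```latex
\begin{align*}
a   &= R_{00} = 0.01, \\
b_1 &= R_{10} - R_{00} = 0.03 - 0.01 = 0.02, \\
b_2 &= R_{01} - R_{00} = 0.02 - 0.01 = 0.01, \\
b_3 &= R_{11} - R_{10} - R_{01} + R_{00}
     = 0.06 - 0.03 - 0.02 + 0.01 = 0.02.
\end{align*}
```

Here the risk difference under joint exposure, 0.05, exceeds the sum of the two single-exposure risk differences, 0.03, by exactly the interaction parameter 0.02, so these hypothetical risks depart from risk-difference additivity.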
Unfortunately, the linear risk model cannot in general be
validly estimated from case-control designs, unless the
sampling fraction of cases and controls is known or can be
estimated. Since this rarely appears to be the case, it follows
that direct inferenc (...truncated)