meth stuff: How to probe simple and complex slopes in a regression analysis with three (or more) predictors using SPSS

(...or any other statistical software package, but the syntax examples here are for SPSS)

The procedure described here is the generalization of the procedure for estimating simple slopes for two predictors recommended in
Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Thousand Oaks: Sage.

Let Xraw, Wraw, and Zraw be three predictors/independent variables and Y the dependent variable. In all that follows, it is assumed that X, W, and Z are already centered, i.e.,

if the predictor is continuous/approximately continuous, the mean of the raw values across the entire sample has been subtracted from the raw values for new variables. E. g. for variable Xraw:
```
COMPUTE help_can_be_deleted_later = 1.
EXEC.
AGGREGATE /BREAK help_can_be_deleted_later /temp_mean= MEAN(Xraw).
EXEC.
COMPUTE X = Xraw - temp_mean.
EXEC.
DELETE VARIABLES help_can_be_deleted_later temp_mean.
EXEC.
```
This weird and complicated looking code exempts you from manually finding the mean of Xraw and hard-coding it into your syntax (in reality, it is not that bad). This way, you need not to worry later whether the mean of Xraw must be updated in your syntax in case you add another portion of observations to the data set, or realized that some of the cases were, for example, test cases that really should not be analyzed. Then this piece of syntax will always recalculate your mean so that the mean deviated (i.e., centered) score of X is always correct.
if the predictor is dichotomous, it is recoded into a variable with mean 0, e.g. by effect coding preferably [-.5, +.5], but [-1,+1] is also good. It is not recommended to dummy code the variable [0,1], even though it will not yield to incorrect results, but imposes some additional care in interpreting the simple slopes. E. g., Wraw is dichotomous and coded [1,2]:
```
RECODE Wraw (1=-.5)(2=.5)(ELSE=SYSMIS) INTO W.
```

There are now three basic regression model types:

The 'overall model' that gives average effects of all variables and interactions, but most important presently: the test of whether a three-way interaction is present among the predictors. If this three-way interaction is not present, then examining simple slopes is only sensible for exploratory purposes. In contrast, an argument that the two-way interactions of two of the predictors differ conditional on the thirs of the predictors, or any argument involving a claim that some simple slopes are different (because they appear so numerically) is mute and not permissible.
The 'complex slopes' - regression models that probe the two-way interaction of two predictors at a specific value of the third predictor. These models may become large in number quickly, for example, for X, W, and Z you might want to examine
- the X*W interaction at relatively low values of Z (Z = -1SD)
- the X*W two-way interaction at relatively high values of Z (Z = +1SD)
- the X*W two-way interaction at any value of Z you fancy
- the X*Z interaction at relatively low values of W (W = -.5)
- the X*Z two-way interaction at relatively high values of W (W = +.5)
- the X*Z two-way interaction at any value of W you fancy
- the W*Z interaction at relatively low values of X (X = -1SD)
- the W*Z two-way interaction at relatively high values of X (X = +1SD)
- the W*Z two-way interaction at any value of X you fancy
Which of those you examine is a choice that must be made on the grounds of your hypothesis (if you predict that X*W will be different depending on Z, then don't test X*Z dependening on W) and theoretical sense.
The simple slopes - the slopes of one predictor at specific values of the other two predictors. E. g.
- slope of X predicting Y when W = +1 and Z = -1
- slope of X predicting Y when W = +1 and Z = +1
- slope of X predicting Y when W = -1 and Z = -1
- slope of X predicting Y when W = -1 and Z = +1
- slope of X predicting Y when W = 0 (an average value of W in your sample) and Z = -1
- slope of X predicting Y when W = 0 (an average value of W in your sample) and Z = +1
- slope of X predicting Y when W is some value that for some reason has theoretical significance (e.g., cut-off point on a standardized and normed clinical diagnosis checklist) and when Z is some value that for some reason has theoretical significance (e.g., age = 8 years in a sample where the mean age is 12 years and the standard deviation is 1.5 years - you could be interested in just kids who are 8 years old)
- slope of W predicting Y when X is some value that for some reason has theoretical significance and when Z is some value that for some reason has theoretical significance
- slope of Z predicting Y when X is some value that for some reason has theoretical significance and when W is some value that for some reason has theoretical significance
- ...[insert any set of values of the other predictors you wish]...

Overall analysis

Regress Y on all predictors and all possible interaction terms:

COMPUTE XW = X*W.
COMPUTE XZ = X*Z.
COMPUTE WZ = W*Z.
COMPUTE XWZ = X*W*Z.
REG /DEP Y /MET=ENT X W Z XW XZ WZ XWZ.

	Estimate	Std. Error	t value	Pr(&gt \|t\|)
Intercept/Constant	-1.7007	2.5634	-0.66	0.5083
X	0.7536	0.2307	3.27	0.0014
W	2.3373	3.8422	0.61	0.5441
Z	0.6618	0.5712	1.16	0.2489
XW (X * W Interaction)	-0.2055	0.3284	-0.63	0.5326
XZ (X * Z Interaction)	-0.0450	0.0527	-0.85	0.3945
WZ (W * Z Interaction)	-0.5643	0.8752	-0.64	0.5203
XWZ (W * W * Z Interaction)	0.7630	0.0752	10.14	0.0000

The Intercept/Constant is the weighted mean of Y for cases with X=0, W=0, Z=0. If all predictors have been centered as described above, the coefficient for the Intercept/Constant is the weighted grand mean for the entire sample.
The regression coefficients of X, W, and Z as single predictors indicates the effect of these predictors above and beyond any other predictions that are achieved in the model (i.e.: predictions of the other single predictors and all interaction terms).
In case there are significant interaction terms involving a specific predictor, the single predictor coefficient for this specific predictor is a conditional effect and should - under NO CIRCUMSTANCES - be called 'main effect' - it is not a 'main effect' and therefore should not be called this way. In the example: Here is where the centering of variables comes in handy: If all predictors were centered beforehand, then these conditional single predictor effects are average effects: the effects of the variable with which they are associated if all other preditors take on value 0, and because of the centering, this value 0 of the other predictors indicates their average.
If no interaction effects whatsoever emerge in this analysis, then these single predictor effects can be referred to as main effects: they simply are present everywhere in the design, and the value of the other predictors is of no import to them.

The regression coefficients for the two-way interaction terms can - similarly - only be interpreted as average two-way interaction effects (i.e. interaction effects at the value 0 of the third predictor), if a three-way interaction is present. If no three-way interaction emerges (see below), then these two-way interactions can be interpreted as interactions that are present across the entire sample, the two-way interaction is of the same magnitude (possible 0 magnitude, though!), no matter what value the third variable takes.

If the three-way interaction term is significant, then any two-way interaction effects found in the previous step are really just the interactions at the value 0 of the third predictor. In this case it is recommended to estimate the 'complex slopes', i.e. conditional two-way interactions (see next step).

Complex slopes

Identify the predictor for whose specific value you want to estimate the two-way interaction of the remaining predictors and then create a new variable that is 0 at this specific value:

If you want to estimate the X*W interaction at the value Z = +1SD above the mean of Z: Calculate the standard deviation of Z and subtract it from Z to obtain a new variable Z_hi:

COMPUTE help_can_be_deleted_later = 1.
EXEC.
AGGREGATE /BREAK help_can_be_deleted_later /temp_sd= SD(Zraw).
EXEC.
COMPUTE Z_hi = Z - temp_sd.
EXEC.
DELETE VARIABLES help_can_be_deleted_later temp_sd.
EXEC.

If you want to estimate the X*W interaction at the value Z = -1SD below the mean of Z: Calculate the standard deviation of Z and subtract it from Z to obtain a new variable Z_lo:

COMPUTE help_can_be_deleted_later = 1.
EXEC.
AGGREGATE /BREAK help_can_be_deleted_later /temp_sd= SD(Zraw).
EXEC.
COMPUTE Z_hi = Z + temp_sd.
EXEC.
DELETE VARIABLES help_can_be_deleted_later temp_sd.
EXEC.

If you want to estimate the X*W interaction at some specific value Z = <specific value> of Z: "Shift" the variable Z into a new variable Z_specific, so that Z_specific has the value 0 where Z has the value 0.
```
COMPUTE Z_specific = Z - <specific value>.
EXEC.
```
...and analogously if you wish to estimate the interaction X*Z at different values of W, or the interaction W*Z at different values of X...

: Any of the variables that are "shifted" in this way are computed for all observations in the data set. Do not do any median splits, filtering or anything of the sort - for this type of simple slope analysis, all cases are used in all estimations.

that if you want the new variable to have 0 for cases that have a specific value on the original variable, you have to subtract the specific value. Therefore, if you want the new value to have 0 where the original variable had +1SD, you have to take

COMPUTE Z_hi = Z - sd.

and if you want Z_hi to have value 0 where the original variable has -1SD, you have to

COMPUTE Z_lo = Z - (-sd) = Z + sd.

Examples of "shifting variables"

original continuous variable	-3	-2	-1	0	1	2	3
	+2	+2	+2	+2	+2	+2	+2
variable to compute a slope at original variable = -2	-1	0	1	2	3	4	5
add 2 in order to get a variable with 0 at the 'old' value minus 2

original continuous variable	-3	-2	-1	0	1	2	3
	-1	-1	-1	-1	-1	-1	-1
variable to compute a slope at original variable = +1	-2	-1	0	1	2	3	4
subtract 1 in order to get a variable with 0 at the 'old' value plus 1

original dichotomous variable	-1	1
	-1	-1
variable to compute a slope in the condition that is coded as +1 in the original variable	-2	0
subtract 1 in order to get a variable with 0 at the 'old' value plus 1

original dichotomous variable	-1	1
	+1	+1
variable to compute a slope in the condition that is coded as -1 in the original variable	0	2
add 1 in order to get a variable with 0 at the 'old' value minus 1

original dichotomous variable	-.5	.5
	-.5	-.5
variable to compute a slope in the condition that is coded as -.5 in the original variable	-1	0
subtract .5 in order to get a variable with 0 at the 'old' value plus .5

original dichotomous variable	-.5	.5
	+.5	+.5
variable to compute a slope in the condition that is coded as +.5 in the original variable	0	1
add .5 in order to get a variable with 0 at the 'old' value minus .5

With these shifted variables, you can now ask SPSS to produce the complex slope, i.e. the two-way interaction of two of the predictors at a specific value of the shifted predictor. For example, you want to know how X and W interact within those observations that are in the group designated by Z = -1 vs. in the group that is designated by Z = +1 [Z here being assumed to be dichotomous]: "shift" Z into Z_lo so that Z_lo has the value 0 where Z has the value -1, i.e. add one to Z in order to obtain Z_lo. Then, update all those interaction terms from the overall model that contain Z, so that they are now calculated based on Z_lo instead of Z:

COMPUTE Z_lo = Z + 1.     /*can be reused from earlier*/
COMPUTE XW = X*W.     /*can be reused from earlier*/
COMPUTE XZ_lo = X*Z_lo.
COMPUTE WZ_lo = W*Z_lo. 
COMPUTE XWZ_lo = X*W*Z_lo.
REG /DEP Y /MET=ENT X W Z_lo XW XZ_lo WZ_lo XWZ_lo.

	Estimate	Std. Error	t value	Pr(&gt \|t\|)
Intercept/Constant	xx	xx	xx	xx
X	xx	xx	xx	xx
W	xx	xx	xx	xx
Z_lo	xx	xx	xx	xx
XW (X * W Interaction)	xx	xx	xx	xx
X * Z_lo	xx	xx	xx	xx
W * Z_lo	xx	xx	xx	xx
X * W * Z_lo	xx	xx	xx	xx

The term XW (specially marked in the above result output) represents the X × W interaction at Z = -1: Its coefficient has a t-value associated with it as well as a p value, so you can report its size and whether it is sufficiently remote from a zero coefficient for your to confidently decide that it is an effect.

You can ignore coefficients for any other terms except maybe the Intercept/Constant: it represents the weighted mean of Y for Z = +1SD. It may come handy during your creating plots.

Proceed analogously in order to test the X×W interaction for cases that have Z = +1. "Shift" Z into Z_hi by subtraction 1, update the interaction terms involving Z and read off the coefficient for the XW term the model.

COMPUTE Z_hi = Z - 1.     /*can be reused from earlier*/
COMPUTE XW = X*W.     /*can be reused from earlier*/
COMPUTE XZ_hi = X*Z_hi.
COMPUTE WZ_hi = W*Z_hi.
COMPUTE XWZ_hi = X*W*Z_hi.
REG /DEP Y /MET=ENT X W Z_hi XW XZ_hi WZ_hi XWZ_hi.

	Estimate	Std. Error	t value	Pr(&gt \|t\|)
Intercept/Constant	xx	xx	xx	xx
X	xx	xx	xx	xx
W	xx	xx	xx	xx
Z_hi	xx	xx	xx	xx
XW (X * W Interaction)	xx	xx	xx	xx
X * Z_hi	xx	xx	xx	xx
W * Z_hi	xx	xx	xx	xx
X * W * Z_hi	xx	xx	xx	xx

Of course the same works for a continuous variable Z, you can pick any value of Z = <Zvalue> you want the X×W interaction for, calculate a new variable Z' by subtracting <Zvalue> from Z (or reuse it if you have computed it before), update all interaction terms that involve Z so that they are based on Z' (or reuse them if previously computed) and run the model with the new variable and the updated interaction terms.

The different complex slopes may be of a variety of patterns:

The W × X @ Z = -1 interaction might be significant while the W × X @ Z = +1 interaction may not.
The W × X @ Z = -1 interaction might be not significant while the W × X @ Z = +1 interaction may be significant.
The W × X @ Z = -1 interaction might be significant and the W × X @ Z = +1 interaction may also be significant, but they are in different "directions", i.e. the pattern may be different.
The W × X @ Z = -1 interaction might be significant and the W × X @ Z = +1 interaction may also be significant, but the pattern is more pronounced in the first than the latter (or vice versa).
The W × X @ Z = -1 interaction might be not significant and the W × X @ Z = +1 interaction may also be not significant, but the patterns are quite opposite.

In either case: p<.05 for the three-way interaction from the overall analysis guarantees that the X × W interactions are different, but it does not say whether and which of the two will be significant and which not.

The same goes for any of the other two-way interactions at different values of the respective third predictor: These two-way interactions will be different from one another (not necessarily both from 0!) depending on the third predictor if the three-way interaction above was significant. You can assess each two-way interaction at specific values of the third predictor by shifting the third püredictor, recalculating the interaction terms and reading of the coefficient/t-value/p-value of the two-way interaction in focus.

Simple slopes

Once you have conducted the complex slopes analyses above, you probably want to know the exact pattern of the conditional two-way interactions that you obtained (e.g., the different X×W interactions at Z = +1 vs. Z = -1). Now you are back at simple slopes calculation as it is done for models with only two predictors, but you do not only shift one predictor (the "other one" in the two predictor case), but this time you shift two of the three predictors and observe the slope of the third predictor for the values of the other two that you "shifted" them to. Let's say you want the slope of X if W = +1SD (W being continuous) und Z = -1 (Z is dichotomous here): Shift W to W_hi so that W_hi has a zero where W used to have a value equal to +1SD of W - i.e., subtract 1 SD of W from W -, and shift Z to Z_lo, so that Z_lo has a value of zero where Z is -1 - i.e., add 1 to Z to obtain Z_lo (or reuse the variable if you have computed it before, e.g., for the complex slopes).
Then, update all interaction terms that involve W and Z so that they are now based on W_hi and Z_lo and run the entire model again with the new terms.

COMPUTE W_hi = W - 1.
COMPUTE Z_lo = Z + 1.     /*can be reused from earlier*/
COMPUTE XW_hi = X*W_lo.
COMPUTE W_hiZ_lo = W_lo*Z.
COMPUTE XZ_lo = X*Z_lo.
COMPUTE XW_hiZ_lo = X*W_lo*Z.
REG /DEP Y /MET=ENT X W_hi Z_lo XW_hi XZ_lo W_hiZ_lo XW_hiZ_lo.

	Estimate	Std. Error	t value	Pr(&gt \|t\|)
Intercept/Constant	xx	xx	xx	xx
X	xx	xx	xx	xx
W_hi	xx	xx	xx	xx
Z_lo	xx	xx	xx	xx
X * W_hi	xx	xx	xx	xx
X * Z_lo	xx	xx	xx	xx
W_hi * Z_lo	xx	xx	xx	xx
X * W_hi * Z_lo	xx	xx	xx	xx

The most interesting thing from this analysis is the coefficient associated with X: It is the simple slope of X in predicting Y for cases that are 1 SD above the mean on W and in the group designated by Z = -1.

You can now repeat this type of analysis for any combinations of W and Z and generate the simple slope of X in this fashion., but also the simple slopes of Z at any combination of W and X, and the simple slopes of W at any combination of X and Z. Just shift all the predictors you want as background conditions for the simple slope of a specific predictor, update the interaction terms to reflect the shifting and read off the coefficient along with its t and p values for the specific predictor.

You may also be interested in the intercept from this analysis (marked in light blue), especially if you are planning to create slope diagrams with unstandardized coefficients. This coefficient represents the weighted mean of Y for cases that are 1 SD above the mean on W and in the group designated by Z = -1, averaged over X.

Any other coefficients here are of limited interest.

You can generalize this procedure to any number of predictors, and you can shift any number of these predictors to suit you needs. If you want to know the mean of Y and the slope of a püarticular variable at a specific combination of values of the other predictors, then shift all other predictors to that value, recalculate all interaction terms involving the shifted predictors, run the analysis and read off the intercept and slope of the variable of interest. With respect to all unshifted predictors, these means and slopes are average values, provided that you have centered all these unshifted predictors. The intercepts and slopes are means and slopes at the specific values of the shifted variables and at the values 0 (=means, if centered) of the unshifted predictors.

This might get messy, especially if you want simple slopes for many different combinations of variables. But if you do this type of analysis a few times and keep your syntax organized, you will get the hang of it pretty quickly. Also, once you have done this a few times for two or three predictors, you can also do it for 17 predictors. In principle at least, because oversight over the different simple slopes combinations will be lost by then and no one will be able to make the least bit of substantive sense out of the slopes - but you can compute them!

meth stuff

20130918

How to probe simple and complex slopes in a regression analysis with three (or more) predictors using SPSS

Overall analysis

Complex slopes

Examples of "shifting variables"

Simple slopes

No comments:

About Me

20130918

How to probe simple and complex slopes in a regression analysis with three (or more) predictors using SPSS

Overall analysis

Complex slopes

Examples of "shifting variables"

Simple slopes

No comments:

Subscribe To