20140219

My interaction is not significant, but the simple slopes are...

Oftentimes, people are confused by a statistical interaction pattern they find - or rather, by the lack of one. Let's say you have two potential explanatory variables X1 and X2 and a dependent variable Y. You estimate a regression model with X1, X2, and their product X1*X2 as predictors and Y as the criterion. You find that the p value associated with the interaction term is far from the 'holy hurdle' of p = .05: not significant by a long shot. So the decision is clear: I cannot claim that an interaction of the size found is far enough from zero to assertively claim an interaction effect (at least not with a type I error of .05 and a reasonable type II error = 1 - β). So it is more reasonable to consider X1 and X2 as having individual, additive effects on Y - in other words: X1 and X2 only have true main effects on Y.
Let's have an example. (The R code for this example is available at https://raw.githubusercontent.com/johannjacoby/interaction_and_slopes/master/interaction%20&%20slopes.R.)

#get example data with predictors X1 and X2 and a dependent variable Y
ds <- as.data.frame(
  read.table("https://raw.github.com/johannjacoby/interaction_and_slopes/master/no_interaction_differing_slopes.dat",
             header=T, sep="\t"))
#center both predictors
ds$centered.X1 <- scale(ds$X1, center=T, scale=F)
ds$centered.X2 <- scale(ds$X2, center=T, scale=F)
#shift X2 to obtain the simple slope of X1 @ 1SD below the mean of X2
ds$centered.X2.lo <- ds$centered.X2 + sd(ds$X2)
#shift X2 to obtain the simple slope of X1 @ 1SD above the mean of X2
ds$centered.X2.hi <- ds$centered.X2 - sd(ds$X2)
#### yes, it is correct, to get ds$centered.X2.hi you have to subtract 1 SD,
#### and in order to get ds$centered.X2.lo you have to add 1 SD.
# now estimate the basic regression model to see whether X1 and X2 interact:
model0 <- lm(Y~centered.X1*centered.X2, ds)
# and the two regression models in order to obtain the simple slopes of X1
# at X2 = 1SD below the mean and at X2 = 1SD above the mean:
model.lo <- lm(Y~centered.X1*centered.X2.lo, ds)
model.hi <- lm(Y~centered.X1*centered.X2.hi, ds)
#show the model estimates
summary(model0); summary(model.lo); summary(model.hi)
# print the interaction and the simple slopes of X1 @ X2=-1SD and X2=+1SD
results0 <- summary(model0)[[4]]
results.lo <- summary(model.lo)[[4]]
results.hi <- summary(model.hi)[[4]]
cat("\n",
    "Interaction X1 * X2: b=",results0[4],", t=",results0[12],", p=",
    sprintf("%5.4f",results0[16]),ifelse(results0[16] < .05," *",""),"\n",
    "Slope of X1 @ X2 = -1SD: b=",results.lo[2],", t=",results.lo[10],", p=",
    sprintf("%5.4f",results.lo[14]),ifelse(results.lo[14] < .05," *",""),"\n",
    "Slope of X1 @ X2 = +1SD: b=",results.hi[2],", t=",results.hi[10],", p=",
    sprintf("%5.4f",results.hi[14]),ifelse(results.hi[14] < .05," *",""),"\n",
    "absolute diff(p) = |",results.hi[14]," - ",results.lo[14],"| = ",
    abs(results.hi[14] - results.lo[14]),"\n",
    "diff(b) = ",results.hi[2]," - ",results.lo[2]," = ",results.hi[2] - results.lo[2],
    "\n", sep="")

These are the results:

Interaction X1 * X2: b=-0.5400003, t=-1.062382, p=0.2921
Slope of X1 @ X2 = -1SD: b=1.33277, t=2.525593, p=0.0140 *
Slope of X1 @ X2 = +1SD: b=0.3723435, t=0.5394312, p=0.5915
absolute diff(p) = |0.5914612 - 0.0140371| = 0.5774241
diff(b) = 0.3723435 - 1.33277 = -0.9604268

Clearly, the results indicate that the interaction term is small and not significant, so it could easily be explained by random fluctuation around a true interaction of zero. This essentially means: there is no interaction of X1 and X2; the slopes of one predictor at different values of the other do not systematically differ as a function of that other predictor. But the slope of X1 @ X2 = +1SD (from now on: the "high slope") is not significant, with a p value of .5915, and the slope of X1 @ X2 = -1SD (from now on: the "low slope") is significant, with p = .0140! So it appears that the low slope and the high slope are different after all - the former is significant, the latter is not. So we might be tempted to ignore the non-significant interaction and simply claim that we found differential effects of X1 on Y, conditional on the value of X2. But I argue that the implicit reasoning behind this is fundamentally flawed, and I will elaborate on this argument.

Notice that, above, we chose +1SD and -1SD as the conditional values of X2 at which we test the slope of X1. But these values of X2 are rather arbitrary. We could just as well consider the slope of X1 @ X2 = mean of X2 (from now on: the "mean slope") vs. the low slope. We have the latter from above:

Slope of X1 @ X2 = -1SD: b=1.33277, t=2.525593, p=0.0140 *

and we obtain the mean slope from the original basic regression model that we used to test the interaction in the first place:

# simple slope of X1 @ X2 = mean
cat("Slope of X1 @ X2 = mean of X2: b=",results0[2],", t=",results0[10],", p=",
    sprintf("%5.4f",results0[14]), ifelse(results0[14] < .05," *",""),"\n")

yielding the result:

Slope of X1 @ X2 = mean of X2: b= 0.8525569 , t= 2.048904 , p= 0.0446 *

Comparing the mean slope vs. the low slope, we see that they are both significant. Thus, according to the mere comparison of significance decisions, they are "the same" - the simple slopes do not differ. One might argue that one of the p values is smaller than the other, but p values and their differences are a very poor indicator of an actual difference, as they are not a linear function of the actual effect size: the difference between p = .30 and p = .40 does not correspond to the same difference in effect size as the difference between p = .11 and p = .01. So gauging whether two effects differ by eyeballing two p values and guessing whether they actually are "very different", "not so much, but still different", or "not really different at all" will not cut it as a reproducible and transparent decision rule for the scientific test of a theory. In addition, p values are by definition highly dependent on sample size: a difference of the same magnitude between two slopes (i.e., a difference in bs) might look like a huge difference with N = 300 (because the p values are far apart), but with N = 120, the difference in p values might not look so big anymore. This is of course true for any p value from a statistical test, but the problem is exacerbated if you compare effects by their p values.
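To illustrate the dependence on sample size, here is a small simulated sketch (purely hypothetical data, not the example data set): the same true interaction, i.e., the same true difference in slopes per unit of X2, produces very different p values depending only on N.

# Simulated illustration (hypothetical data): the same true slope difference
# per unit of X2 (here 0.3) yields very different p values depending only on N.
set.seed(42)
p.interaction <- function(n) {
  x1 <- rnorm(n); x2 <- rnorm(n)
  y  <- 0.3*x1 + 0.3*x2 + 0.3*x1*x2 + rnorm(n)
  summary(lm(y ~ x1*x2))$coefficients["x1:x2", "Pr(>|t|)"]
}
c(median.p.N300 = median(replicate(500, p.interaction(300))),
  median.p.N120 = median(replicate(500, p.interaction(120))))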

So, if we compare the low and high slopes by a rather haphazard guess about the difference between p values (absolute diff(p) = .577 in the case of comparing the low and high slopes), we get the result "the slopes differ"; but if we compare the mean slope and the low slope, we get the result "they don't differ, they are both significant". One might argue that this is unfair, because the difference in X2 between the mean and low slopes is much smaller than the difference in X2 between the low and high slopes. That is unquestionably true, but who is to say which difference in X2 is the "right one"? Also, we could choose another set of values of X2 on which we condition the slope of X1 and obtain a similar result that the slopes are not really different (according to simple p value comparison). We can estimate the simple slopes at any values of X2 we wish, so we could pick X2 = -2SD of X2 and X2 = mean of X2. These two conditional values are exactly as far apart as +1SD and -1SD above (i.e., 2 SDs), so the comparison of these two simple slopes is just as "fair" toward a difference between the slopes as that of the high and low slopes above:

# simple slopes of X1 @ X2 = -2 SD and @ X2 = mean of X2
ds$centered.X2.minus2SD <- ds$centered.X2 + 2*sd(ds$X2)
model.lo.other <- lm(Y~centered.X1*centered.X2.minus2SD, ds)
summary(model.lo.other)
results.lo.other <- summary(model.lo.other)[[4]]
# the slope @ X2 = mean of X2 comes from the basic model estimated above
results.hi.other <- results0
cat(
  "Slope of X1 @ X2 = -2SD: b=",results.lo.other[2],", t=",results.lo.other[10],", p=",
  sprintf("%5.4f",results.lo.other[14]),ifelse(results.lo.other[14] < .05," *",""),"\n",
  "Slope of X1 @ X2 = mean of X2: b=",results0[2],", t=",results0[10],", p=",
  sprintf("%5.4f",results0[14]),ifelse(results0[14] < .05," *",""),"\n",
  "abs.diff(p) = |",results.hi.other[14]," - ",results.lo.other[14],"| = ",
  abs(results.hi.other[14] - results.lo.other[14]), "\n",
  "diff(b) = ",results.hi.other[2]," - ",results.lo.other[2]," = ",
  results.hi.other[2] - results.lo.other[2], "\n", sep="")

The slope of X1 @ X2 = -2SD (i.e., the "-2 slope") and the slope of X1 @ X2 = mean of X2 (i.e., the "mean slope") are both significant:

Slope of X1 @ X2 = -2SD: b=1.812984, t=2.036622, p=0.0458 *
Slope of X1 @ X2 = mean of X2: b=0.8525569, t=2.048904, p=0.0446 *
abs.diff(p) = |0.04457648 - 0.04582954| = 0.001253059
diff(b) = 0.8525569 - 1.812984 = -0.9604268

And their p values only differ by a minuscule .0013. That surely is not a difference in simple slopes that should be taken seriously.

Thus, even if we compare two simple slopes that are as far apart in X2 as the low and high slopes, we now have to come to the conclusion (if we use the crude p value comparison) that the slopes are not different - they are both significant. The comparison between the p values again is not a big help: this time the difference between the p values of the slopes is diff(p) = .0013, while in the comparison between the low and high slopes it was .577. How is one to determine whether these differences in p are the same or different? That's right: it becomes a mess to keep comparing p values by approximate visual inspection and to make transparent scientific decisions based on these p value comparisons.

Things are, however, different if we look at the difference in b, diff(b), in the two cases. In the comparison of the low and high slopes above, we obtained a difference of diff(b) = -0.96, and in the comparison of the -2 slope vs. the mean slope we obtained the exact same difference: -0.96. It appears more sensible to use the difference in b as a representation of the difference in slopes - at least the difference in slopes indicated this way is the same whenever the difference in the conditional values of X2 is the same. But how can we test whether this difference in b is substantial, or allows us - based on a clear and transparent criterion - to say "the two slopes at two different values of the moderator X2 are different"? After all, we have no p value for the difference in b! Or do we? Yes, we do! Well, not a p value specifically for the comparison of the low and high slopes or for the comparison of the -2 and mean slopes, but we have one for any pair of simple slopes that are one unit of X2 apart: the p value for the interaction term in the very first, basic regression analysis. The interaction term tests whether two simple slopes of X1 on Y, estimated at two conditional values of X2 that are exactly one unit of X2 apart, differ significantly. So it does not literally compare the low and high simple slopes, but since the low and high simple slopes are exactly 2 SDs of X2 apart, and an SD is just a fixed multiple of the unit of X2, this test is all we need. Remember that the interaction is linear: the change in the simple slope of X1 per unit increase (or decrease) of X2 is constant, so the difference between any two simple slopes is simply the interaction coefficient multiplied by the distance between the two chosen conditional values of X2 - and its significance test is the test of the interaction term.
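We can verify this identity directly with the objects created in the code above: the difference between the high and low slopes is just the interaction coefficient times the distance (in units of X2) between the two conditional values.

# The interaction coefficient is the change in the simple slope of X1 per
# one-unit increase of X2. The low and high slopes are 2*SD(X2) apart, so
# their difference equals 2*SD(X2) times the interaction coefficient.
d.X2   <- 2 * sd(ds$X2)                    # distance between the conditional values
diff.b <- results.hi[2] - results.lo[2]    # high slope minus low slope
all.equal(diff.b, d.X2 * results0[4])      # TRUE (up to rounding)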

In sum: comparing two p values in order to assess whether simple slopes are different is not to be recommended, as it gives different answers for pairs of slopes that are the same distance apart in X2, and p values are not easy to compare in the first place. In contrast, the p value of the original interaction allows a decision for or against the hypothesis of "differing slopes" that will be the same for any pair of conditional X2 values that are the same distance apart. This p value of the interaction is a focused and concise basis for such a decision and does not require an additional judgment about whether two p values are close, far apart, or basically the same. The interaction term and its statistical properties are what the comparison of simple slopes should be based on, not the isolated slopes or pairs of them. If the interaction term is not significant, then - by the logic of significance testing - the slopes are not different. If the interaction term is significant, then any pair of slopes whose conditional values of the moderator are one unit apart are different - no matter what their individual p values are.

There are two more general points that can be taken away from this:

  1. Comparing things

    In general, comparing two effects (or any other statistics) by just laying them side by side and approximately assessing a difference in p values is never a good idea. What counts is whether the difference between the two is significant, not the individual significance decisions for the two statistics being compared. This general principle is clear to anyone if we look at the comparison of two means. Suppose you want to compare two groups on a dependent variable that ranges from -16 to +16. You run a t-test within each group and test whether the mean of the dependent variable differs from zero:

    exampledata.group.means <- as.data.frame(
      read.table("https://raw.github.com/johannjacoby/interaction_and_slopes/master/group.mean.comparison.dat",
                 header=T, sep="\t", quote="", stringsAsFactors=F))
    #comparing group means individually against 0
    test1 <- t.test(exampledata.group.means[which(exampledata.group.means$group==1),]$dv)
    test2 <- t.test(exampledata.group.means[which(exampledata.group.means$group==2),]$dv)
    cat(
      "Group 1: t = ",test1$statistic,", p = ", sprintf("%10.9f",test1$p.value),"\n",
      "Group 2: t = ",test2$statistic,", p = ", sprintf("%10.9f",test2$p.value),"\n", sep="")

    You get the following results for the two groups:

    Group 1: t = 0.02738598, p = 0.978270349
    Group 2: t = 2.605806, p = 0.012717168

    So the group means are different, right? The group mean in Group 1 is not significantly different from zero, but the mean in Group 2 is nice and significant. And the graph confirms this - one of the means is essentially zero, the other one is significantly different from zero:

    library(gplots)
    dg <- barplot2(
      tapply(exampledata.group.means$dv, exampledata.group.means$group, mean),
      width=c(1,1), names.arg = c("Group 1", "Group 2"), xlim=c(0,3),
      ylim = c(min(c(test1$conf.int[1], test2$conf.int[1]))-1,
               max(c(test1$conf.int[2], test2$conf.int[2]))+1),
      plot.ci=TRUE,
      ci.l=c(test1$conf.int[1], test2$conf.int[1]),
      ci.u=c(test1$conf.int[2], test2$conf.int[2]),
      ci.width=.1)
    title(sub=expression(paste("Error bars denote 95% confidence intervals | * = significant at ",
                               alpha, " < .05", sep="")), cex.sub=.7, adj=0)
    text(dg[1], test1$conf.int[2]+.5,
         paste("M = ",sprintf("%3.2f", test1$estimate), " ",
               ifelse(test1$p.value < .05,"*","n.s."), sep=""))
    text(dg[2], test2$conf.int[2]+.5,
         paste("M = ",sprintf("%3.2f", test2$estimate), " ",
               ifelse(test2$p.value < .05,"*","n.s."), sep=""))

    Right? Of course not. Nobody in their right mind would accept such a comparison of two group means by looking at the difference in p values. The correct way to compare two means is not looking at the difference of p values associated with the test of individual group means against zero, but testing the difference between the means:

    #comparing the difference between means against 0 (independent-samples t-test)
    test.both <- t.test(dv ~ group, data = exampledata.group.means)
    cat("Group comparison : t = ",test.both$statistic,", p = ",
        sprintf("%5.4f",test.both$p.value),"\n", sep="")

    This gives us a test of whether the difference between the means is different from zero:

    Group comparison [M(Group 1) - M(Group2) against zero]: t = -1.219916, p = 0.2257

    So, even though the group means appear to be different, based on their individual p values, they are not different: the difference between them is not significantly different from zero, and in the logic of significance testing the means should be considered not different. The best you can do to characterize the means is to take the same estimate for both: the grand mean. Any deviations from that grand mean can plausibly be attributed to chance.

    Of course nobody would proceed as described here (i.e., visually comparing p values for individual group means and drawing conclusions regarding the difference between them). But to the same degree that this strategy appears nonsensical for two group means, it also should appear nonsensical for the comparison of two slopes, two interactions, two structural equation models or the comparison of a total and direct effect in mediation analysis:

    • If you want to compare two means, test their difference against zero (instead of testing each individually against zero and then visually inspecting a difference between the two results). For group mean comparison, this test of the difference is achieved by an independent sample t-test.
    • The same goes for simple slopes: Test their difference against zero (instead of testing each individually against zero and then visually inspecting a difference between the two results). This test of the difference between slopes is achieved by the statistical test of the interaction effect.
    • And it goes on: if you want to know whether two two-way interactions differ, do not conduct two two-way analyses and then visually compare the two results - instead, look at the three-way interaction: it tells you whether the two-way interactions differ.
    • If you want to compare two structural equation models, do not estimate each of them and then visually inspect the individual RMSEA values (or any other fit indices) - instead, test the difference in fit between the models by testing the difference in χ².
    • If you want to compare two correlations between X and Y within different groups of Z, do not compute each correlation within the groups of Z and then visually inspect differences between the two correlations - rather, enter X, Z, and their product term as predictors into a regression model predicting Y; the interaction will then tell you whether the association between X and Y differs depending on Z.
    • If you conduct a mediation analysis based on the recommendations formulated most prominently in Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173-1182, you might be tempted to compare the c and c' paths (the total effect and the direct effect remaining after statistically controlling for M in the prediction of Y, respectively). It is still widespread to draw substantive conclusions about "complete" or "partial" mediation from the comparison of these two paths. In this logic, mediation is "complete" if the total effect is significant but the direct effect is not. Imagine a case where c has a p value of p = .04 (significant) and c' has a p value of p = .06 (not significant). According to the "complete"/"partial" mediation logic, such a difference would have to qualify as "complete". On the other hand, in a case where the total effect has p = .06 (not significant) and a positive sign, and the direct effect has a negative sign and also p = .06 (so it is also not significant, but points in the opposite direction), one would have to conclude "no indirect effect/no mediation" - even though the indirect effect is much, much larger than in the former example. And finally, the total effect may be very large with a minuscule p value, and the direct effect could also be significant with a larger p value (but still smaller than .05), with a huge difference between these two effects - yet you would have to conclude "only" "partial" mediation. One reason for this mess is that the distinction between "partial" and "complete" mediation relies on comparing individual p values instead of simply basing decisions on a test of the difference. This difference, in normal cases with continuous variables, is exactly the indirect effect, i.e., a×b, and it can be tested in a simple, elegant fashion (see the sketch after this list) without the visual inspection of p values that very poorly represent effect sizes and differences between effects. More on this in Hayes, A. F. (2009). Beyond Baron and Kenny: Statistical mediation analysis in the new millennium. Communication Monographs, 76, 408-420, and Rucker, D. D., Preacher, K. J., Tormala, Z. L., & Petty, R. E. (2011). Mediation analysis in social psychology: Current practices and new recommendations. Social and Personality Psychology Compass, 5(6), 359-371.
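    As announced above, here is a minimal R sketch of what testing a×b directly could look like, using a percentile bootstrap; the data frame md with predictor X, mediator M, and outcome Y is hypothetical:

    # Bootstrap the indirect effect a*b instead of comparing the p values of c and c'.
    # 'md' is a hypothetical data frame with columns X, M, and Y.
    set.seed(1)
    boot.ab <- replicate(5000, {
      i <- sample(nrow(md), replace = TRUE)
      a <- coef(lm(M ~ X, data = md[i, ]))["X"]      # effect of X on M
      b <- coef(lm(Y ~ M + X, data = md[i, ]))["M"]  # effect of M on Y, controlling for X
      a * b
    })
    quantile(boot.ab, c(.025, .975))  # percentile 95% CI; if it excludes 0,
                                      # the indirect effect is significant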

  2. Other combinations of interaction and slope significance decisions

    Of course, it also goes the other way around: you might find yourself in a situation where the X1×X2 interaction term is significant, but the simple slopes you compute seem not to differ - they are both significant, e.g., one has a p value of p = .03 and the other one of p = .001. This case is just the other side of the coin of the issue discussed above. Of course the simple slopes can differ between two conditional values of X2 that are one unit apart while both being significant - one of them is simply larger than the other. Remember that the p value does not give you a good idea of how large an effect is; it essentially tells you whether you should be wary that an effect of the size you found might plausibly be explained as a random deviation from an actual null effect. So again, if you want to know the difference between effects, you need to look at their difference and test that difference, rather than 'looking' at whether the difference between p values gives you a tingly gut feeling. And if the interaction is significant, two individually very clearly significant simple slopes at X2 = mean of X2 and X2 = mean of X2 + 1 will also be statistically different. They are both so far from zero that you may deem it implausible for either of them to come from a true null distribution, but one of them may simply be larger than the other.

    And finally, you may come across cases where none of the simple slopes you elect to compute are significant, but the interaction is. This may of course be due to your choosing conditional values of X2 that are quite close together, but it can also happen when the conditional values of X2 (for the simple slopes of X1) are reasonably far apart. An example:

    example2 <- as.data.frame(
      read.table("https://raw.github.com/johannjacoby/interaction_and_slopes/master/interaction_insignificant_slopes.dat",
                 header=T, sep="\t"))
    example2$centered.X1 <- scale(example2$X1, center=T, scale=F)
    example2$centered.X2 <- scale(example2$X2, center=T, scale=F)
    example2$centered.X2.lo <- example2$centered.X2 + sd(example2$X2)
    example2$centered.X2.hi <- example2$centered.X2 - sd(example2$X2)
    model0.2 <- lm(Y~centered.X1*centered.X2, example2)
    model.lo.2 <- lm(Y~centered.X1*centered.X2.lo, example2)
    model.hi.2 <- lm(Y~centered.X1*centered.X2.hi, example2)
    results0.2 <- summary(model0.2)[[4]]
    results.lo.2 <- summary(model.lo.2)[[4]]
    results.hi.2 <- summary(model.hi.2)[[4]]
    cat("\n",
        "Interaction X1 * X2: b=",results0.2[4],", t=",results0.2[12],", p=",
        sprintf("%5.4f",results0.2[16]),ifelse(results0.2[16] < .05," *",""),"\n",
        "Slope of X1 @ X2 = -1SD: b=",results.lo.2[2],", t=",results.lo.2[10],", p=",
        sprintf("%5.4f",results.lo.2[14]),ifelse(results.lo.2[14] < .05," *",""),"\n",
        "Slope of X1 @ X2 = +1SD: b=",results.hi.2[2],", t=",results.hi.2[10],", p=",
        sprintf("%5.4f",results.hi.2[14]),ifelse(results.hi.2[14] < .05," *",""),"\n",
        "absolute diff(p) = |",results.hi.2[14]," - ",results.lo.2[14],"| = ",
        abs(results.hi.2[14] - results.lo.2[14]), "\n",
        "diff(b) = ",results.hi.2[2]," - ",results.lo.2[2]," = ",
        results.hi.2[2] - results.lo.2[2], "\n", sep="")

    In this particular data set, the following estimates and test results are obtained:

    Interaction X1 * X2: b=0.6179825, t=2.037498, p=0.0457 *
    Slope of X1 @ X2 = -1SD: b=-0.4550505, t=-1.238916, p=0.2199
    Slope of X1 @ X2 = +1SD: b=0.4935985, t=1.156491, p=0.2518
    absolute diff(p) = |0.251779 - 0.2199011| = 0.03187793
    diff(b) = 0.4935985 - -0.4550505 = 0.948649

    The interaction is significant, so the slopes differ. The simple slopes at X2 = -1SD and X2 = +1SD, however, are both not significant. But look at their signs: they are opposite. So even though neither simple slope is strictly significantly different from zero, they differ from each other very clearly - after all, the difference in b is much larger than either simple slope coefficient by itself. The interaction term picks up this difference and is clearly significant.

20140210

Probing simple slopes with polytomous predictors (i.e., categorical with more than two levels)

Simple slopes analysis with continuous and dichotomous predictors is straightforward once you have acquainted yourself with the 'trick' that Aiken & West (1991) describe in detail (i.e., 'shifting' a predictor A such that 0 is where you want the simple slope of B, recalculating all interaction terms with the shifted variable, running the regression analysis with the shifted predictors and interaction terms, and reading off the slope of predictor B; see this earlier post on probing simple slopes). But what if you have a polytomous predictor? Say, 'type of employment', which could take four qualitatively different levels:
  • private industry
  • government
  • NGO
  • none

Dummy variables

The procedure is simply an extension of the logic for continuous and dichotomous predictors, but it involves the use of dummy coded variables, or rather 'packages' of dummy coded variables. This is because in order to deal with a polytomous predictor with k levels, k-1 dummy variables are needed to represent and model the effect of the polytomous variable. For example, the variable 'type of employment' referred to above would be represented in three dummy variables (k is 4, thus k-1 = 3 dummy variables). Each dummy variable takes the value of 1 for exactly one of the levels and 0 for all other levels:
Original variable       Dummy Variable 1   Dummy Variable 2   Dummy Variable 3
1 (private industry)           1                  0                  0
2 (government)                 0                  1                  0
3 (NGO)                        0                  0                  1
4 (unemployed)                 0                  0                  0
The category that gets a value of 0 on all dummy variables is the so-called reference category. It is special in that it does not get a dummy variable of its own that assigns it a 1; it is simply the category that is left over: in an exhaustive set of k levels of a polytomous variable, once you know that a case is not in any of the other k-1 categories, it must be in the last, remaining category. That is neat because it saves you one dummy variable while still describing your observations accurately and completely with regard to the polytomous variable.

Testing the omnibus effect of the polytomous predictor

These dummy variables are generally correlated to a substantial degree. You can see this if you try to guess which value an observation has on one dummy variable when you already know which value it has on another dummy variable. Look at 4 fictional observations:
Obs#   type of employment   Dummy Variable 1   Dummy Variable 2   Dummy Variable 3
1               1                   1                  0                  0
2               4                   0                  0                  0
3               3                   0                  0                  1
4               2                   0                  1                  0
In observation #1, once you know the value on Dummy Variable 1 (i.e., DummyVar1 = 1), you know the values on all other dummy variables - and you know this because of the mathematical structure of the dummies, not because of some empirical association. But also for the remaining three observations with DummyVar1 = 0, you can guess relatively well which value they will have on DummyVar2: in 2/3 of all cases with DummyVar1 = 0, DummyVar2 = 0, and in only 1/3 of the cases with DummyVar1 = 0 is the value of DummyVar2 = 1. So there is substantial overlap between DummyVar1 and DummyVar2, and the argument extends to all pairs of dummy variables. Consequently, if all three dummy variables are entered into a regression model, they will 'take away' predictive power from each other. The regression coefficient you obtain for an individual dummy variable represents the difference between the category that this dummy assigns a 1 and the category that is assigned 0 by all dummy variables. So it only compares two categories, whereas the initial goal of the regression analysis was to assess the omnibus effect of the original polytomous variable. That omnibus effect does not represent the difference between two of the categories, but all differences between all categories. Therefore, the dummy variables obtained from the polytomous predictor need to be considered simultaneously, all at once. Such a 'package' of dummy variable predictors can be treated like one variable, but you will not obtain a single regression coefficient for its effect. Rather, you will obtain the amount of change in explained variance for the whole package, R²change, and an associated F value with k-1 numerator degrees of freedom, Fchange. Fchange can be tested, and if the test is significant, the variance captured by the package of dummy variables is far enough from zero that you are willing to bet money on it not being the result of random fluctuations.
In SPSS syntax, this all works out pretty neatly. First recode the original variable typeOfEmployment into the three dummies D1, D2, D3:
RECODE typeOfEmployment (1=1)(2,3,4=0)(ELSE=SYSMIS) INTO D1.
RECODE typeOfEmployment (2=1)(1,3,4=0)(ELSE=SYSMIS) INTO D2.
RECODE typeOfEmployment (3=1)(1,2,4=0)(ELSE=SYSMIS) INTO D3.
EXEC.
For the sake of the interaction discussion below, let's say we also have a continuous predictor Cont that has already been mean-centered into cCont, and a dependent variable Crit. So we need the interaction term for cCont × typeOfEmployment. This interaction term will also be a 'package' of predictors, consisting of all products of the dummy variables with the other predictor cCont:
COMPUTE IA_cCont_D1 = cCont*D1.
COMPUTE IA_cCont_D2 = cCont*D2.
COMPUTE IA_cCont_D3 = cCont*D3.
EXEC.
Now the complete regression model would be estimated by:
REGRESSION
  /STAT=COEF R ANOVA CHANGE
  /DEPENDENT=Crit
  /METHOD=ENTER cCont D1 D2 D3 IA_cCont_D1 IA_cCont_D2 IA_cCont_D3.
I will call this model Model 0 below. The coefficients estimated in this model are:
[Screenshot: coefficients table for Model 0; a comma is the decimal separator in lieu of a dot.]
As mentioned above, however, the regression coefficients of D1, D2, D3 are differences between pairs of categories of typeOfEmployment, but we want to know the one overall effect of this predictor on Crit. Therefore, we structure the predictors in the regression model so that the unique contribution of the dummy variables is assessed as a whole:
REGRESSION
  /STAT=COEF R ANOVA CHANGE
  /DEPENDENT=Crit
  /METHOD=ENTER cCont IA_cCont_D1 IA_cCont_D2 IA_cCont_D3
  /METHOD=ENTER D1 D2 D3.
This code first estimates a regression analysis with the predictor terms in the first /METHOD=ENTER portion (note that the dummy variables are not included here; call this the reduced model for the polytomous predictor), and then a second model that comprises all the predictors from the first step as well as those from the second /METHOD=ENTER section, most notably the dummy variables - thus a "complete model" (i.e., Model 0). The output will report the coefficients for both of these models in the two halves of a horizontally split table entitled "Coefficients". But the interesting portion of the output for now is the table entitled "Model Summary".
[Screenshot: Model Summary table; a comma is the decimal separator in lieu of a dot.]
This table contains R² statistics for both regression analyses from the last command: the reduced model without the dummy variables as well as the complete model with the dummy variables added. These two models can now be compared by reading off the R²change for the second (i.e., complete) model: that is the effect of the polytomous predictor, expressed as the additional variance explained by all dummy variables as a package. R²change is associated with an Fchange (also in that table), and there is a p value for this Fchange under the heading "Sig. F Change". That is your effect of the polytomous predictor at Cont = mean of Cont, i.e., cCont = 0.
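For comparison, the same package test can be sketched in R as a hierarchical model comparison; the data frame d and its variables are hypothetical stand-ins for the SPSS data file used here:

# Hypothetical data frame 'd' with Crit, cCont, the dummies D1-D3, and the
# product terms. The F test of the R^2 change is the omnibus test of the
# dummy 'package'.
reduced <- lm(Crit ~ cCont + IA_cCont_D1 + IA_cCont_D2 + IA_cCont_D3, data = d)
full    <- update(reduced, . ~ . + D1 + D2 + D3)
anova(reduced, full)   # F change with k-1 = 3 numerator degrees of freedom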

Testing the interaction

Is this a "main effect" of typeOfEmployment? Only if there is no interaction of typeOfEmployment and Cont! So we have yet to test for an interaction of typeOfEmployment and Cont, and that is probably what we are interested in in the first place if we aim to find simple slopes involving a polytomous predictor. In order to test the interaction we run the regression analysis from above again, but this time we estimate a reduced modelinteraction that does not include the interaction term package (but cCont and the simple dummy variables), and then the complete model with all simple predictor terms and the product interaction terms (i.e., Model 0):
REGRESSION
  /STAT=COEF R ANOVA CHANGE
  /DEPENDENT=Crit
  /METHOD=ENTER cCont D1 D2 D3
  /METHOD=ENTER IA_cCont_D1 IA_cCont_D2 IA_cCont_D3.
We again get an output including a "Model Summary" table:
[Screenshot: Model Summary table; a comma is the decimal separator in lieu of a dot.]
Again, the R²change, the Fchange, and the associated p value give us an idea of whether there is an interaction of typeOfEmployment and Cont. In this example, R²change = .094, Fchange = 5.991, and p = .001: there is a significant interaction. So the relationship between Cont and Crit differs depending on the value of typeOfEmployment. This does not necessarily mean that the simple slopes (i.e., the effects of Cont on Crit conditional on typeOfEmployment) will be significant or not significant at any one particular level of typeOfEmployment; it is even possible that none of the simple slopes are significant. What the interaction means is that there is more variance among the conditional slopes of Cont on Crit than would be expected by chance alone, at a type I error rate of 5%.

Obtaining the simple slopes

Now that we know that there is evidence for an interaction in the example, we are interested in its specific nature, or in other words: we want to know what the different simple slopes conditional on typeOfEmployment are. Let's have a look at Model 0: the coefficient associated with cCont is already one of the simple slopes. Namely, it is the slope of Cont at the value of typeOfEmployment that was assigned 0 in all three dummy variables. Hence, in the coefficient output of Model 0 above, the regression coefficient B = -0.347 (SE = 0.15, p = .022) is the simple slope of Cont at typeOfEmployment = 4 (i.e., the relationship between Cont and Crit among those observations that are "unemployed").
Now it becomes obvious how we can obtain the simple slope of Cont for any other value of typeOfEmployment: we simply choose a different set of dummy variables to represent the variable typeOfEmployment, such that a different value of typeOfEmployment is assigned 0 in all dummy variables of the set:
Original variable       Dummy set I    Dummy set II   Dummy set III   Dummy set IV
                        D1  D2  D3     D1  D2  D4     D1  D3  D4      D2  D3  D4
1 (private industry)     1   0   0      1   0   0      1   0   0       0   0   0
2 (government)           0   1   0      0   1   0      0   0   0       1   0   0
3 (NGO)                  0   0   1      0   0   0      0   1   0       0   1   0
4 (unemployed)           0   0   0      0   0   1      0   0   1       0   0   1
In this table of dummy sets, Dummy set I is the set used above, making typeOfEmployment = 4 (unemployed) the zero category and thus allowing us to read off the simple slope of Cont on Crit for this group of observations. Dummy set II makes typeOfEmployment = 3 (NGO) the zero category. In order to obtain the simple slope of Cont on Crit for observations with employment in an NGO, this dummy set is used instead of Dummy set I. We also have to calculate the corresponding interaction terms - or actually only one of them, because only the third dummy of set II (D4) differs from the third dummy of set I (D3); dummies D1 and D2 can be reused, and the same is true for the interaction terms IA_cCont_D1 and IA_cCont_D2:
* create Dummy D4.
RECODE typeOfEmployment (4=1)(1,2,3=0)(ELSE=SYSMIS) INTO D4.
* create interaction term involving Dummy D4.
COMPUTE IA_cCont_D4 = cCont*D4.
* estimate the simple slope of Cont at typeOfEmployment = 3.
REGRESSION
  /DEPENDENT = Crit
  /METHOD=ENTER cCont D1 D2 D4 IA_cCont_D1 IA_cCont_D2 IA_cCont_D4.
The regression analysis in this syntax yields the following output of coefficients:
[Screenshot: coefficients table; a comma is the decimal separator in lieu of a dot.]
and from this we conveniently read off the simple slope of Cont on Crit at typeOfEmployment = 3 (i.e., the cases that receive 0 on all dummy variables of Dummy set II that is used here): B = 0.10, SE = 0.151, p = .509.
The simple slope for typeOfEmployment = 2 is obtained by recycling Dummy D1 and D3, and adding Dummy D4 - they make up Dummy set III - and their respective interaction terms.
* estimate the simple slope of Cont at typeOfEmployment = 2.
REGRESSION
  /DEPENDENT = Crit
  /METHOD=ENTER cCont D1 D3 D4 IA_cCont_D1 IA_cCont_D3 IA_cCont_D4.
[Screenshot: coefficients table; a comma is the decimal separator in lieu of a dot.]
The simple slope for observations that have employment in government is thus B = 0.451, SE = 0.151, p = .003.
Finally, reusing dummies D2, D3, and D4 (making up Dummy set IV) along with their interaction terms:
* estimate the simple slope of Cont at typeOfEmployment = 1.
REGRESSION
  /DEPENDENT = Crit
  /METHOD=ENTER cCont D2 D3 D4 IA_cCont_D2 IA_cCont_D3 IA_cCont_D4.
yields:
[Screenshot: coefficients table; a comma is the decimal separator in lieu of a dot.]
The simple slope of Cont on Crit at typeOfEmployment = 1 is thus B = 0.357, SE = 0.122, p = .004.
An overview of the simple slope estimates and the test for the interaction thus gives us:

Simple slope at typeOfEmployment = ...   Dummy set used   Dummy terms (re-)used in syntax   Interaction terms used                     B        SE      β       p
1 (private industry)                     Dummy set IV     D2, D3, D4                        IA_cCont_D2, IA_cCont_D3, IA_cCont_D4      0.357    0.122    .37    .004
2 (government)                           Dummy set III    D1, D3, D4                        IA_cCont_D1, IA_cCont_D3, IA_cCont_D4      0.451    0.151    .467   .003
3 (NGO)                                  Dummy set II     D1, D2, D4                        IA_cCont_D1, IA_cCont_D2, IA_cCont_D4      0.10     0.151    .103   .509
4 (unemployed)                           Dummy set I      D1, D2, D3                        IA_cCont_D1, IA_cCont_D2, IA_cCont_D3     -0.347    0.150   -.36    .022

The procedure has thus given us all the conditional effects of Cont at the different values of typeOfEmployment (i.e., simple slopes) along with their SEs, standardized beta values and p values. Note that they need not form a linear pattern, such that, e.g., the slopes would steadily increase or decrease from typeOfEmployment = 1 to typeOfEmployment = 4. This may be the case, but not necessarily.
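As an aside, if you work in R rather than SPSS, the same conditional slopes can be read off by re-fitting the model with a different reference level of the factor; the data frame d is again a hypothetical stand-in:

# With treatment (dummy) coding, the cCont coefficient is the slope of Cont
# at the factor's reference level; relevel() switches the reference category.
d$toe <- relevel(factor(d$typeOfEmployment), ref = "3")   # make NGO the reference
fit3  <- lm(Crit ~ cCont * toe, data = d)
coef(summary(fit3))["cCont", ]   # simple slope of Cont at typeOfEmployment = 3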

Comparing individual simple slopes

Also, for now we only know that the slopes are heterogeneous (i.e., they should not be considered approximately equal, random deviations from one common simple slope). But we do not know whether the slopes at typeOfEmployment = 1 and typeOfEmployment = 2 are different or not. We also do not know whether the slopes at typeOfEmployment = 3 and typeOfEmployment = 4 are different or not. (Comparing p values by rule of thumb is not a good test of the difference between two slopes, see http://methstuff.blogspot.de/2014/02/my-interaction-is-not-significant-but.html.)
But we can use the different regression models to test whether the slopes in three of the conditions of typeOfEmployment differ from the slope in the remaining fourth condition. Specifically, the interaction terms inform us of such differences between pairs of simple slopes. For example, look at the regression model with dummy variables D1, D2, and D4. Condition typeOfEmployment = 3 (i.e., NGO) is the reference group here and is assigned a value of zero on each dummy. The interaction terms therefore indicate how much the simple slope of Cont in each of the other conditions of typeOfEmployment differs from the slope in condition typeOfEmployment = 3:

[Screenshot: coefficients table; a comma is the decimal separator in lieu of a dot.]
IA_cCont_D1 has a regression coefficient of B = 0.258 (SE = 0.194, p = .185). The slope of Cont in condition 1 of typeOfEmployment is thus 0.258 original scale units larger than the slope in condition 3, and this difference is not significant - it can plausibly be explained as random fluctuation around a zero difference. The simple slope in condition 2 is similarly larger in descriptive terms, B = 0.351 (SE = 0.214, p = .102). Finally, as evident from the coefficient for IA_cCont_D4, the simple slope in condition 4 of typeOfEmployment is substantially smaller than the slope in condition 3, B = -0.447 (SE = 0.212, p = .037). This exercise can be extended to all possible pairwise comparisons of the simple slopes, using the interaction terms from the different regression models with the different sets of dummy variables. HOWEVER,
  1. The standardized β coefficients that SPSS (and other software packages, for that matter) yields for these product terms are not correct. Do not use them. If you want them, run the same model again, but with a) all continuous predictors standardized, b) the dummy variables not standardized, c) the product terms computed from the standardized continuous predictors and the unstandardized dummy variables, and d) the standardized dependent variable Crit. The unstandardized coefficients in the output of this special regression model are in fact the correct standardized coefficients for the original regression model (a rough R sketch of this workaround follows after this list).
  2. Be aware that you are running 4 (regression models) × 3 (comparisons within each model) = 12 such post-hoc tests - far more comparisons than the degrees of freedom available in the design can support.
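Here is a minimal R sketch of the workaround from point 1, using a hypothetical data frame d with Cont, Crit, and the 0/1 dummies D1-D3:

# Standardize the continuous predictor and the DV, leave the dummies as 0/1,
# and let the formula rebuild the product terms; the resulting "unstandardized"
# coefficients are the correct standardized ones for the original model.
d$z.Cont <- as.numeric(scale(d$Cont))
d$z.Crit <- as.numeric(scale(d$Crit))
summary(lm(z.Crit ~ z.Cont * (D1 + D2 + D3), data = d))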

Using contrasts to test simple slope patterns

A different approach to testing patterns of simple slopes is to use, instead of dummy variables, orthogonal contrasts that allow you to test a specific predicted pattern of simple slopes. For example, you could test whether the simple slopes in conditions 1 and 2 are indeed the largest and not different from each other, condition 3 has a medium slope, and condition 4 has the smallest, by using:

* Focal contrast KF.
RECODE typeOfEmployment (1,2=1)(3=0)(4=-2)(ELSE=SYSMIS) INTO KF.
* Residual contrasts Kr1 and Kr2.
RECODE typeOfEmployment (1=1)(2=-1)(3,4=0)(ELSE=SYSMIS) INTO Kr1.
RECODE typeOfEmployment (1,2,4=1)(3=-3)(ELSE=SYSMIS) INTO Kr2.
* Create the product terms.
COMPUTE P_KF_cCont = cCont*KF.
COMPUTE P_Kr1_cCont = cCont*Kr1.
COMPUTE P_Kr2_cCont = cCont*Kr2.
and then estimating
REGRESSION
  /DEPENDENT Crit
  /METHOD = ENTER cCont KF Kr1 Kr2 P_KF_cCont P_Kr1_cCont P_Kr2_cCont.

[Screenshot: coefficients table; a comma is the decimal separator in lieu of a dot.]

The regression coefficient of P_KF_cCont indicates whether the simple slopes across conditions of typeOfEmployment behave in the predicted way, and the two coefficients for P_Kr1_cCont and P_Kr2_cCont are far from any conventional significance level, so for now this can be taken to indicate that there are no systematic patterns among the simple slopes beyond the one described by KF.

Average effect of the "other" predictor

The four regression models with the different sets of dummy variables each yield a simple slope of Cont, i.e., an effect of Cont conditional on typeOfEmployment. So there is never a test of an average or "overall" effect of Cont. But we might want to know this average effect. In order to obtain it, run regression Model 0 again, but with typeOfEmployment effect coded rather than dummy coded. There are also k-1 effect codes for k levels of a polytomous variable; within a set, each code assigns 1 to the same level, -1 to one of the remaining levels, and 0 to all other levels. There are four different sets of such effect codes:

Original variable       Effect code set 1     Effect code set 2     Effect code set 3     Effect code set 4
typeOfEmployment        E1_1  E1_2  E1_3      E2_1  E2_2  E2_3      E3_1  E3_2  E3_3      E4_1  E4_2  E4_3
1 (private industry)      1     1     1        -1     0     0        -1     0     0        -1     0     0
2 (government)           -1     0     0         1     1     1         0    -1     0         0    -1     0
3 (NGO)                   0    -1     0         0    -1     0         1     1     1         0     0    -1
4 (unemployed)            0     0    -1         0     0    -1         0     0    -1         1     1     1

It does not matter which of these sets you take; they all yield the same result. Recode the original typeOfEmployment into the chosen effect code set, e.g., Effect code set 2:

RECODE typeOfEmployment (1=-1)(2=1)(3,4=0)(ELSE=SYSMIS) INTO E2_1.
RECODE typeOfEmployment (1,4=0)(2=1)(3=-1)(ELSE=SYSMIS) INTO E2_2.
RECODE typeOfEmployment (1,3=0)(2=1)(4=-1)(ELSE=SYSMIS) INTO E2_3.

then calculate the product terms again:

COMPUTE IA_cCont_E2_1 = cCont*E2_1.
COMPUTE IA_cCont_E2_2 = cCont*E2_2.
COMPUTE IA_cCont_E2_3 = cCont*E2_3.

and run the regression model with these terms:

REGRESSION
  /DEPENDENT Crit
  /METHOD = ENTER cCont E2_1 E2_2 E2_3 IA_cCont_E2_1 IA_cCont_E2_2 IA_cCont_E2_3.

[Screenshot: coefficients table; a comma is the decimal separator in lieu of a dot.]

The regression coefficient for cCont in this regression model, B = 0.14, SE = 0.72, β = .145, p = .053, is the effect of Cont on Crit averaged across all conditions of typeOfEmployment. Remember that this is not a main effect: the interaction shows that the effect of Cont is not present to the same degree regardless of the values of the other predictors in the model (which would be the definition of a main effect). The interpretation of this average effect should therefore be cautious.
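In R, the analogous average slope can be sketched with sum-to-zero contrasts; the data frame d is again a hypothetical stand-in:

# With sum-to-zero (effect) coding for the factor and the interaction in the
# model, the cCont coefficient is the slope of Cont averaged (unweighted)
# across the four employment categories.
d$toe <- factor(d$typeOfEmployment)
contrasts(d$toe) <- contr.sum(4)
coef(summary(lm(Crit ~ cCont * toe, data = d)))["cCont", ]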

20130919

SPSS code for decomposing a significant three-way interaction into two-way interactions, preserving degrees of freedom

this is a "trick" to make spss resolve two-way interactions in the presence of a significant three-way interaction in an anova. a more general way that can also deal with continuous predictors is described in aiken & west:

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Thousand Oaks: Sage.

and elaborated on in How to probe simple and complex slopes in a regression analysis with three (or more) predictors using SPSS in this here blog.
MANOVA dv BY iv1(x,y) iv2(a,b) iv3(o,p)
  /ERROR=WITHIN
  /DESIGN=iv1*iv2 WITHIN iv3(o)
          iv1*iv2 WITHIN iv3(p)
  /*[and so on...]*/.

where

dv          dependent var
iv1, iv2    independent variables
iv3         independent variable for whose levels you decompose the three-way into two-way interactions (iv1*iv2)
x           first level of iv1
y           last level of iv1
a           first level of iv2
b           last level of iv2
o           first level of iv3
p           last level of iv3

IMPORTANT: iv1 and iv2 need to be coded in consecutive integers corresponding to x,y,a,b.

You can specify any simple or higher-order effects at a level of another variable in the /DESIGN= subcommand.
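If you work in R instead, a roughly equivalent decomposition - reusing the full model's pooled error term - can be sketched with the emmeans package; the data frame d and the variable names are hypothetical:

library(emmeans)
# Fit the full factorial model (dv numeric; iv1, iv2, iv3 assumed to be factors),
# then test the effects of iv1, iv2, and iv1:iv2 separately within each level
# of iv3, using the full model's error term.
fit <- lm(dv ~ iv1 * iv2 * iv3, data = d)
joint_tests(fit, by = "iv3")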