I was teaching on a field course, and when the students started analysing their data in R, one of them noticed that switching the order of the independent variables in an lm() call
gave different results depending on the method used to compute the analysis of variance table. I wanted to investigate this further, and this is the resulting R Markdown report:
The report can be found here and is also pasted below.
Clear console
cat("\014")
Add factors for the anova analyses
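library(magrittr)  # provides %>%; assumed here, the original report may have loaded it through another package
mtcars$group <- mtcars$cyl %>% gsub(4, "A", .) %>% gsub(6, "B", .) %>% gsub(8, "C", .) %>% as.factor(.)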
mtcars$group2 <- as.factor(rep(c("D", "E", "F", "G"), times = 8))
mtcars$group3 <- as.factor(rep(c("H", "I"), times = 16))
mtcars$group4 <- as.factor(rep(c("J", "J", "K", "K", "L", "L", "M", "M", "J", "K", "L", "M", "M", "L", "K", "J"), times = 2))
head(mtcars)
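Since the grouping factors are built by hand, it can help to tabulate them before fitting any models. The following is a minimal sketch in base R only; the copy `dat` and the `n_*` names are assumptions for illustration, and `factor()` with `levels`/`labels` is swapped in for the `gsub()` pipeline so that no extra package is needed.

```r
# Work on a copy so the built-in mtcars data set is left untouched
dat <- mtcars

# Same recoding as above (4 -> A, 6 -> B, 8 -> C), written with factor()
dat$group <- factor(dat$cyl, levels = c(4, 6, 8), labels = c("A", "B", "C"))
dat$group2 <- factor(rep(c("D", "E", "F", "G"), times = 8))

# Number of cars in each level of each factor
n_group <- table(dat$group)
n_group2 <- table(dat$group2)

print(n_group)
print(n_group2)
```

The levels of `group` inherit the unequal cylinder counts in `mtcars`, while `group2` has the same number of rows in every level by construction.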
summary(mod1)
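library(car)  # provides Anova()
mod <- lm(y ~ continuous + grouping)  # schematic: y, continuous and grouping are placeholders
Anova(mod, type = "II")  # Type II sums of squares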
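The order effect the student noticed can be reproduced with a small sketch. It is not the code from the report: the choice of `mpg`, `wt` and `cyl` from `mtcars`, and the names `dat`, `mod_a`, `mod_b` and `ss_*`, are assumptions for illustration. To stay within base R, `drop1()` is used for the marginal tests in place of `car::Anova()`; for a model with main effects only, its sums of squares correspond to Type II.

```r
# Continuous predictor (wt) plus a grouping factor (cyl), on a copy of mtcars
dat <- mtcars
dat$cyl_f <- factor(dat$cyl)

# The same two predictors, entered in opposite orders
mod_a <- lm(mpg ~ wt + cyl_f, data = dat)
mod_b <- lm(mpg ~ cyl_f + wt, data = dat)

# Sequential (Type I) tables: each term is adjusted only for the terms before it
seq_a <- anova(mod_a)
seq_b <- anova(mod_b)

# Marginal tables: each term is adjusted for all other terms in the model
marg_a <- drop1(mod_a, test = "F")
marg_b <- drop1(mod_b, test = "F")

# Sum of squares attributed to wt under each approach and each ordering
ss_wt_seq_a <- seq_a["wt", "Sum Sq"]
ss_wt_seq_b <- seq_b["wt", "Sum Sq"]
ss_wt_marg_a <- marg_a["wt", "Sum of Sq"]
ss_wt_marg_b <- marg_b["wt", "Sum of Sq"]

print(c(sequential_a = ss_wt_seq_a, sequential_b = ss_wt_seq_b))
print(c(marginal_a = ss_wt_marg_a, marginal_b = ss_wt_marg_b))
```

The sequential sum of squares for `wt` changes with the order of the terms because `wt` and `cyl_f` are correlated, whereas the marginal value is the same for both orderings and matches the sequential value from the model in which `wt` is entered last.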