In a scientific study, post hoc analysis (from Latin post hoc, "after this") consists of statistical analyses that were specified after the data were seen.[1][2] Such analyses are usually used to uncover specific differences between three or more group means when an analysis of variance (ANOVA) test is significant.[3] Because each potential analysis is effectively a statistical test, examining many of them typically creates a multiple testing problem. Multiple testing procedures are sometimes used to compensate, but precise compensation is often difficult or impossible. Post hoc analysis that is conducted and interpreted without adequate consideration of this problem is sometimes called data dredging by critics because the statistical associations that it finds are often spurious.[4]
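The scale of the multiple testing problem can be illustrated with a short calculation. The following is a minimal sketch, not a recommended procedure: it assumes that every pairwise comparison between group means is tested, that the tests are independent (which pairwise comparisons on shared groups generally are not), and that each test uses the same significance level. The function names are hypothetical and chosen only for this illustration. The Bonferroni correction shown is one common multiple testing procedure among several, and it tends to be conservative.

```python
from math import comb


def pairwise_comparisons(k: int) -> int:
    """Number of distinct pairs among k group means: k choose 2."""
    return comb(k, 2)


def familywise_error_rate(m: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across m tests.

    Assumption: the m tests are independent and every null hypothesis
    is true. Real post hoc comparisons are usually correlated, so this
    is an illustration rather than an exact figure.
    """
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(m: int, alpha: float = 0.05) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / m


if __name__ == "__main__":
    for k in (3, 5, 10):
        m = pairwise_comparisons(k)
        print(
            f"{k} groups: {m} comparisons, "
            f"uncorrected family-wise error rate {familywise_error_rate(m):.3f}, "
            f"Bonferroni per-test alpha {bonferroni_alpha(m):.5f}"
        )
```

With five groups there are 10 pairwise comparisons, and under the independence assumption the chance of at least one false positive at a per-test level of 0.05 is roughly 40%. This is why unadjusted post hoc comparisons so often yield spurious associations.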
Post hoc analyses are not inherently bad or good;[5]: 12–13 rather, the main requirement for their ethical use is simply that their results not be misrepresented as the original hypothesis.[5]: 12–13 Modern editions of scientific manuals have clarified this point; for example, APA style now specifies that "hypotheses should now be stated in three groupings: preplanned–primary, preplanned–secondary, and exploratory (post hoc). Exploratory hypotheses are allowable, and there should be no pressure to disguise them as if they were preplanned."[5]: 12–13