In statistics, confirmatory factor analysis (CFA) is a special form of factor analysis, most commonly used in social science research.[1] It is used to test whether measures of a construct are consistent with a researcher's understanding of the nature of that construct (or factor). As such, the objective of confirmatory factor analysis is to test whether the data fit a hypothesized measurement model. This hypothesized model is based on theory and/or previous analytic research.[2] CFA was first developed by Jöreskog (1969)[3] and has built upon and replaced older methods of analyzing construct validity such as the MTMM Matrix as described in Campbell & Fiske (1959).[4]
In confirmatory factor analysis, the researcher first develops a hypothesis about what factors they believe are underlying the measures used (e.g., "Depression" being the factor underlying the Beck Depression Inventory and the Hamilton Rating Scale for Depression) and may impose constraints on the model based on these a priori hypotheses. By imposing these constraints, the researcher is forcing the model to be consistent with their theory. For example, if it is posited that there are two factors accounting for the covariance in the measures, and that these factors are unrelated to each other, the researcher can create a model where the correlation between factor A and factor B is constrained to zero. Model fit measures could then be obtained to assess how well the proposed model captured the covariance between all the items or measures in the model. If the constraints the researcher has imposed on the model are inconsistent with the sample data, then the results of statistical tests of model fit will indicate a poor fit, and the model will be rejected. If the fit is poor, it may be due to some items measuring multiple factors. It might also be that some items within a factor are more related to each other than others.
For some applications, the requirement of "zero loadings" (for indicators not supposed to load on a certain factor) has been regarded as too strict. A newly developed analysis method, "exploratory structural equation modeling", specifies hypotheses about the relation between observed indicators and their supposed primary latent factors while allowing for estimation of loadings with other latent factors as well.[5]