In causal models, controlling for a variable means binning data according to measured values of the variable. This is typically done so that the variable can no longer act as a confounder in, for example, an observational study or experiment.
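As a minimal sketch of this binning, the example below stratifies a small made-up data set by a binary variable z and compares treated and untreated units within each bin. The records, the variable names (z, x, y) and the helper functions are all hypothetical, chosen only for illustration; the sketch also assumes every bin contains both treated and untreated units.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records (z, x, y): z is the variable being controlled for,
# x is a binary treatment, y is the outcome. Within each value of z the
# treated units score exactly 1 higher, but z also raises y and makes
# treatment less likely.
records = (
    [(0, 1, 11.0)] * 4 + [(0, 0, 10.0)] * 1 +
    [(1, 1, 21.0)] * 1 + [(1, 0, 20.0)] * 4
)

def naive_difference(rows):
    """Mean outcome of treated minus untreated, ignoring z."""
    treated = [y for _, x, y in rows if x == 1]
    untreated = [y for _, x, y in rows if x == 0]
    return mean(treated) - mean(untreated)

def stratified_difference(rows):
    """Bin rows by z, compare within each bin, then average by bin size."""
    strata = defaultdict(list)
    for z, x, y in rows:
        strata[z].append((z, x, y))
    total = len(rows)
    return sum(len(group) / total * naive_difference(group)
               for group in strata.values())

naive = naive_difference(records)          # pooled comparison, confounded by z
adjusted = stratified_difference(records)  # comparison after controlling for z
print(naive, adjusted)
```

In this constructed data the pooled comparison has the opposite sign to the within-bin comparison, because z influences both treatment and outcome.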
When estimating the effect of explanatory variables on an outcome by regression, controlled-for variables are included as inputs in order to separate their effects from those of the explanatory variables.[1]
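The regression case can be sketched in the same spirit. The example below fits ordinary least squares by hand on made-up, noise-free data in which y = 2x + 3z and x is correlated with z. The data, the names x, z, y and the helper functions are assumptions for illustration, not taken from any cited source.

```python
from statistics import mean

def centered(values):
    m = mean(values)
    return [v - m for v in values]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def simple_slope(x, y):
    """OLS slope of y on x alone (with intercept)."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) / dot(xc, xc)

def two_regressor_slopes(x, z, y):
    """OLS slopes of y on x and z (with intercept), solved from the
    2x2 normal equations of the centered data using Cramer's rule."""
    xc, zc, yc = centered(x), centered(z), centered(y)
    sxx, szz, sxz = dot(xc, xc), dot(zc, zc), dot(xc, zc)
    sxy, szy = dot(xc, yc), dot(zc, yc)
    det = sxx * szz - sxz * sxz
    b_x = (sxy * szz - sxz * szy) / det
    b_z = (szy * sxx - sxz * sxy) / det
    return b_x, b_z

# Hypothetical data: z is the controlled-for variable, x the explanatory one.
z = [0, 0, 1, 1, 2, 2]
x = [0, 1, 1, 2, 2, 3]
y = [2 * xi + 3 * zi for xi, zi in zip(x, z)]

unadjusted = simple_slope(x, y)          # absorbs part of the effect of z
b_x, b_z = two_regressor_slopes(x, z, y) # z included as an input
print(unadjusted, b_x, b_z)
```

With z omitted, the slope on x is inflated because x partly stands in for z; including z as an input recovers the coefficients used to build the data.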
A limitation of controlling for variables is that a causal model is needed to identify important confounders (the backdoor criterion is used for this identification). Without such a model, a possible confounder might go unnoticed. A related problem is that controlling for a variable that is not a real confounder may turn other variables (possibly ones not taken into account) into confounders when they were not confounders before. In other cases, controlling for a non-confounding variable may cause the true causal effect of the explanatory variables on an outcome to be underestimated (e.g. when controlling for a mediator or its descendant).[2][3] Counterfactual reasoning mitigates the influence of confounders without this drawback.[3]
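The mediator case can be illustrated with a small constructed example. In the hypothetical data below there is no confounding: x is split evenly, x shifts a binary mediator m, and the outcome is y = 10m + x. All names and numbers are assumptions chosen to keep the arithmetic easy to follow.

```python
from statistics import mean

# Hypothetical (x, m) pairs: x = 1 makes the mediator m = 1 more likely.
pairs = (
    [(0, 0)] * 4 + [(0, 1)] * 1 +
    [(1, 0)] * 1 + [(1, 1)] * 4
)
# The outcome depends on x directly (+1) and through the mediator (+10 per unit of m).
data = [(x, m, 10.0 * m + 1.0 * x) for x, m in pairs]

def mean_y(x_value, m_value=None):
    """Mean outcome for a given x, optionally restricted to one bin of m."""
    return mean(y for x, m, y in data
                if x == x_value and (m_value is None or m == m_value))

# Without controlling for m: the total effect of x, direct plus mediated.
total_effect = mean_y(1) - mean_y(0)

# Controlling for m: compare within each bin of m, then average by bin size.
within = {m: mean_y(1, m) - mean_y(0, m) for m in (0, 1)}
share = {m: sum(1 for _, mm, _ in data if mm == m) / len(data) for m in (0, 1)}
controlled = sum(share[m] * within[m] for m in (0, 1))

print(total_effect, controlled)
```

Here the comparison within bins of m keeps only the direct part of the effect, so it understates the total effect of x on y.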