Matching (statistics)

Matching is a statistical technique that evaluates the effect of a treatment by comparing treated and non-treated units in an observational study or quasi-experiment (i.e. when the treatment is not randomly assigned). The goal of matching is to reduce bias in the estimated treatment effect by finding, for every treated unit, one or more non-treated units with similar observable characteristics, so that the covariates are balanced between the two groups (the search for similar units resembles the K-nearest neighbors algorithm). Comparing outcomes between treated units and their matched non-treated units then yields an estimate of the treatment effect with reduced bias due to confounding.[1][2][3] Propensity score matching, an early matching technique, was developed as part of the Rubin causal model,[4] but it has been argued to increase imbalance, inefficiency, model dependence, and bias, and on that basis is no longer recommended over other matching methods.[5] One alternative is coarsened exact matching (CEM), a simple, easy-to-understand, and statistically powerful method of matching.[6]
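The basic idea can be illustrated with a minimal sketch of one-to-one nearest-neighbor matching with replacement. Everything in it is an assumption made for illustration: the toy data, the record layout (`x` for covariates, `y` for the outcome), the function names, and the use of plain Euclidean distance. In practice covariates are usually standardized or compared with another distance (for example Mahalanobis distance), and this sketch is not the procedure of any particular cited paper.

```python
import math

# Hypothetical toy data: "x" holds covariates, "y" the observed outcome.
treated = [
    {"x": (25.0, 1.0), "y": 12.0},
    {"x": (40.0, 3.0), "y": 20.0},
]
controls = [
    {"x": (24.0, 1.1), "y": 10.0},
    {"x": (41.0, 2.9), "y": 17.0},
    {"x": (60.0, 5.0), "y": 30.0},
]


def nearest_neighbor_match(treated, controls):
    """Pair each treated unit with its closest control (with replacement)."""
    pairs = []
    for t_idx, t in enumerate(treated):
        # Index of the control whose covariates are nearest in Euclidean distance.
        c_idx = min(
            range(len(controls)),
            key=lambda c: math.dist(t["x"], controls[c]["x"]),
        )
        pairs.append((t_idx, c_idx))
    return pairs


def matched_effect(treated, controls, pairs):
    """Average outcome difference over matched pairs (effect on the treated)."""
    diffs = [treated[t]["y"] - controls[c]["y"] for t, c in pairs]
    return sum(diffs) / len(diffs)


def mean_outcome(units):
    return sum(u["y"] for u in units) / len(units)


pairs = nearest_neighbor_match(treated, controls)
att = matched_effect(treated, controls, pairs)
# Naive comparison that ignores covariates, shown for contrast.
naive = mean_outcome(treated) - mean_outcome(controls)

print("matched pairs:", pairs)
print("matched estimate:", att)
print("naive difference:", naive)
```

In this toy data the third control unit has very different covariates from either treated unit, so it is never selected as a match. The naive difference in means, which includes it, differs in sign from the matched estimate, which is the kind of confounding bias matching is meant to reduce.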

Matching has been promoted by Donald Rubin.[4] It was prominently criticized in economics by Robert LaLonde (1986),[7] who compared treatment-effect estimates from a randomized experiment with estimates of the same effect produced by matching methods and found the matching estimates to be biased. Rajeev Dehejia and Sadek Wahba (1999) reevaluated LaLonde's critique and argued that matching, specifically with propensity score methods, can produce estimates close to the experimental benchmark.[8] Critiques similar to LaLonde's have been raised in political science[9] and sociology[10] journals.

  1. Rubin, Donald B. (1973). "Matching to Remove Bias in Observational Studies". Biometrics. 29 (1): 159–183. doi:10.2307/2529684. JSTOR 2529684.
  2. Anderson, Dallas W.; Kish, Leslie; Cornell, Richard G. (1980). "On Stratification, Grouping and Matching". Scandinavian Journal of Statistics. 7 (2): 61–66. JSTOR 4615774.
  3. Kupper, Lawrence L.; Karon, John M.; Kleinbaum, David G.; Morgenstern, Hal; Lewis, Donald K. (1981). "Matching in Epidemiologic Studies: Validity and Efficiency Considerations". Biometrics. 37 (2): 271–291. CiteSeerX 10.1.1.154.1197. doi:10.2307/2530417. JSTOR 2530417. PMID 7272415.
  4. Rosenbaum, Paul R.; Rubin, Donald B. (1983). "The Central Role of the Propensity Score in Observational Studies for Causal Effects". Biometrika. 70 (1): 41–55. doi:10.1093/biomet/70.1.41.
  5. King, Gary; Nielsen, Richard (October 2019). "Why Propensity Scores Should Not Be Used for Matching". Political Analysis. 27 (4): 435–454. doi:10.1017/pan.2019.11. hdl:1721.1/128459. ISSN 1047-1987.
  6. Iacus, Stefano M.; King, Gary; Porro, Giuseppe (2011). "Multivariate Matching Methods That Are Monotonic Imbalance Bounding". Journal of the American Statistical Association. 106 (493): 345–361. doi:10.1198/jasa.2011.tm09599. hdl:2434/151476. ISSN 0162-1459. S2CID 14790456.
  7. LaLonde, Robert J. (1986). "Evaluating the Econometric Evaluations of Training Programs with Experimental Data". American Economic Review. 76 (4): 604–620. JSTOR 1806062.
  8. Dehejia, R. H.; Wahba, S. (1999). "Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs" (PDF). Journal of the American Statistical Association. 94 (448): 1053–1062. doi:10.1080/01621459.1999.10473858.
  9. Arceneaux, Kevin; Gerber, Alan S.; Green, Donald P. (2006). "Comparing Experimental and Matching Methods Using a Large-Scale Field Experiment on Voter Mobilization". Political Analysis. 14 (1): 37–62. doi:10.1093/pan/mpj001.
  10. Arceneaux, Kevin; Gerber, Alan S.; Green, Donald P. (2010). "A Cautionary Note on the Use of Matching to Estimate Causal Effects: An Empirical Example Comparing Matching Estimates to an Experimental Benchmark". Sociological Methods & Research. 39 (2): 256–282. doi:10.1177/0049124110378098. S2CID 37012563.