Data augmentation

Data augmentation is a statistical technique that enables maximum likelihood estimation from incomplete data.[1][2] It has important applications in Bayesian analysis,[3] and it is widely used in machine learning to reduce overfitting,[4] which is achieved by training models on several slightly modified copies of existing data.
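
The machine-learning usage can be illustrated with a minimal sketch. It assumes a toy dataset of tiny grayscale images stored as nested lists with pixel values in [0, 1]. The function names (`horizontal_flip`, `add_noise`, `augment`), the noise scale, and the sample data are hypothetical choices made for illustration and do not come from the cited sources.

```python
import random


def horizontal_flip(image):
    """Return a copy of the image with each row reversed (left-right mirror)."""
    return [row[::-1] for row in image]


def add_noise(image, rng, scale=0.05):
    """Return a copy with small Gaussian noise per pixel, clipped to [0, 1]."""
    return [
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, scale))) for pixel in row]
        for row in image
    ]


def augment(dataset, copies=2, seed=0):
    """Expand (image, label) pairs with slightly modified copies.

    Each original is kept, and `copies` perturbed versions are added.
    Labels are carried over unchanged.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    augmented = []
    for image, label in dataset:
        augmented.append((image, label))
        for _ in range(copies):
            # Flip about half of the copies, then add noise to every copy.
            base = horizontal_flip(image) if rng.random() < 0.5 else image
            augmented.append((add_noise(base, rng), label))
    return augmented


# Hypothetical toy data: two 2x3 grayscale "images" with labels.
dataset = [
    ([[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]], "a"),
    ([[1.0, 1.0, 0.0], [0.0, 0.3, 0.9]], "b"),
]
augmented = augment(dataset)
print(len(dataset), "->", len(augmented))  # prints "2 -> 6"
```

A model would then be trained on `augmented` rather than `dataset`. In practice the transformations are chosen so that they do not change the correct label, which is why the label is copied unchanged here.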

  1. Dempster, A.P.; Laird, N.M.; Rubin, D.B. (1977). "Maximum Likelihood from Incomplete Data Via the EM Algorithm". Journal of the Royal Statistical Society. Series B (Methodological). 39 (1): 1–22. doi:10.1111/j.2517-6161.1977.tb01600.x. Archived from the original on 2022-10-10. Retrieved 2024-08-28.
  2. Rubin, Donald (1987). "Comment: The Calculation of Posterior Distributions by Data Augmentation". Journal of the American Statistical Association. 82 (398). doi:10.2307/2289460. JSTOR 2289460. Archived from the original on 2024-08-07. Retrieved 2024-08-28.
  3. Jackman, Simon (2009). Bayesian Analysis for the Social Sciences. John Wiley & Sons. p. 236. ISBN 978-0-470-01154-6.
  4. Shorten, Connor; Khoshgoftaar, Taghi M. (2019). "A survey on Image Data Augmentation for Deep Learning". Journal of Big Data. 6. Springer: 60. doi:10.1186/s40537-019-0197-0.