Earth mover's distance

In computer science, the earth mover's distance (EMD)[1] is a measure of dissimilarity between two frequency distributions, densities, or measures, over a metric space D. Informally, if the distributions are interpreted as two different ways of piling up earth (dirt) over D, the EMD captures the minimum cost of building the smaller pile using dirt taken from the larger, where cost is defined as the amount of dirt moved multiplied by the distance over which it is moved.

Over probability distributions, the earth mover's distance is also known as the Wasserstein metric , KantorovichRubinstein metric, or Mallows's distance.[2] It is the solution of the optimal transport problem, which in turn is also known as the Monge-Kantorovich problem, or sometimes the HitchcockKoopmans transportation problem;[3] when the measures are uniform over a set of discrete elements, the same optimization problem is known as minimum weight bipartite matching.

  1. ^ Rubner, Y.; Tomasi, C.; Guibas, L.J. (1998). "A metric for distributions with applications to image databases". Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271). Narosa Publishing House. pp. 59–66. doi:10.1109/iccv.1998.710701. ISBN 81-7319-221-9. S2CID 18648233.
  2. ^ C. L. Mallows (1972). "A note on asymptotic joint normality". Annals of Mathematical Statistics. 43 (2): 508–515. doi:10.1214/aoms/1177692631.
  3. ^ Singiresu S. Rao (2009). Engineering Optimization: Theory and Practice (4th ed.). John Wiley & Sons. p. 221. ISBN 978-0-470-18352-6.