Mahalanobis distance

The Mahalanobis distance is a measure of the distance between a point P and a distribution D, introduced by P. C. Mahalanobis in 1933.[1] The mathematical details of the Mahalanobis distance first appeared in the Journal of the Asiatic Society of Bengal in 1933.[2] Mahalanobis's definition was prompted by the problem of identifying similarities between skulls based on measurements (the earliest related work is from 1922, and a later one is from 1927).[3][4] R. C. Bose later obtained the sampling distribution of the Mahalanobis distance, under the assumption of equal dispersion.[5]

Its square is a multivariate generalization of the square of the standard score z = (x − μ)/σ; the distance itself measures how many standard deviations away a point P is from the mean of a distribution D. This distance is zero for P at the mean of D and grows as P moves away from the mean along each principal component axis. If each of these axes is re-scaled to have unit variance, then the Mahalanobis distance corresponds to standard Euclidean distance in the transformed space. The Mahalanobis distance is thus unitless, scale-invariant, and takes into account the correlations of the data set.
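
The properties described above can be checked numerically. The following is a minimal sketch, assuming the standard definition d(x) = sqrt((x − μ)ᵀ Σ⁻¹ (x − μ)) for a distribution with mean μ and covariance matrix Σ, and assuming NumPy is available. The function names `mahalanobis` and `whiten`, and the sample mean, covariance, and point, are hypothetical choices made for illustration only.

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Distance from point x to a distribution with the given mean and covariance."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    # Solve cov @ y = diff rather than forming the inverse explicitly.
    y = np.linalg.solve(cov, diff)
    return float(np.sqrt(diff @ y))

def whiten(x, mean, cov):
    """Express x - mean in principal-axis coordinates, each rescaled to unit variance."""
    eigvals, eigvecs = np.linalg.eigh(cov)  # principal axes of a symmetric matrix
    centered = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return (eigvecs.T @ centered) / np.sqrt(eigvals)

# Illustrative (made-up) two-dimensional distribution with correlated components.
mean = np.array([1.0, 2.0])
cov = np.array([[4.0, 1.2],
                [1.2, 1.0]])
x = np.array([3.0, 1.0])

d = mahalanobis(x, mean, cov)

# Euclidean length of the same point in the whitened space.
d_euclid = float(np.linalg.norm(whiten(x, mean, cov)))

# Change the units of each coordinate independently; the distance should not change.
S = np.diag([10.0, 0.5])
d_rescaled = mahalanobis(S @ x, S @ mean, S @ cov @ S)

print(d, d_euclid, d_rescaled)
```

Under these assumptions the three printed values should agree up to floating-point error, which illustrates both the equivalence with Euclidean distance after rescaling the principal axes and the invariance under a change of units. In one dimension the same function reduces to |x − μ|/σ, the absolute value of the standard score.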

  1. "Reprint of: Mahalanobis, P.C. (1936) 'On the Generalised Distance in Statistics.'". Sankhya A. 80 (1): 1–7. 2018-12-01. doi:10.1007/s13171-019-00164-5. ISSN 0976-8378.
  2. Journal and Proceedings of the Asiatic Society of Bengal, Vol. XXVI. Calcutta: Asiatic Society of Bengal. 1933.
  3. Mahalanobis, Prasanta Chandra (1922). Anthropological Observations on the Anglo-Indians of Calcutta: Analysis of Male Stature.
  4. Mahalanobis, Prasanta Chandra (1927). "Analysis of race mixture in Bengal". Journal and Proceedings of the Asiatic Society of Bengal. 23: 301–333.
  5. Science and Culture (1935–36), Vol. 1. Indian Science News Association. 1935. pp. 205–206.