Feature hashing

In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix.[1][2] It works by applying a hash function to the features and using their hash values as indices directly (after a modulo operation), rather than looking the indices up in an associative array. In addition to its use for encoding non-numeric values, feature hashing can also be used for dimensionality reduction.[2]
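
The mechanism described above can be illustrated with a minimal sketch. It is not taken from the cited sources: the function names `hash_index` and `hash_features`, the parameter `n_buckets`, and the choice of MD5 as the hash function are all assumptions made for illustration. Any stable hash function would serve the same purpose.

```python
import hashlib


def hash_index(feature: str, n_buckets: int) -> int:
    """Map a feature string to an index in [0, n_buckets).

    MD5 is used here only because it is stable across runs (an assumption
    for this sketch); Python's built-in hash() is salted per process for str.
    """
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then reduce modulo the size.
    return int.from_bytes(digest[:8], "little") % n_buckets


def hash_features(tokens, n_buckets: int = 16):
    """Build a fixed-length count vector without an associative array."""
    vec = [0] * n_buckets
    for tok in tokens:
        # The hash value, after the modulo operation, is used directly as the index.
        vec[hash_index(tok, n_buckets)] += 1
    return vec


if __name__ == "__main__":
    print(hash_features(["cat", "dog", "cat"], n_buckets=8))
```

Because the output length is fixed by `n_buckets` rather than by the number of distinct features, distinct features can collide in the same index. This is the price paid for avoiding the lookup table.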

This trick is often attributed to Weinberger et al. (2009),[2] but John Moody published a much earlier description of the method in 1989.[1]

  1. ^ a b Moody, John (1989). "Fast learning in multi-resolution hierarchies" (PDF). Advances in Neural Information Processing Systems.
  2. ^ a b c Weinberger, Kilian; Dasgupta, Anirban; Langford, John; Smola, Alex; Attenberg, Josh (2009). "Feature Hashing for Large Scale Multitask Learning". Proc. ICML.