Feature hashing

In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix.^[1]^[2] It works by applying a hash function to the features and using their hash values as indices directly (after a modulo operation), rather than looking the indices up in an associative array. In addition to its use for encoding non-numeric values, feature hashing can also be used for dimensionality reduction.^[2]
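The idea can be illustrated with a minimal Python sketch. The names `hash_index`, `hashing_vectorizer`, and `n_buckets` are hypothetical, and MD5 is an arbitrary choice made here only because it gives a hash that is stable across runs; practical implementations typically use a faster non-cryptographic hash function.

```python
import hashlib


def hash_index(token: str, n_buckets: int) -> int:
    """Map a token to a vector index: hash it, then reduce modulo the vector length."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    value = int.from_bytes(digest[:8], "little")
    return value % n_buckets


def hashing_vectorizer(tokens, n_buckets: int = 16):
    """Build a fixed-length count vector without storing a token-to-index dictionary."""
    vec = [0] * n_buckets
    for tok in tokens:
        # The hash value is used as the index directly; no lookup table is needed.
        vec[hash_index(tok, n_buckets)] += 1
    return vec


if __name__ == "__main__":
    print(hashing_vectorizer("the quick brown fox jumps over the lazy dog".split(), n_buckets=8))
```

Because the vector length is fixed in advance, distinct tokens may collide in the same index; choosing a smaller `n_buckets` trades more collisions for lower dimensionality.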

This trick is often attributed to Weinberger et al. (2009),^[2] but there exists a much earlier description of this method published by John Moody in 1989.^[1]

[1] Moody, John (1989). "Fast learning in multi-resolution hierarchies". Advances in Neural Information Processing Systems.
[2] Weinberger, Kilian; Dasgupta, Anirban; Langford, John; Smola, Alex; Attenberg, Josh (2009). "Feature Hashing for Large Scale Multitask Learning". Proc. ICML.
