Count sketch

Count sketch is a type of dimensionality reduction that is particularly efficient in statistics, machine learning and algorithms.[1][2] It was invented by Moses Charikar, Kevin Chen and Martin Farach-Colton[3] in an effort to speed up the AMS Sketch by Alon, Matias and Szegedy for approximating the frequency moments of streams[4] (these calculations require counting of the number of occurrences for the distinct elements of the stream).

The sketch is nearly identical[citation needed] to the Feature hashing algorithm by John Moody,[5] but differs in its use of hash functions with low dependence, which makes it more practical. In order to still have a high probability of success, the median trick is used to aggregate multiple count sketches, rather than the mean.

These properties allow use for explicit kernel methods, bilinear pooling in neural networks and is a cornerstone in many numerical linear algebra algorithms.[6]

  1. ^ Faisal M. Algashaam; Kien Nguyen; Mohamed Alkanhal; Vinod Chandran; Wageeh Boles. "Multispectral Periocular Classification WithMultimodal Compact Multi-Linear Pooling" [1]. IEEE Access, Vol. 5. 2017.
  2. ^ Ahle, Thomas; Knudsen, Jakob (2019-09-03). "Almost Optimal Tensor Sketch". ResearchGate. Retrieved 2020-07-11.
  3. ^ Charikar, Chen & Farach-Colton 2004.
  4. ^ Alon, Noga, Yossi Matias, and Mario Szegedy. "The space complexity of approximating the frequency moments." Journal of Computer and system sciences 58.1 (1999): 137-147.
  5. ^ Moody, John. "Fast learning in multi-resolution hierarchies." Advances in neural information processing systems. 1989.
  6. ^ Woodruff, David P. "Sketching as a Tool for Numerical Linear Algebra." Theoretical Computer Science 10.1-2 (2014): 1–157.