Asymmetric numeral systems

Asymmetric numeral systems (ANS)[1][2] is a family of entropy encoding methods introduced by Jarosław (Jarek) Duda[3] from Jagiellonian University, used in data compression since 2014[4] due to improved performance compared to previous methods.[5] ANS combines the compression ratio of arithmetic coding (which uses a nearly accurate probability distribution), with a processing cost similar to that of Huffman coding. In the tabled ANS (tANS) variant, this is achieved by constructing a finite-state machine to operate on a large alphabet without using multiplication.

Among others, ANS is used in the Facebook Zstandard compressor[6][7] (also used e.g. in Linux kernel,[8] Google Chrome browser,[9] Android[10] operating system, was published as RFC 8478 for MIME[11] and HTTP[12]), Apple LZFSE compressor,[13] Google Draco 3D compressor[14] (used e.g. in Pixar Universal Scene Description format[15]) and PIK image compressor,[16] CRAM DNA compressor[17] from SAMtools utilities,[18] NVIDIA nvCOMP high speed compression library,[19] Dropbox DivANS compressor,[20] Microsoft DirectStorage BCPack texture compressor,[21] and JPEG XL[22] image compressor.

The basic idea is to encode information into a single natural number . In the standard binary number system, we can add a bit of information to by appending at the end of , which gives us . For an entropy coder, this is optimal if . ANS generalizes this process for arbitrary sets of symbols with an accompanying probability distribution . In ANS, if the information from is appended to to result in , then . Equivalently, , where is the number of bits of information stored in the number , and is the number of bits contained in the symbol .

For the encoding rule, the set of natural numbers is split into disjoint subsets corresponding to different symbols – like into even and odd numbers, but with densities corresponding to the probability distribution of the symbols to encode. Then to add information from symbol into the information already stored in the current number , we go to number being the position of the -th appearance from the -th subset.

There are alternative ways to apply it in practice – direct mathematical formulas for encoding and decoding steps (uABS and rANS variants), or one can put the entire behavior into a table (tANS variant). Renormalization is used to prevent going to infinity – transferring accumulated bits to or from the bitstream.

  1. ^ J. Duda, K. Tahboub, N. J. Gadil, E. J. Delp, The use of asymmetric numeral systems as an accurate replacement for Huffman coding, Picture Coding Symposium, 2015.
  2. ^ J. Duda, Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding, arXiv:1311.2540, 2013.
  3. ^ "Dr Jarosław Duda (Jarek Duda)". Institute of Theoretical Physics. Jagiellonian University in Krakow. Retrieved 2021-08-02.
  4. ^ Duda, Jarek (October 6, 2019). "List of compressors using ANS, implementations and other materials". Retrieved October 6, 2019.
  5. ^ "Google Accused of Trying to Patent Public Domain Technology". Bleeping Computer. September 11, 2017.
  6. ^ Smaller and faster data compression with Zstandard, Facebook, August 2016.
  7. ^ 5 ways Facebook improved compression at scale with Zstandard, Facebook, December 2018.
  8. ^ Zstd Compression For Btrfs & Squashfs Set For Linux 4.14, Already Used Within Facebook, Phoronix, September 2017.
  9. ^ New in Chrome 123 (Content-Encoding), Google, March 2024.
  10. ^ "Zstd in Android P release". Archived from the original on 2020-08-26. Retrieved 2019-05-29.
  11. ^ Zstandard Compression and The application/zstd Media Type (email standard).
  12. ^ Hypertext Transfer Protocol (HTTP) Parameters, IANA.
  13. ^ Apple Open-Sources its New Compression Algorithm LZFSE, InfoQ, July 2016.
  14. ^ Google Draco 3D compression library.
  15. ^ Google and Pixar add Draco Compression to Universal Scene Description (USD) Format .
  16. ^ Google PIK: new lossy image format for the internet.
  17. ^ CRAM format specification (version 3.0).
  18. ^ Chen W, Elliott LT (2021). "Compression for population genetic data through finite-state entropy". J Bioinform Comput Biol. 19 (5): 2150026. doi:10.1142/S0219720021500268. PMID 34590992.
  19. ^ High Speed Data Compression Using NVIDIA GPUs.
  20. ^ Building better compression together with DivANS.
  21. ^ Microsoft DirectStorage overview.
  22. ^ Rhatushnyak, Alexander; Wassenberg, Jan; Sneyers, Jon; Alakuijala, Jyrki; Vandevenne, Lode; Versari, Luca; Obryk, Robert; Szabadka, Zoltan; Kliuchnikov, Evgenii; Comsa, Iulia-Maria; Potempa, Krzysztof; Bruse, Martin; Firsching, Moritz; Khasanova, Renata; Ruud van Asseldonk; Boukortt, Sami; Gomez, Sebastian; Fischbacher, Thomas (2019). "Committee Draft of JPEG XL Image Coding System". arXiv:1908.03565 [eess.IV].