Original author(s) | Heng Li |
---|---|
Developer(s) | John Marshall, Petr Danecek, et al.[1] |
Initial release | 2009 |
Stable release | 1.20 / April 15, 2024[2] |
Repository | |
Written in | C |
Operating system | Unix-like |
Type | Bioinformatics |
License | BSD, MIT |
Website | www |
SAMtools is a set of utilities for interacting with and post-processing short DNA sequence read alignments in the SAM (Sequence Alignment/Map), BAM (Binary Alignment/Map) and CRAM formats, written by Heng Li. These files are generated as output by short read aligners such as BWA. Both simple and advanced tools are provided, supporting complex tasks such as variant calling and alignment viewing as well as sorting, indexing, data extraction and format conversion.[3]

SAM files can be very large (tens of gigabytes is common), so compression is used to save space. SAM files are human-readable text files and BAM files are their binary equivalent, whilst CRAM files use a restructured, column-oriented binary container format. BAM files are typically compressed and are more efficient for software to work with than SAM files. SAMtools makes it possible to work directly with a compressed BAM file without having to decompress the whole file. Additionally, because the SAM/BAM format is somewhat complex (it contains reads, references, alignments, quality information, and user-specified annotations), SAMtools reduces the effort needed to use these files by hiding low-level details.
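To give a sense of the low-level detail involved, the following is a minimal Python sketch of how one alignment line of a SAM text file could be split into its eleven mandatory tab-separated columns plus optional `TAG:TYPE:VALUE` annotations. It is not part of SAMtools or HTSlib: the names `SamRecord` and `parse_sam_line` and the example read are hypothetical, and only the integer (`i`) and float (`f`) tag types are converted, with every other type left as a string.

```python
from dataclasses import dataclass, field


@dataclass
class SamRecord:
    """Hypothetical container for one SAM alignment line (illustration only)."""
    qname: str   # read (query) name
    flag: int    # bitwise flags
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description (CIGAR string)
    rnext: str   # reference name of the mate/next read
    pnext: int   # position of the mate/next read
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # per-base qualities, ASCII-encoded
    tags: dict = field(default_factory=dict)  # optional annotations

    @property
    def is_unmapped(self) -> bool:
        return bool(self.flag & 0x4)   # flag bit 0x4: read is unmapped

    @property
    def is_reverse(self) -> bool:
        return bool(self.flag & 0x10)  # flag bit 0x10: read is on the reverse strand


def parse_sam_line(line: str) -> SamRecord:
    """Parse one alignment line; header lines (starting with '@') are not handled."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError(f"expected at least 11 columns, got {len(cols)}")

    tags = {}
    for item in cols[11:]:
        tag, typ, value = item.split(":", 2)  # value may itself contain ':'
        if typ == "i":
            value = int(value)
        elif typ == "f":
            value = float(value)
        tags[tag] = value  # assumption: other types (Z, A, H, B) kept as strings

    return SamRecord(
        cols[0], int(cols[1]), cols[2], int(cols[3]), int(cols[4]), cols[5],
        cols[6], int(cols[7]), int(cols[8]), cols[9], cols[10], tags,
    )


# Made-up example read, not taken from any real data set.
example = "read1\t16\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tRG:Z:grp1"
rec = parse_sam_line(example)
print(rec.rname, rec.pos, rec.is_reverse, rec.tags)
```

Even this simplified parser has to deal with typed tags and bitwise flags, and it ignores headers, BAM's binary encoding and compression entirely, which is the kind of detail the SAMtools utilities handle on the user's behalf.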
Because third-party projects were trying to use code from SAMtools even though it was not designed to be embedded in that way, the decision was taken in August 2014 to split the SAMtools package into three parts: a stand-alone software library with a well-defined API (HTSlib),[4] a project for variant calling and manipulation of variant data (BCFtools), and the stand-alone SAMtools package for working with sequence alignment data.[5]