FASTA format

Filename extensions: .fasta, .fas, .fa, .fna, .ffn, .faa, .mpfa, .frn
Internet media type: text/x-fasta
Uniform Type Identifier (UTI): no
Developed by: David J. Lipman, William R. Pearson[1][2]
Initial release: 1985
Type of format: Bioinformatics
Extended from: ASCII for FASTA
Extended to: FASTQ format[3]
Website: www.ncbi.nlm.nih.gov/BLAST/fasta.shtml

In bioinformatics and biochemistry, the FASTA format is a text-based format for representing nucleotide sequences or amino acid (protein) sequences, with each nucleotide or amino acid written as a single-letter code.

The format allows sequence names and comments to precede the sequences. It originated from the FASTA software package and has since become a near-universal standard in bioinformatics.[4]
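
As a rough illustration of this layout, the sketch below builds one record: a description line beginning with ">" that carries the name and an optional comment, followed by the sequence split across several lines. The helper name format_fasta and the 60-character line width are assumptions chosen for the example, not requirements of the format.

```python
def format_fasta(name, sequence, comment="", width=60):
    """Return one FASTA record as a string (hypothetical helper)."""
    # Description line: '>' followed by the name and an optional comment.
    header = f">{name} {comment}".rstrip()
    # Split the sequence into fixed-width lines; the width is only a convention.
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([header, *lines]) + "\n"


# An 80-letter nucleotide sequence, wrapped onto two sequence lines.
record = format_fasta("seq1", "ACGT" * 20, comment="example nucleotide sequence")
print(record, end="")
```

The name and comment share the description line, so a reader of the file sees them before the sequence they describe.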

The simplicity of FASTA format makes it easy to manipulate and parse sequences using text-processing tools and scripting languages.
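
To show how little code such parsing can take, here is a minimal sketch in Python. It assumes each record starts with a description line beginning with ">" and that the remaining lines of the record hold the sequence. The function name parse_fasta and the sample data are made up for illustration, and the sketch does not validate the sequence letters.

```python
def parse_fasta(lines):
    """Yield (description, sequence) pairs from an iterable of text lines."""
    description, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                yield description, "".join(chunks)
            description, chunks = line[1:].strip(), []
        elif description is not None:
            # Sequence lines are joined back into one string.
            chunks.append(line)
    if description is not None:
        yield description, "".join(chunks)


# Illustrative input: one nucleotide record and one protein record.
example = """>seq1 first example
ACGTACGT
ACGT
>seq2
MKV
"""
records = list(parse_fasta(example.splitlines()))
print(records)
# [('seq1 first example', 'ACGTACGTACGT'), ('seq2', 'MKV')]
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, which avoids loading a large sequence file into memory at once.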

  1. Lipman DJ, Pearson WR (March 1985). "Rapid and sensitive protein similarity searches". Science. 227 (4693): 1435–41. Bibcode:1985Sci...227.1435L. doi:10.1126/science.2983426. PMID 2983426.
  2. Pearson WR, Lipman DJ (April 1988). "Improved tools for biological sequence comparison". Proceedings of the National Academy of Sciences of the United States of America. 85 (8): 2444–8. Bibcode:1988PNAS...85.2444P. doi:10.1073/pnas.85.8.2444. PMC 280013. PMID 3162770.
  3. Cite error: The named reference fastq was invoked but never defined.
  4. "What is FASTA format?". Zhang Lab. Archived from the original on 2022-12-04. Retrieved 2022-12-04.