RefSeq

Refseq
Content
Description	curated non-redundant sequence database of genomes.
Contact
Research center	National Center for Biotechnology Information
Primary citation	Pruitt KD & al. (2005)
Access
Website	https://www.ncbi.nlm.nih.gov/RefSeq

The Reference Sequence (RefSeq) database^[1] is an open access, annotated and curated collection of publicly available nucleotide sequences (DNA, RNA) and their protein products. RefSeq was introduced in 2000.^[2]^[3] This database is built by National Center for Biotechnology Information (NCBI), and, unlike GenBank, provides only a single record for each natural biological molecule (i.e. DNA, RNA or protein) for major organisms ranging from viruses to bacteria to eukaryotes.

For each model organism, RefSeq aims to provide separate and linked records for the genomic DNA, the gene transcripts, and the proteins arising from those transcripts. RefSeq is limited to major organisms for which sufficient data are available (121,461 distinct "named" organisms as of July 2022),^[4] while GenBank includes sequences for any organism submitted (approximately 504,000 formally described species).^[5]

^ ^a ^b Pruitt KD, Tatusova T, Maglott DR (January 2005). "NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins". Nucleic Acids Research. 33 (Database issue): D501–D504. doi:10.1093/nar/gki025. PMC 539979. PMID 15608248.
^ Maglott DR, Katz KS, Sicotte H, Pruitt KD (January 2000). "NCBI's LocusLink and RefSeq". Nucleic Acids Research. 28 (1): 126–128. doi:10.1093/nar/28.1.126. PMC 102393. PMID 10592200.
^ Pruitt KD, Katz KS, Sicotte H, Maglott DR (January 2000). "Introducing RefSeq and LocusLink: curated human genome resources at the NCBI". Trends in Genetics. 16 (1): 44–47. doi:10.1016/s0168-9525(99)01882-x. PMID 10637631.
^ RefSeq Release 213 Statistics (Report). National Library of Medicine. 11 July 2022. Retrieved 20 July 2022.
^ Sayers EW, Cavanaugh M, Clark K, Pruitt KD, Schoch CL, Sherry ST, Karsch-Mizrachi I (January 2022). "GenBank". Nucleic Acids Research. 50 (D1): D161–D164. doi:10.1093/nar/gkab1135. PMC 8690257. PMID 34850943.

[pmid15608248-1] Pruitt KD, Tatusova T, Maglott DR (January 2005). "NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins". Nucleic Acids Research. 33 (Database issue): D501–D504. doi:10.1093/nar/gki025. PMC 539979. PMID 15608248.

[2] Maglott DR, Katz KS, Sicotte H, Pruitt KD (January 2000). "NCBI's LocusLink and RefSeq". Nucleic Acids Research. 28 (1): 126–128. doi:10.1093/nar/28.1.126. PMC 102393. PMID 10592200.

[pmid10637631-3] Pruitt KD, Katz KS, Sicotte H, Maglott DR (January 2000). "Introducing RefSeq and LocusLink: curated human genome resources at the NCBI". Trends in Genetics. 16 (1): 44–47. doi:10.1016/s0168-9525(99)01882-x. PMID 10637631.

[:0-4] RefSeq Release 213 Statistics (Report). National Library of Medicine. 11 July 2022. Retrieved 20 July 2022.

[5] Sayers EW, Cavanaugh M, Clark K, Pruitt KD, Schoch CL, Sherry ST, Karsch-Mizrachi I (January 2022). "GenBank". Nucleic Acids Research. 50 (D1): D161–D164. doi:10.1093/nar/gkab1135. PMC 8690257. PMID 34850943.

[1]

[2]

[3]

[4]

[5]