GenBank

Content
Description: Nucleotide sequences for more than 300,000 organisms with supporting bibliographic and biological annotation.
Data types captured:
  • Nucleotide sequence
  • Protein sequence
Organisms: All
Contact
Research center: NCBI
Primary citation: PMID 21071399
Release date: 1982
Access
Website: NCBI
Download URL: NCBI FTP
Tools
Web: BLAST
Standalone: BLAST
Miscellaneous
License: Unclear[1]

The GenBank sequence database is an open access, annotated collection of all publicly available nucleotide sequences and their protein translations. It is produced and maintained by the National Center for Biotechnology Information (NCBI; a part of the National Institutes of Health in the United States) as part of the International Nucleotide Sequence Database Collaboration (INSDC).

GenBank and its collaborators receive sequences produced in laboratories throughout the world from more than 500,000 formally described species.[2] The database was started in 1982 by Walter Goad and Los Alamos National Laboratory. GenBank has become an important database for research in biological fields and has grown in recent years at an exponential rate, doubling roughly every 18 months.[3][4]
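To make the quoted doubling time concrete, the following minimal sketch computes the growth factor implied by a fixed 18-month doubling period. It assumes idealized exponential growth; the function name `growth_factor` and its parameters are illustrative and do not come from NCBI or the cited sources. Under this assumption, the database would grow by a factor of 2^(12/18), or roughly 59% per year.

```python
def growth_factor(months: float, doubling_months: float = 18.0) -> float:
    """Return the factor by which a quantity grows after `months`,
    assuming it doubles every `doubling_months` (idealized model)."""
    return 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    # Illustrative horizons only; actual GenBank growth varies by release.
    for years in (1, 3, 10):
        print(f"{years:>2} years: x{growth_factor(12 * years):.2f}")
```

The same function can be inverted by hand: a quantity that doubles every 18 months quadruples in three years, which is a quick sanity check against successive release notes.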

Release 250.0, published in June 2022, contained over 17 trillion nucleotide bases in more than 2.45 billion sequences.[5] GenBank is built from direct submissions by individual laboratories, as well as bulk submissions from large-scale sequencing centers.

  1. ^ The download page at UCSC says "NCBI places no restrictions on the use or distribution of the GenBank data. However, some submitters may claim patent, copyright, or other intellectual property rights in all or a portion of the data they have submitted. NCBI is not in a position to assess the validity of such claims, and therefore cannot provide comment or unrestricted permission concerning the use, copying, or distribution of the information contained in GenBank."
  2. ^ Eric W Sayers; Mark Cavanaugh; Karen Clark; Kim D Pruitt; Conrad L Schoch; Stephen T Sherry; Ilene Karsch-Mizrachi (7 January 2022). "GenBank". Nucleic Acids Research. 50 (D1): D161–D164. doi:10.1093/nar/gkab1135. PMC 8690257. PMID 34850943.
  3. ^ Benson, D.; Karsch-Mizrachi, I.; Lipman, D. J.; Ostell, J.; Wheeler, D. L.; et al. (2008). "GenBank". Nucleic Acids Research. 36 (Database): D25–D30. doi:10.1093/nar/gkm929. PMC 2238942. PMID 18073190.
  4. ^ Benson, D.; Karsch-Mizrachi, I.; Lipman, D. J.; Ostell, J.; Sayers, E. W.; et al. (2009). "GenBank". Nucleic Acids Research. 37 (Database): D26–D31. doi:10.1093/nar/gkn723. PMC 2686462. PMID 18940867.
  5. ^ "GenBank release notes (Release 250)". NCBI. 15 June 2022. Retrieved 20 July 2022.