Reference genome

The first printout of the human reference genome presented as a series of books, displayed at the Wellcome Collection, London

A reference genome (also known as a reference assembly) is a digital nucleic acid sequence database, assembled by scientists as a representative example of the set of genes in one idealized individual organism of a species. Because they are assembled from the sequenced DNA of a number of individual donors, reference genomes do not accurately represent the set of genes of any single individual organism. Instead, a reference provides a haploid mosaic of different DNA sequences from each donor. For example, one of the most recent human reference genomes, assembly GRCh38/hg38, is derived from more than 60 genomic clone libraries.[1] There are reference genomes for multiple species of viruses, bacteria, fungi, plants, and animals. New genomes are typically assembled using a reference genome as a guide, which makes assembly much faster and cheaper than it was for the initial Human Genome Project. Reference genomes can be accessed online at several locations, using dedicated browsers such as Ensembl or the UCSC Genome Browser.[2]
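
The idea of a haploid mosaic can be illustrated with a small sketch. The donor names, sequences, and tiling path below are invented for illustration and are not real genomic data; the helper functions build_mosaic and mismatches are likewise hypothetical. The sketch only shows why a sequence stitched together from segments of several donors need not match any one donor exactly.

```python
# Toy illustration only: donor names and sequences are invented,
# not real genomic data.
donors = {
    "donor_A": "ACGTACGTACGT",
    "donor_B": "ACGAACGTTCGT",
    "donor_C": "ACGTACCTACGA",
}

# Hypothetical tiling path: (donor, start, end) half-open intervals
# that together cover the whole region exactly once.
tiling_path = [
    ("donor_A", 0, 4),
    ("donor_B", 4, 8),
    ("donor_C", 8, 12),
]


def build_mosaic(donors, tiling_path):
    """Concatenate the chosen segment from each donor into one haploid sequence."""
    return "".join(donors[name][start:end] for name, start, end in tiling_path)


def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


reference = build_mosaic(donors, tiling_path)

# Compare the mosaic against each donor's full sequence.
differences = {name: mismatches(reference, seq) for name, seq in donors.items()}

print(reference)
print(differences)
```

In this toy example every donor differs from the mosaic at one or more positions, which mirrors the point that a reference assembled from several donors does not reproduce the sequence of any single individual.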

  1. ^ "How many individuals were sequenced for the human reference genome assembly?". Genome Reference Consortium. Retrieved 7 April 2022.
  2. ^ Flicek P, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, et al. (January 2008). "Ensembl 2008". Nucleic Acids Research. 36 (Database issue): D707–D714. doi:10.1093/nar/gkm988. PMC 2238821. PMID 18000006.