The process of describing the structure and function of a genome
In molecular biology and genetics, DNA annotation or genome annotation is the process of describing the structure and function of the components of a genome,[2] by analyzing and interpreting them in order to extract their biological significance and understand the biological processes in which they participate.[3] Among other things, it identifies the locations of genes and all the coding regions in a genome and determines what those genes do.[4]
Annotation is performed after a genome is sequenced and assembled, and is a necessary step in genome analysis before the sequence is deposited in a database and described in a published article. Although a description of individual genes and their products or functions is sufficient to count as an annotation, the depth of analysis reported in the literature for different genomes varies widely, with some reports including additional information that goes beyond a simple annotation.[5] Furthermore, due to the size and complexity of sequenced genomes, DNA annotation is not performed manually, but is instead automated by computational means. However, the conclusions drawn from the obtained results require manual expert analysis.[6]
DNA annotation is classified into two categories: structural annotation, which identifies and demarcates elements in a genome, and functional annotation, which assigns functions to these elements.[7] This is not the only classification in use: several alternatives, such as dimension-based[8] and level-based classifications,[3] have also been proposed.
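The distinction between the two categories can be illustrated with a deliberately simplified sketch. It is a toy example rather than a description of any real annotation pipeline: the `Feature` class, the functions `find_orfs` and `assign_functions`, and the lookup table of known sequences are all hypothetical names introduced here for illustration. The structural step only scans the forward strand for open reading frames (a start codon ATG followed in frame by a stop codon), and the functional step is reduced to an exact-match dictionary lookup, whereas real tools typically rely on statistical gene models and similarity searches.

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass
class Feature:
    start: int                 # 0-based, inclusive
    end: int                   # 0-based, exclusive (includes the stop codon)
    kind: str                  # type of element, e.g. "ORF"
    function: str = "unknown"  # filled in by the functional step


def find_orfs(seq, min_codons=2):
    """Structural step (toy): locate ATG...stop ORFs on the forward strand."""
    seq = seq.upper()
    features = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Scan in frame for the first stop codon after the start codon.
            j = i + 3
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no stop codon left in this frame
            # Keep the ORF only if it has enough codons before the stop.
            if (j - i) // 3 >= min_codons:
                features.append(Feature(i, j + 3, "ORF"))
            i = j + 3
    return features


def assign_functions(seq, features, known):
    """Functional step (toy): label each feature via exact-match lookup."""
    for f in features:
        f.function = known.get(seq[f.start:f.end].upper(), "unknown")
    return features


if __name__ == "__main__":
    genome = "CCATGAAATGACC"                      # made-up sequence
    known = {"ATGAAATGA": "toy example protein"}  # made-up lookup table
    for feature in assign_functions(genome, find_orfs(genome), known):
        print(feature)
```

Keeping the two steps as separate functions mirrors the classification above: the first produces coordinates without any biological interpretation, and the second attaches an interpretation to elements that have already been located.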