Copy number variation (CNV) is a phenomenon in which sections of the genome are repeated and the number of repeats varies between individuals.[1] Copy number variation is a type of structural variation: specifically, it is a duplication or deletion event that affects a considerable number of base pairs.[2] Approximately two-thirds of the entire human genome may be composed of repeats,[3] and 4.8–9.5% of the human genome can be classified as copy number variations.[4] In mammals, copy number variations play an important role in generating necessary variation in the population as well as in disease phenotypes.[1]
Copy number variations can generally be categorized into two main groups: short repeats and long repeats. However, there are no clear boundaries between the two groups, and the classification depends on the nature of the loci of interest. Short repeats include mainly dinucleotide repeats (two repeating nucleotides, e.g. A-C-A-C-A-C...) and trinucleotide repeats. Long repeats include repeats of entire genes. Classifying repeats by size is the most obvious approach, because size is an important factor in examining the mechanisms that most likely gave rise to the repeats,[5] and hence the likely effects of these repeats on phenotype.
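As a rough illustration of what "copy number" means for a short repeat, the following Python sketch counts the longest run of back-to-back copies of a motif in a DNA string. The function name `max_tandem_copies` and the two example sequences are hypothetical, invented only for this illustration; real copy number analysis works on sequencing or array data and is considerably more involved.

```python
import re


def max_tandem_copies(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`.

    Illustrative helper only (not taken from any library).
    """
    if not motif:
        raise ValueError("motif must be non-empty")
    # Match one or more back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    run_lengths = (len(m.group(0)) // len(motif) for m in pattern.finditer(sequence))
    return max(run_lengths, default=0)


# Hypothetical sequences for two individuals at the same locus:
# identical flanking bases, different numbers of the dinucleotide repeat A-C.
individual_a = "GGT" + "AC" * 3 + "TTG"
individual_b = "GGT" + "AC" * 7 + "TTG"

copies_a = max_tandem_copies(individual_a, "AC")
copies_b = max_tandem_copies(individual_b, "AC")
print(copies_a, copies_b)
```

The same function applies to trinucleotide repeats by passing a three-base motif. It would not scale to long repeats such as whole genes, which are usually not exact copies and cannot be found by exact string matching.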