Retrozymes are a family of retrotransposons first discovered in the genomes of plants[1] but now also known in the genomes of animals.[2] Retrozymes contain a hammerhead ribozyme (HHR) in their sequences, and the name retrozyme is accordingly a combination of retrotransposon and hammerhead ribozyme. They do not possess any coding regions. Retrozymes are therefore nonautonomous retroelements that borrow proteins from other elements to move into new regions of a genome. Retrozymes are actively transcribed into covalently closed circular RNAs (circRNAs or cccRNAs) and are detected in both polarities, which may indicate the use of rolling circle replication in their lifecycle.[3]
In plants, the genomic structure of a retrozyme consists of a central non-coding region of about 300–600nt flanked by long terminal repeats (LTRs) of about 300–400nt that contain the HHR motif. Retrozymes also carry two sequences needed to prime DNA synthesis during mobilization: a primer binding site (PBS) complementary to the tRNA-Met sequence and a poly-purine tract (PPT). The most distinguishing feature of the retrozyme compared with other elements of plant genomes is the hammerhead ribozyme. Otherwise, retrozymes resemble other known elements of plant genomes, such as terminal-repeat retrotransposons in miniature (TRIMs) and small LTR retrotransposons (SMARTs). The PBS, PPT, and HHR motif are the only parts of retrozyme sequences that show conservation and homology.[4] Currently, retrozymes are thought to have evolved from a large retrotransposon family known across many eukaryotes as the Penelope-like elements (PLEs). Retrozymes share a number of peculiar features with PLEs, including a type I HHR, occurrence as tandem copies, and co-existence in all analyzed metazoans to date.[2][4]
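The plant layout described above (repeat, central region, repeat, plus the PBS, PPT, and HHR motifs) can be written down as a small data structure. The following Python sketch is purely illustrative: the class name `PlantRetrozymeLayout`, its fields, and the choice to treat the approximate length ranges from the text as hard bounds are all assumptions made for this example, not part of any published annotation tool.

```python
from dataclasses import dataclass

# Approximate ranges quoted in the text (in nucleotides).
# Assumption: they are treated here as inclusive bounds for illustration only.
LTR_RANGE = (300, 400)
CENTRAL_RANGE = (300, 600)


@dataclass(frozen=True)
class PlantRetrozymeLayout:
    """Hypothetical record for one annotated genomic retrozyme copy."""
    ltr_length: int        # length of each long terminal repeat
    central_length: int    # length of the central non-coding region
    has_pbs: bool          # primer binding site present
    has_ppt: bool          # poly-purine tract present
    has_hhr_in_ltr: bool   # hammerhead ribozyme motif found in the repeats

    def genomic_length(self) -> int:
        # Repeat + central region + repeat.
        return 2 * self.ltr_length + self.central_length

    def matches_description(self) -> bool:
        # True only if the lengths fall in the quoted ranges and all
        # three conserved motifs are present.
        lengths_ok = (
            LTR_RANGE[0] <= self.ltr_length <= LTR_RANGE[1]
            and CENTRAL_RANGE[0] <= self.central_length <= CENTRAL_RANGE[1]
        )
        return lengths_ok and self.has_pbs and self.has_ppt and self.has_hhr_in_ltr


# Made-up example values, not taken from a real genome.
example = PlantRetrozymeLayout(
    ltr_length=300, central_length=400,
    has_pbs=True, has_ppt=True, has_hhr_in_ltr=True,
)
print(example.genomic_length())       # 1000
print(example.matches_description())  # True
```

Because the quoted lengths are approximate, a real screen would likely use looser bounds and would need a separate motif search step to set the three boolean fields.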
Known retrozymes range in size from 170nt to 1,116nt. Smaller retrozymes are typically found in invertebrates, such as a 300nt retrozyme in the genome of the Mediterranean mussel (Mytilus galloprovincialis). The largest known retrozyme, at 1,116nt, was discovered in the genome of a strain of Jatropha curcas.[5]
Presently, the only database for retrozymes and similar elements is ViroidDB, which contains sequences of 73 retrozymes taken from the National Center for Biotechnology Information nucleotide database.[6] Because retrozymes currently have no taxonomic classification, their sequences were initially located and downloaded from GenBank directly and separately.[6] Some methods have been developed to study retrozymes in the laboratory.[7]