Micropeptides (also referred to as microproteins) are polypeptides fewer than 100-150 amino acids in length that are encoded by short open reading frames (sORFs).[1][2][3] In this respect, they differ from many other active small polypeptides, which are produced through the post-translational cleavage of larger polypeptides.[1][4] In terms of size, micropeptides are considerably shorter than "canonical" proteins, which have an average length of 330 and 449 amino acids in prokaryotes and eukaryotes, respectively.[5] Micropeptides are sometimes named according to their genomic location. For example, the translated product of an upstream open reading frame (uORF) might be called a uORF-encoded peptide (uPEP).[6] Micropeptides lack an N-terminal signal sequence, suggesting that they are likely to be localized to the cytoplasm.[1] However, some micropeptides have been found in other cell compartments, as indicated by the existence of transmembrane micropeptides.[7][8] Micropeptides are found in both prokaryotes and eukaryotes.[1][9][10] The sORFs from which micropeptides are translated can be located in 5' UTRs, small genes, or polycistronic mRNAs. Some micropeptide-coding genes were originally mis-annotated as long non-coding RNAs (lncRNAs).[11]
Given their small size, sORFs were originally overlooked. However, hundreds of thousands of putative micropeptides have since been identified through various techniques in a multitude of organisms. Only a small fraction of these putative micropeptides have had their expression and function confirmed. Those that have been functionally characterized generally have roles in cell signaling, organogenesis, and cellular physiology. As more micropeptides are discovered, more of their functions are being uncovered. One regulatory function is that of peptoswitches, which, when activated directly or indirectly by small molecules, stall ribosomes and thereby inhibit expression of downstream coding sequences.[11]
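The length-based definition of an sORF given above can be illustrated with a minimal Python sketch that lists candidate sORFs in a DNA sequence. It is an illustration under stated assumptions, not one of the identification techniques the sources describe: the function name `find_sorfs`, the 100-residue cutoff (taken from the lower end of the 100-150 range), the restriction to ATG start codons, the forward-strand-only scan, and the standard genetic code are all choices made here for simplicity. A hit from such a scan is only a putative sORF; it says nothing about whether the peptide is actually expressed.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def find_sorfs(dna, max_aa=100, min_aa=2):
    """Return (start, end, peptide) for each candidate sORF.

    Hypothetical helper: scans the three forward reading frames for
    ATG ... stop stretches whose peptide has between min_aa and max_aa
    residues. `end` is the index just past the stop codon.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        pos = frame
        while pos + 3 <= len(dna):
            if dna[pos:pos + 3] != "ATG":
                pos += 3
                continue
            peptide = []
            end = pos
            stopped = False
            while end + 3 <= len(dna):
                # Codons with ambiguous bases are translated as "X".
                aa = CODON_TABLE.get(dna[end:end + 3], "X")
                end += 3
                if aa == "*":
                    stopped = True
                    break
                peptide.append(aa)
            # Keep only ORFs that end in a stop codon and are short enough.
            if stopped and min_aa <= len(peptide) <= max_aa:
                hits.append((pos, end, "".join(peptide)))
            # Resume after this ORF; nested in-frame ATGs are not reported.
            pos = end
    return hits


if __name__ == "__main__":
    for start, end, peptide in find_sorfs("CCATGAAATGGTAACC"):
        print(start, end, peptide)
```

Raising `max_aa` to 150 covers the upper end of the size range, and running the same function on the reverse complement would extend the scan to the other strand.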