TATA box

Figure 1. TATA box structural elements. The TATA box consensus sequence is TATAWAW, where W is either A or T.

In molecular biology, the TATA box (also called the Goldberg–Hogness box)[1] is a sequence of DNA found in the core promoter region of genes in archaea and eukaryotes.[2] The bacterial homolog of the TATA box is called the Pribnow box, which has a shorter consensus sequence.
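As a rough illustration, the consensus TATAWAW from Figure 1 can be written as a regular expression, since the IUPAC code W stands for A or T. The sketch below is only a naive string search: the function name `find_tata_boxes` and the example sequence are hypothetical, and a sequence match alone does not establish that a site is a functional TATA box.

```python
import re

# IUPAC code W means "A or T", so TATAWAW becomes TATA[AT]A[AT].
# The lookahead lets overlapping matches be reported as well.
TATA_CONSENSUS = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(sequence):
    """Return (0-based start, matched 7-mer) for each consensus match.

    Hypothetical helper for illustration; it scans one strand only.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in TATA_CONSENSUS.finditer(seq)]


# Made-up sequence containing two consensus matches.
example = "GGCCTATAAAAGGCGCTATATATGC"
print(find_tata_boxes(example))
# [(4, 'TATAAAA'), (16, 'TATATAT')]
```

A single base change inside a match (for example TATAAAA to TAGAAAA) no longer fits the pattern, which is one simple way to see how point mutations can disrupt the consensus.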

The TATA box is a non-coding DNA sequence that acts as a cis-regulatory element. It was termed the "TATA box" because it contains a consensus sequence characterized by repeating T and A base pairs.[3] How the term "box" originated is unclear. In the 1980s, researchers investigating nucleotide sequences in mouse genome loci found the Hogness box sequence and "boxed it in" at the -31 position.[4] When they compared consensus nucleotides with alternative ones, they "boxed" the homologous regions.[4] This practice of boxing in sequences may explain the origin of the term "box".

The TATA box was first identified in 1978[1] as a component of eukaryotic promoters. In TATA-containing genes, the TATA box directs where transcription is initiated. It is the binding site of the TATA-binding protein (TBP) and other transcription factors in some eukaryotic genes. Gene transcription by RNA polymerase II depends on the regulation of the core promoter by long-range regulatory elements such as enhancers and silencers.[5] Without proper regulation of transcription, eukaryotic organisms would not be able to respond appropriately to their environment.

Because the TATA box is a defined consensus sequence with a specific role in transcription initiation, mutations to it, such as insertions, deletions, and point mutations, can result in phenotypic changes, which in some cases manifest as disease. Diseases and conditions associated with mutations in the TATA box include gastric cancer, spinocerebellar ataxia, Huntington's disease, blindness, β-thalassemia, immunosuppression, Gilbert's syndrome, and HIV-1 infection. The TATA-binding protein (TBP) can also be targeted by viruses as a means of driving viral transcription.[6]

  1. Lifton RP, Goldberg ML, Karp RW, Hogness DS (1978). "The organization of the histone genes in Drosophila melanogaster: functional and evolutionary implications". Cold Spring Harbor Symposia on Quantitative Biology. 42 (2): 1047–51. doi:10.1101/sqb.1978.042.01.105. PMID 98262.
  2. Smale ST, Kadonaga JT (2003). "The RNA polymerase II core promoter". Annual Review of Biochemistry. 72: 449–79. doi:10.1146/annurev.biochem.72.121801.161520. PMID 12651739.
  3. (Reference missing: the named reference ":4" was never defined.)
  4. Ohshima Y, Okada N, Tani T, Itoh Y, Itoh M (October 1981). "Nucleotide sequences of mouse genomic loci including a gene or pseudogene for U6 (4.8S) nuclear RNA". Nucleic Acids Research. 9 (19): 5145–58. doi:10.1093/nar/9.19.5145. PMC 327505. PMID 6171774.
  5. (Reference missing: the named reference ":16" was never defined.)
  6. Mainz D, Quadt I, Stranzenbach AK, Voss D, Guarino LA, Knebel-Mörsdorf D (June 2014). "Expression and nuclear localization of the TATA-box-binding protein during baculovirus infection". The Journal of General Virology. 95 (Pt 6): 1396–407. doi:10.1099/vir.0.059949-0. PMID 24676420. S2CID 33480957.