11.1: Transcription
Unlike DNA synthesis, which occurs only during the S phase of the cell cycle, transcription and translation are continuous processes within the cell. The 5ʼ to 3ʼ strand of a DNA sequence functions as the coding (nontemplate) strand for transcription: the transcribed product is identical to the coding strand, except that uracil replaces thymine (figure 11.1). The transcribed mRNA then serves as the template for protein translation.
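The relationship between the coding strand, the template strand, and the transcript can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example sequence are hypothetical, and sequences are treated as plain strings of A, C, G, and T.

```python
# Base-pairing tables (DNA:DNA and DNA template:RNA).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe_from_coding(coding_5to3: str) -> str:
    """Return the mRNA (5'->3'): the coding strand with U in place of T."""
    return coding_5to3.upper().replace("T", "U")


def template_from_coding(coding_5to3: str) -> str:
    """Return the template strand written 3'->5', aligned base by base
    with the coding strand."""
    return "".join(DNA_COMPLEMENT[base] for base in coding_5to3.upper())


def transcribe_from_template(template_3to5: str) -> str:
    """Build the mRNA (5'->3') by pairing each template base with its
    RNA complement, which is what the polymerase effectively does."""
    return "".join(TEMPLATE_TO_RNA[base] for base in template_3to5.upper())


# Hypothetical example sequence, not taken from the text.
coding = "ATGGCCTTAGAT"
template = template_from_coding(coding)
print(transcribe_from_coding(coding))      # AUGGCCUUAGAU
print(transcribe_from_template(template))  # AUGGCCUUAGAU
```

Both routes give the same transcript, which is why the mRNA can be read directly off the coding strand even though the polymerase copies the template strand.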
Gene structure
The chromosome is organized into functional units called genes. Each gene occupies a specific location on a chromosome and is composed of a transcribed region and a regulatory (or promoter) region. The transcribed region is typically (but not always) downstream of the transcriptional start and contains the following DNA elements: a 5ʼ cap site (required for maturation of mRNA), the translational start (AUG), introns and exons, and the polyadenylation site (figure 11.2).
The regulatory or promoter region is upstream of the transcriptional start and contains regulatory elements such as:
- TATA box, which provides an accessible region for the DNA to begin to unwind, allowing for access by the transcriptional machinery, and
- CAAT or GC box and enhancers or repressors (for eukaryotic transcription), which help modulate the amount of transcript produced in any given cell.
In eukaryotes, each gene is regulated independently, and its mRNA typically encodes a single gene product. This is in contrast to prokaryotes, which often regulate genes in an operon structure in which one polycistronic mRNA encodes multiple protein products.
Types of RNA polymerase
RNA polymerase I is located in the nucleolus, a specialized nuclear substructure in which ribosomal RNA (rRNA) is transcribed, processed, and assembled into ribosomes. RNA polymerase I synthesizes all of the rRNAs except the 5S rRNA, transcribing the tandemly duplicated set of 18S, 5.8S, and 28S ribosomal genes. (Note that the “S” designation refers to “Svedberg” units, a nonadditive value that characterizes the speed at which a particle sediments during centrifugation.)
RNA polymerase II is located in the nucleus and synthesizes all protein-coding nuclear pre-mRNAs. Eukaryotic pre-mRNAs undergo extensive processing after transcription but before translation. RNA polymerase II is responsible for transcribing the overwhelming majority of eukaryotic genes.
RNA polymerase III is also located in the nucleus. This polymerase transcribes a variety of structural RNAs, including the 5S pre-rRNA, transfer pre-RNAs (pre-tRNAs), and small nuclear pre-RNAs. The tRNAs have a critical role in translation; they serve as the “adaptor molecules” between the mRNA template and the growing polypeptide chain. Small nuclear RNAs have a variety of functions, including “splicing” pre-mRNAs and regulating transcription factors.
RNA polymerase | Cellular compartment | Product of transcription | \(\alpha\)-Amanitin sensitivity |
---|---|---|---|
I | Nucleolus | All rRNAs except 5S rRNA | Insensitive |
II | Nucleus | All protein-coding nuclear pre-mRNAs | Extremely sensitive |
III | Nucleus | 5S rRNA, tRNAs, and small nuclear RNAs | Moderately sensitive |
Table 11.1: Locations, products, and sensitivities of the three eukaryotic RNA polymerases.
Transcription
Initiation
Eukaryotes assemble a complex of transcription factors to recruit RNA polymerase II to a protein-coding gene.
Transcription factors that bind to the promoter are called basal transcription factors. These basal factors are all named TFII (for transcription factor/polymerase II) plus an additional letter (A–J). The core complex is TFIID, which includes the TATA-binding protein (TBP). The other transcription factors fall into place on the DNA template in sequence, with each one further stabilizing the pre-initiation complex and contributing to the recruitment of RNA polymerase II (figure 11.3).
Some eukaryotic promoters also have a conserved CAAT box (GGCCAATCT) at approximately position -80. Further upstream of the TATA box, eukaryotic promoters may also contain one or more GC-rich boxes (GGCG) or octamer boxes (ATTTGCAT). These elements bind cellular factors that increase the efficiency of transcription initiation and are often found in more “active” genes that are constantly expressed by the cell. Other regulatory elements within the promoter region are discussed in section 12.1.
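To make the idea of positioned promoter elements concrete, the following minimal sketch scans an upstream sequence for the consensus motifs named above and reports each hit relative to the transcription start (+1). The function name, the synthetic test sequence, and the TATA box consensus `TATAAA` are assumptions added for illustration; the text does not give a TATA sequence, and real promoters tolerate mismatches that exact string matching will miss.

```python
# Consensus motifs. The first three are quoted from the text; the TATA box
# consensus is an assumption added for illustration.
MOTIFS = {
    "CAAT box": "GGCCAATCT",
    "GC box": "GGCG",
    "octamer box": "ATTTGCAT",
    "TATA box": "TATAAA",  # assumed consensus, not given in the text
}


def find_promoter_elements(upstream: str) -> list[tuple[str, int]]:
    """Find exact motif matches in an upstream sequence (5'->3') whose last
    base sits at position -1, immediately before the +1 start site.

    Returns (motif name, position of the motif's first base), sorted from
    the most distal hit to the most proximal one.
    """
    seq = upstream.upper()
    hits = []
    for name, motif in MOTIFS.items():
        start = seq.find(motif)
        while start != -1:
            hits.append((name, start - len(seq)))  # index -> negative position
            start = seq.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Synthetic 89-base upstream region; N is a placeholder for "any base".
upstream = "GGCG" + "N" * 5 + "GGCCAATCT" + "N" * 41 + "TATAAA" + "N" * 24
for name, position in find_promoter_elements(upstream):
    print(f"{name} at {position}")
```

With this synthetic input the CAAT box is reported at -80, matching the approximate position given in the text, with the GC box further upstream and the assumed TATA box closer to the start site.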
Elongation
Following the formation of the pre-initiation complex, the polymerase is released from the other transcription factors, and elongation is allowed to proceed with the polymerase synthesizing pre-mRNA in the 5′ to 3′ direction.
Termination
The termination of transcription differs among the polymerases. Unlike in prokaryotes, elongation by RNA polymerase II in eukaryotes continues 1,000 to 2,000 nucleotides beyond the end of the gene being transcribed. This pre-mRNA tail is subsequently removed by cleavage during mRNA processing. In contrast, RNA polymerases I and III require termination signals. Genes transcribed by RNA polymerase I contain a specific eighteen-nucleotide sequence that is recognized by a termination protein. Termination by RNA polymerase III involves an RNA hairpin similar to the one used in rho-independent termination of transcription in prokaryotes.
Types of RNA
RNA is found in three major forms in the cell (mRNA, tRNA, and rRNA), and each is used for a specific aspect of translation. Not all RNA that is transcribed is translated into a protein product; some transcribed RNA (rRNA and tRNA) is fully functional in the RNA form. mRNA (messenger RNA) is transcribed by RNA pol II.
mRNA
In eukaryotes, pre-mRNA requires maturation before it can be used in translation (figure 11.4), including:
- 5ʼ capping. A 7-methylguanosine cap is added to the 5ʼ end of the transcript. Capping, which results in the addition of two methyl groups on the 5ʼ end, is essential for both mRNA stability and translational initiation.
- Addition of a poly(A) tail. The poly(A) tail also provides mRNA stability and is important for transcriptional termination. Neither the cap nor the tail is encoded in the DNA coding regions.
- Splicing. Splicing involves removal of introns (noncoding regions) and retention of exons (coding regions).
Splicing is a complex process mediated by a large RNA-protein complex called the spliceosome, which contains both proteins and small nuclear RNAs (snRNAs). (Note that antibodies to spliceosomal snRNPs are highly specific for systemic lupus erythematosus.) Intronic sequences usually have GU at their 5′ end and AG at their 3′ end, and an adenosine (A) is typically found at the branch point within the intron. Small nuclear ribonucleoproteins (snRNPs) of the spliceosome recognize the intron-exon junctions and splice out the intron as a “lariat” structure. Splicing starts with cleavage at the 5ʼ end of the intron; the freed 5ʼ GU end is then joined to the internal branch-point adenosine, forming a looped, or lariat, intermediate. Finally, the 3ʼ end of the intron is cleaved, the intron is released as a lariat, and the upstream exon is joined to the downstream exon. Alternative splicing, in which different combinations of exons are retained, generates protein variation from a single pre-mRNA (figure 11.5).
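The net effect of splicing on the sequence (introns out, exons joined in order) can be sketched as a simple string operation. This is a deliberately simplified illustration: the function name, the example pre-mRNA, and the intron coordinates are hypothetical, the intron positions are supplied by hand rather than discovered, and the only biological rule checked is the GU...AG boundary described above.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns from a pre-mRNA and join the remaining exons.

    `introns` holds half-open (start, end) index pairs, sorted by position
    and non-overlapping. Each intron must start with GU and end with AG.
    """
    exons = []
    cursor = 0  # start of the exon currently being collected
    for start, end in introns:
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[cursor:start])  # keep the exon before this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # keep the final exon
    return "".join(exons)


# Hypothetical pre-mRNA: exon 1 + one intron (GU...AG) + exon 2.
exon1, intron, exon2 = "AUGGCU", "GUAAGUACUAACCAG", "GAUUAA"
pre_mrna = exon1 + intron + exon2
mature = splice(pre_mrna, [(len(exon1), len(exon1) + len(intron))])
print(mature)  # AUGGCUGAUUAA
```

Alternative splicing could be modeled in the same way by passing a different set of coordinates, for example a single longer "intron" that also spans an exon to be skipped.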
tRNA
tRNA, transfer RNA, is transcribed by RNA pol III, and like mRNA it requires maturation including:
- Removal of introns,
- The addition of the 3ʼ amino acid attachment site (CCA), and
- Folding into a cloverleaf-like structure.
tRNAs also typically undergo base modifications, which generate nonconventional bases that allow base-pairing with several codons. This flexibility in binding is usually due to wobble at the third position of the codon. tRNA primarily functions to bring amino acids to the ribosome during protein translation. The anticodon on the tRNA pairs with the codon on the mRNA, and this determines which amino acid is added to the growing polypeptide chain.
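Codon-anticodon pairing with wobble can be illustrated with a short sketch. The function names are hypothetical, and the wobble table is an assumption based on the standard wobble rules (including inosine, I, as an example of a modified base); it is not taken from the text. Both codon and anticodon are written 5ʼ to 3ʼ, so the first anticodon base faces the third codon base.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Assumed wobble rules: which third-position codon bases each 5' anticodon
# base can read. "I" (inosine) stands in for a modified, nonconventional base.
WOBBLE = {
    "A": {"U"},
    "C": {"G"},
    "G": {"C", "U"},
    "U": {"A", "G"},
    "I": {"A", "C", "U"},
}


def anticodon_for(codon: str) -> str:
    """Return the strict Watson-Crick anticodon (5'->3') for a codon (5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon.upper()))


def pairs_with(anticodon: str, codon: str) -> bool:
    """Check whether a 3-base anticodon can read a 3-base codon, requiring
    strict pairing at codon positions 1 and 2 and allowing wobble at 3."""
    a, c = anticodon.upper(), codon.upper()
    strict = (RNA_COMPLEMENT.get(a[2]) == c[0]
              and RNA_COMPLEMENT.get(a[1]) == c[1])
    return strict and c[2] in WOBBLE.get(a[0], set())


print(anticodon_for("AUG"))      # CAU
print(pairs_with("GAA", "UUC"))  # True: standard G-C pair at the wobble position
print(pairs_with("GAA", "UUU"))  # True: G-U wobble pair
print(pairs_with("GAA", "UUA"))  # False
```

Under these assumed rules a single anticodon such as GAA reads both UUC and UUU, which is the sense in which one tRNA can base-pair with several codons.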
rRNA
rRNA, ribosomal RNA, is transcribed by RNA pol I and III and requires maturation that is slightly different from that of mRNA and tRNA. This RNA product is not translated; instead, it is methylated and incorporated into the ribosome, where it provides structural support. The 18S rRNA is incorporated into the 40S ribosomal subunit, and the 28S, 5.8S, and 5S rRNAs are incorporated into the 60S ribosomal subunit. These subunits combine to make the full 80S ribosome required for protein translation.
References and resources
Text
Clark, M. A. Biology, 2nd ed. Houston, TX: OpenStax College, Rice University, 2018, Chapter 15: Genes and Proteins.
Karp, G., and J. G. Patton. Cell and Molecular Biology: Concepts and Experiments, 7th ed. Hoboken, NJ: John Wiley, 2013, Chapter 11: Gene Expression: From Transcription to Translation.
Le, T., and V. Bhushan. First Aid for the USMLE Step 1, 29th ed. New York: McGraw Hill Education, 2018, 39, 41–45.
Nussbaum, R. L., R. R. McInnes, H. F. Willard, A. Hamosh, and M. W. Thompson. Thompson & Thompson Genetics in Medicine, 8th ed. Philadelphia: Saunders/Elsevier, 2016, Chapter 3: The Human Genome: Gene Structure and Function.
Figures
Grey, Kindred, Figure 11.3 Transcription initiation. 2021. https://archive.org/details/11.3_20210926. CC BY 4.0.
Grey, Kindred, Figure 11.4 Overview of mRNA processing involving the removal of introns (splicing), addition of a 5’ cap and 3’ tail. 2021. https://archive.org/details/11.4_20210926. CC BY 4.0.
Grey, Kindred, Figure 11.5 Summary of mRNA splicing. 2021. https://archive.org/details/11.5_20210926. CC BY 4.0.
Lieberman M, Peet A. Figure 11.1 Co-linearity of DNA and RNA. Adapted under Fair Use from Marks' Basic Medical Biochemistry. 5th Ed. pp 277. Figure 15.3 Reading frame of messenger RNA (mRNA). 2017.
Lieberman M, Peet A. Figure 11.2 Schematic view of a eukaryotic gene structure. Adapted under Fair Use from Marks' Basic Medical Biochemistry. 5th Ed. pp 255. Figure 14.4 A schematic view of a eukaryotic gene, and steps required to produce a protein product. 2017. Added Myoglobin by AzaToth. Public domain. From Wikimedia Commons.