Transcription and RNA Processing

Gene transcription, RNA processing in eukaryotes, and gene expression


Overview

Gene expression: DNA → RNA → Protein

  • Transcription: DNA → RNA (this topic)
  • Translation: RNA → Protein (next topic)

Transcription Process

Purpose: Synthesize RNA from DNA template

Key enzyme: RNA polymerase

  • Does NOT need primer (unlike DNA polymerase)
  • Synthesizes RNA 5'→3' direction
  • Reads template strand 3'→5'

Three Stages

1. Initiation

  • Promoter: DNA sequence where RNA polymerase binds
  • TATA box: common promoter element in eukaryotes (~25 bp upstream)
  • Transcription factors help RNA polymerase bind (eukaryotes)
  • RNA polymerase unwinds DNA

2. Elongation

  • RNA polymerase moves along DNA (3'→5' on template)
  • Adds RNA nucleotides (5'→3')
  • Coding strand (non-template) has same sequence as RNA (except T→U)
  • Template strand (antisense) used to make RNA
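The pairing rules used during elongation can be sketched in a few lines of Python. This is a minimal illustration, not a model of the enzyme: the function names `transcribe` and `coding_strand` and the example sequence are assumptions made for this sketch, and the template is assumed to be written 3'→5' so the mRNA comes out 5'→3'.

```python
# Pairing rules for transcription: DNA template base -> RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the mRNA (5'->3') for a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def coding_strand(template_3to5: str) -> str:
    """Return the coding strand (5'->3'): same as the mRNA, with T instead of U."""
    return transcribe(template_3to5).replace("U", "T")


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
print("mRNA:         ", transcribe(template))
print("Coding strand:", coding_strand(template))
```

Because the template is read 3'→5' and the RNA grows 5'→3', no string reversal is needed as long as the template is written in the 3'→5' orientation.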

3. Termination

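  • Prokaryotes: terminator sequence forms a hairpin loop in the RNA (some terminators instead require the Rho protein)
  • Eukaryotes: a polyadenylation signal (AAUAAA) in the transcript triggers cleavage of the pre-mRNA
  • RNA polymerase releases the DNA and the transcript
  • RNA transcript complete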

Prokaryotic vs. Eukaryotic Transcription

| Feature | Prokaryotes | Eukaryotes |
|---------|-------------|------------|
| RNA polymerase | One type | Three types (I, II, III) |
| Promoter | -10, -35 boxes | TATA box, others |
| Processing | None | Extensive |
| Location | Cytoplasm | Nucleus |
| Coupling | Transcription + translation | Separated |

RNA Processing (Eukaryotes Only)

Primary transcript (pre-mRNA) must be processed before translation

1. 5' Cap

  • 7-methylguanosine cap added to 5' end
  • Functions:
    • Protects from degradation
    • Helps ribosome recognize mRNA
    • Aids in export from nucleus

2. 3' Poly-A Tail

  • ~50-250 adenine nucleotides added to 3' end
  • Functions:
    • Protects from degradation
    • Aids in export from nucleus
    • Promotes translation initiation (works together with the 5' cap)

3. RNA Splicing

  • Introns (intervening, non-coding sequences) removed
  • Exons (expressed sequences, which include the coding region) joined together
  • Carried out by spliceosome (snRNPs + proteins)
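Splicing can be pictured as cutting the exon segments out of the pre-mRNA and joining them in order. The sketch below assumes the exon boundaries are already known and given as 0-based, half-open `(start, end)` indices; the function name `splice` and the toy sequence are hypothetical. The real spliceosome finds these boundaries from sequence signals, which this sketch does not attempt.

```python
from typing import List, Tuple


def splice(pre_mrna: str, exons: List[Tuple[int, int]]) -> str:
    """Join exon segments, given as (start, end) half-open 0-based indices."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy pre-mRNA: exon 1 + intron + exon 2 (sequences are illustrative only).
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "UUUUAA"
pre_mrna = exon1 + intron + exon2

exon_coords = [(0, 6), (18, 24)]
mature_mrna = splice(pre_mrna, exon_coords)
print(mature_mrna)
```

The toy intron starts with GU and ends with AG, the most common intron boundary pattern.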

Alternative splicing:

  • Different combinations of exons
  • One gene → multiple proteins
  • Increases protein diversity
  • An estimated ~95% of human multi-exon genes are alternatively spliced
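To see how one gene can give several mRNAs, the sketch below enumerates exon combinations under a deliberately simplified model: the first and last exons are always kept, each internal exon may be included or skipped, and exon order never changes. The function name `isoforms` and the exon labels are assumptions for illustration. Real alternative splicing is regulated and does not produce every combination.

```python
from itertools import combinations
from typing import Iterator, List


def isoforms(exons: List[str]) -> Iterator[List[str]]:
    """Yield exon combinations that keep exon order and always include
    the first and last exon (requires at least two exons)."""
    first, *middle, last = exons
    for k in range(len(middle) + 1):
        for chosen in combinations(middle, k):
            yield [first, *chosen, last]


all_isoforms = list(isoforms(["E1", "E2", "E3", "E4"]))
for iso in all_isoforms:
    print("-".join(iso))
```

With n optional internal exons this model gives 2^n possible mRNAs, which is why a modest number of exons can support a large number of protein variants.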

Gene Structure (Eukaryotes)

Gene organization:

  • Promoter
  • 5' UTR (untranslated region)
  • Exons (expressed sequences)
  • Introns (intervening sequences)
  • 3' UTR
  • Terminator

Types of RNA

1. mRNA (messenger RNA)

  • Carries genetic information DNA → ribosome
  • Translated into protein
  • ~5% of total RNA

2. rRNA (ribosomal RNA)

  • Structural and catalytic component of ribosomes
  • Most abundant RNA (~80%)

3. tRNA (transfer RNA)

  • Brings amino acids to ribosome
  • Has anticodon that pairs with mRNA codon
  • ~15% of total RNA

4. Other RNAs

  • snRNA: splicing (in snRNPs)
  • miRNA: gene regulation (microRNA)
  • siRNA: gene silencing (small interfering RNA)

Key Concepts

  1. RNA polymerase synthesizes RNA 5'→3', reads DNA 3'→5'
  2. Promoter is where RNA polymerase binds to begin transcription
  3. Template strand is copied; coding strand has same sequence as RNA
  4. Eukaryotic processing: 5' cap, poly-A tail, splicing
  5. Introns removed, exons joined
  6. Alternative splicing increases protein diversity
  7. Three main RNAs: mRNA (message), tRNA (transfer), rRNA (ribosomal)

📚 Practice Problems

Problem 1 (medium)

Question:

A gene has the following DNA template strand: 3'-TACGCAATGCGA-5'. (a) Write the mRNA sequence transcribed from this template, (b) identify the start and stop codons, and (c) write the amino acid sequence that would be translated (use the genetic code).

Solution:

Given: Template strand: 3'-TACGCAATGCGA-5'

(a) mRNA sequence:

Transcription rules:

  • RNA polymerase reads template 3' → 5'
  • Synthesizes mRNA 5' → 3' (antiparallel)
  • Uses complementary base pairing:
    • A (DNA) → U (RNA)
    • T (DNA) → A (RNA)
    • G (DNA) → C (RNA)
    • C (DNA) → G (RNA)

Step-by-step:

Template (3'→5'):  3'- T A C G C A A T G C G A -5'
                        ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓ ↓
mRNA (5'→3'):      5'- A U G C G U U A C G C U -3'

mRNA: 5'-AUGCGUUACGCU-3'

(b) Start and stop codons:

Start codon: AUG

  • Position: First codon (nucleotides 1-3)
  • Codes for: Methionine (Met)
  • Signals: Translation start
  • Nearly all polypeptides are initially made with Met at the N-terminus (often removed later)

Looking for stop codons:

  • UAA, UAG, UGA = stop codons
  • Check the sequence: AUG CGU UAC GCU
  • No stop codon present in this sequence!
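The same check can be done mechanically: find the first AUG, split the rest into codons, and look for an in-frame stop. This is a minimal sketch; the helper names `split_codons` and `find_orf` are assumptions made for this example.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_codons(mrna, start=0):
    """Split mRNA into complete 3-base codons beginning at index `start`."""
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]


def find_orf(mrna):
    """Return (codons before the stop, stop codon or None), reading from the
    first AUG. Returns None if the sequence contains no AUG."""
    start = mrna.find("AUG")
    if start == -1:
        return None
    frame = split_codons(mrna, start)
    for i, codon in enumerate(frame):
        if codon in STOP_CODONS:
            return frame[:i], codon
    return frame, None


coding, stop = find_orf("AUGCGUUACGCU")
print(coding, "stop:", stop)
```

For the sequence in this problem the stop comes back as `None`, matching the conclusion above that the fragment has no in-frame stop codon.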

Note: This appears to be a partial gene sequence. A real gene would have:

  • Promoter (before start)
  • Start codon (AUG) ✓
  • Coding sequence
  • Stop codon (UAA, UAG, or UGA)
  • Terminator

Start: AUG (positions 1-3); Stop: none in this sequence

(c) Amino acid sequence:

Translation using genetic code:

Divide mRNA into codons (3-nucleotide groups):

mRNA:    5'- AUG  CGU  UAC  GCU -3'
Codons:      ↓    ↓    ↓    ↓

Using genetic code table:

| Codon | Amino Acid | Abbreviation |
|-------|------------|--------------|
| AUG | Methionine | Met (M) |
| CGU | Arginine | Arg (R) |
| UAC | Tyrosine | Tyr (Y) |
| GCU | Alanine | Ala (A) |

Polypeptide:

Met-Arg-Tyr-Ala

Or using single-letter code: MRYA
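The table lookup can be reproduced in code. The sketch below builds the standard genetic code from its usual compact 64-letter form (bases in the order U, C, A, G, with `*` marking stop codons) and translates codon by codon until a stop or the end of the sequence. The names `CODON_TABLE` and `translate` are assumptions for this example, and translation is assumed to start at the first base rather than searching for AUG.

```python
from itertools import product

# Standard genetic code, one letter per codon, in UUU, UUC, UUA, UUG, UCU, ... order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first base; stop at a stop codon or the last full codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGCGUUACGCU"))  # MRYA
```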

Complete picture:

DNA coding strand:    5'-ATGCGTTACGCT-3' (not given, but complementary to template)
DNA template strand:  3'-TACGCAATGCGA-5' (given)
                           ↓ Transcription
mRNA:                 5'-AUGCGUUACGCU-3'
                           ↓ Translation
Polypeptide:          Met-Arg-Tyr-Ala (N-terminus → C-terminus)

Key Concepts:

Genetic Code Properties:

  1. Triplet code: 3 nucleotides = 1 amino acid
  2. Degenerate: Multiple codons for same amino acid
    • CGU, CGC, CGA, CGG all code for Arg
  3. Universal: Same code in nearly all organisms
  4. Unambiguous: Each codon specifies only ONE amino acid
  5. Non-overlapping: Codons read in sequence, no overlap
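Degeneracy and unambiguity can both be checked against the standard codon table. In this sketch, which uses the same compact 64-letter form of the standard code, a Python dict maps each codon to exactly one amino acid (unambiguous), and grouping codons by amino acid shows how many codons share each one (degenerate). The variable names are assumptions for illustration.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop, "*") they specify.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

print("Arg (R):", codons_for["R"])
print("Met (M):", codons_for["M"])
print("Stop (*):", codons_for["*"])
```

Arginine turns out to have six codons in total: the four CGN codons listed above plus AGA and AGG. Methionine and tryptophan each have only one.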

Reading frame:

  • AUG sets the reading frame
  • Must read in correct groups of 3
  • Frameshift mutation → wrong amino acids!

Example if we shift by +1:

  • Normal: AUG CGU UAC GCU
  • +1 shift: A UGC GUU ACG CU → different amino acids!
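The effect of the shifted frame can be made concrete by translating the same mRNA starting at base 1 and at base 2. This sketch uses a small lookup limited to the codons that occur in this example; the names `MINI_TABLE` and `read_frame` are hypothetical.

```python
# Only the codons needed for this example (taken from the standard genetic code).
MINI_TABLE = {
    "AUG": "Met", "CGU": "Arg", "UAC": "Tyr", "GCU": "Ala",  # normal frame
    "UGC": "Cys", "GUU": "Val", "ACG": "Thr",                # +1 frame
}


def read_frame(mrna, offset):
    """Translate complete codons starting `offset` bases into the mRNA."""
    codons = [mrna[i:i + 3] for i in range(offset, len(mrna) - 2, 3)]
    return "-".join(MINI_TABLE[codon] for codon in codons)


mrna = "AUGCGUUACGCU"
print("Normal frame:", read_frame(mrna, 0))
print("+1 shift:    ", read_frame(mrna, 1))
```

Every amino acid after the shift differs, and the trailing `CU` is left over as an incomplete codon.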

Why AUG is special:

  • The standard start codon in eukaryotes (bacteria occasionally use GUG or UUG)
  • Also codes for Met in middle of protein
  • Context determines if it's start or internal Met