Figure 1, an equation to represent the transcription of RNA from DNA
n NTP (nucleoside triphosphates) --[DNA template, enzyme]--> (NMP)n (nucleoside monophosphates, i.e. nucleotides) + n PPi (inorganic pyrophosphate)
The chain is extended by adding one nucleotide at a time to the lengthening polynucleotide chain.
Figure 2, the elongation of the chain nucleotide by nucleotide
(NMP)n + NTP → (NMP)n+1 + PPi (inorganic pyrophosphate)
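The two equations above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `transcribe` and the pyrophosphate counter are hypothetical, the template strand is assumed to be supplied in the 3' to 5' direction, and the stoichiometry follows the simplified form of figure 1 (strictly, the first nucleotide of a real transcript keeps its triphosphate).

```python
from typing import Tuple

# Hypothetical illustration of figures 1 and 2: each step consumes one NTP,
# extends the chain by one NMP and releases one pyrophosphate (PPi).
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}  # DNA template base -> RNA base


def transcribe(template_3_to_5: str) -> Tuple[str, int]:
    """Return the RNA chain (written 5'->3') and the number of PPi released.

    The template is assumed to be written 3'->5', so reading it from left
    to right builds the RNA chain in the 5'->3' direction.
    """
    chain = ""
    ppi_released = 0
    for base in template_3_to_5.upper():
        chain += PAIRING[base]  # (NMP)n + NTP -> (NMP)n+1
        ppi_released += 1       # ... + PPi (simplified, as in figure 1)
    return chain, ppi_released


rna, ppi = transcribe("TACGGT")
print(rna, ppi)  # AUGCCA 6
```

Each pass through the loop corresponds to one application of the equation in figure 2, and the whole loop corresponds to the summary equation in figure 1.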
Prokaryotic organisms have a single form of RNA polymerase, well described in E. coli, consisting of the subunits α, β, β' and σ. The holoenzyme, the active form of the polymerase, consists of α2, β, β' and σ. Of these, the β and β' polypeptides supply the catalytic basis and active site for transcription, and the σ subunit has a regulatory function in the initiation of RNA transcription. There are several forms of the σ subunit, which give variant forms of the holoenzyme; they are denoted by their molecular weight, for example σ28 and σ32, and recognise different specific promoter sequences. The rate of initiation of transcription depends on the promoter: some promoters start transcription every second or so, others only once every twenty minutes.
Promoters also affect the level of gene expression, along with other regulatory sequences such as enhancers and silencers, which can also affect the rate of transcription. These varying expression levels are attributed to the sequence of the promoter and the way in which RNA polymerase binds to it. Within prokaryotic promoters, conserved consensus sequences can be found in different genes within the same organism or in a few genes of related organisms. In bacterial promoters these homologous sequences are the TATAAT box, located 10bp upstream of the transcription start site (this –10 region is known as the Pribnow box), and the TTGACA sequence, found 35bp upstream.
Transcription in prokaryotes begins with template binding, in which the σ subunit of RNA polymerase binds to the promoter region of the gene to be transcribed. The promoter, located in the 5' region about 40bp upstream of the gene, is recognised by the σ subunit, at which point the DNA helix is unwound locally to give the large polymerase enzyme access. Once the enzyme is bound to the promoter, initiation begins with the first nucleoside triphosphate, and each additional ribonucleotide is joined by a phosphodiester bond in the manner illustrated in figures 1 and 2. Chain elongation continues in the 5' to 3' direction, creating a temporary DNA/RNA duplex whose strands run antiparallel to each other. After the σ subunit dissociates, elongation proceeds under the direction of the core enzyme.
The termination signal, about 40bp in length, results in the release of the transcribed RNA chain; in some cases the termination factor rho (ρ), a large hexameric protein, triggers the termination of synthesis. Related genes in prokaryotes can be polycistronic: several genes (the operon) are under the control of a single regulatory site. Such a cluster of genes often has only one termination signal, and the corresponding mRNA transcript can be large. This is an efficient way of producing proteins, since the products of the polycistronic transcript are usually needed at the same time.
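Locating the two consensus sequences of a bacterial promoter can be sketched in Python as follows. This is a rough illustration rather than a real promoter-prediction method: it assumes the commonly cited –35 consensus TTGACA, finds only exact matches, and the function name `find_promoter` and the default spacer of 15 to 19 bases between the two boxes are assumptions made for the example.

```python
import re
from typing import Optional, Tuple

MINUS_35 = "TTGACA"  # assumed -35 consensus sequence
MINUS_10 = "TATAAT"  # Pribnow box (-10 consensus sequence)


def find_promoter(seq: str, min_gap: int = 15,
                  max_gap: int = 19) -> Optional[Tuple[int, int]]:
    """Return the 0-based start positions of the -35 and -10 boxes, or None.

    Only exact consensus matches are found; the allowed spacer length
    between the two boxes is an assumption chosen for illustration.
    """
    pattern = re.compile(
        "(%s)[ACGT]{%d,%d}(%s)" % (MINUS_35, min_gap, max_gap, MINUS_10)
    )
    match = pattern.search(seq.upper())
    if match is None:
        return None
    return match.start(1), match.start(2)


# A made-up sequence: 3 bases, the -35 box, a 17-base spacer, the -10 box.
example = "GGC" + "TTGACA" + "A" * 17 + "TATAAT" + "GCGCGCA"
print(find_promoter(example))  # (3, 26)
```

Real promoters rarely match the consensus exactly, which is one reason why different promoters initiate transcription at such different rates.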
Transcription; Eukaryotes
Eukaryotic transcription, though similar to that of prokaryotes, is more complex. There are three separate types of RNA polymerase, each with a different function and each consisting of two large subunits and 10 to 15 small subunits. The three polymerases are summarised in table 1.
Table 1, a summary of the differences between the RNA polymerases.
Pol           Location                   Function            Promoter
RNA pol I     Nucleolus (site of         Synthesis of rRNA   Two critical regions crucial for
              ribosome biosynthesis)     (rRNA precursor     transcription: the core element
                                         gene)               (-45 to +20) and the upstream
                                                             control element (-156 to -107)
RNA pol II    Nucleoplasm                mRNA, snRNA         Contains an initiator sequence, a
                                                             TATA box and upstream elements (a
                                                             GC or a CCAAT box). TATA-less
                                                             promoters usually exist for
                                                             housekeeping genes and
                                                             developmental genes
RNA pol III   Nucleoplasm                5S rRNA, tRNA       The 5S rRNA promoter consists of
                                                             box A (+50) and box C (+83). The
                                                             internal promoter for tRNA
                                                             contains boxes A and B
Nucleolus; the site of ribosome biosynthesis associated with the nucleolar organiser region, itself a chromosomal region containing the genes for rRNA.
Cis-acting and trans-acting elements
Most is known about RNA pol II, which transcribes all mRNA in eukaryotes and works in conjunction with three cis-acting elements; “cis” means “next to”, and so refers to adjacent parts of the same DNA molecule. The TATA or Goldberg-Hogness box, found in the promoter region about 30bp upstream (–30) of the transcription initiation site, is similar in function to the prokaryotic TATAAT box. Its consensus sequence, TATAAAA, is found in most eukaryotic genes and is thought to be non-specific, fixing the initiation site by facilitating the local denaturation of the helix; this theory is supported by the weaker hydrogen bonding of A=T base pairs, which have two hydrogen bonds where G≡C pairs have three. The CAAT box, another of the cis-acting elements, is found about 80bp upstream (–80) of the transcription start site and contains the consensus sequence GGCCAATCT; like the TATA box and other regulatory regions located upstream, it influences the efficiency of the promoter. Transcription is also affected by trans-acting factors. In contrast to cis-acting elements, these are proteins that are recruited to the DNA and facilitate template binding. They are referred to as transcription factors and are essential, since RNA pol II cannot bind directly to the promoter.
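The idea that an A=T-rich TATA box is easier to denature than a G≡C-rich stretch can be made concrete by counting hydrogen bonds. The sketch below assumes the standard values of two hydrogen bonds per A=T pair and three per G≡C pair; the function name `hydrogen_bonds` and the G≡C-rich comparison sequence are hypothetical, chosen only for illustration.

```python
# Hydrogen bonds per base pair: an A=T pair has two, a G≡C pair has three.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding a duplex together, given one strand."""
    return sum(H_BONDS[base] for base in strand.upper())


print(hydrogen_bonds("TATAAAA"))  # 14 (TATA box consensus)
print(hydrogen_bonds("GGCGCGC"))  # 21 (made-up G≡C-rich sequence, same length)
```

Over the same seven base pairs, the TATA consensus is held together by a third fewer hydrogen bonds, which is consistent with it being a favourable site for local unwinding of the helix.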
Transcription Factors
RNA pol II has six associated transcription factors, known as TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH. These bind at the TATA box, beginning with TFIID, which contains a TATA-binding protein (TBP) along with 8 to 10 TATA box binding protein associated factors (TAFIIs). TFIID is bound to TFIIA and also to an upstream DNA-binding protein (a regulatory factor), which in turn is bound to a regulatory sequence. As a result the DNA forms a loop that encloses TFIIA, TFIID and the upstream binding protein, while TFIIB, TFIIF and TFIIE form a transcription initiation complex, thought to be required to stimulate transcription and to ensure that the process is initiated correctly. This complex is located on the outside of the DNA loop along with RNA pol II, which binds after TFIID has bound to the TATA box. After initiation, RNA pol II moves along the DNA transcribing the gene, creating a lengthening mRNA transcript following the equations in figures 1 and 2. TFIIB and TFIIE are released, leaving TFIIF bound; similarly, TFIID remains bound to the TATA box, to TFIIA and to the upstream binding protein, so that transcription can be initiated with another RNA pol II molecule until regulatory signals cause repression of transcription.
RNA polymerase I, involved in the synthesis of rRNA, has two transcription factors, SL1 and UBF (upstream binding factor), which act in synergy to stimulate transcription. RNA polymerase III requires TFIIIB and TFIIIC, and transcription of the 5S rRNA genes in addition requires TFIIIA. TFIIIC binds to the internal promoter, as does TFIIIA for 5S rRNA; this allows the binding of TFIIIB, which helps bind RNA pol III to the transcription initiation site. TFIIIB remains bound, but TFIIIA or TFIIIC is probably released as the polymerase transcribes the gene. SL1 and TFIIIB both contain the TATA-binding protein (TBP) that is also found in the RNA polymerase II factor TFIID, and TBP is therefore a universal transcription factor.
Processing of mRNA
Once the 5’ end of the RNA precursor has been synthesised, a methylated guanosine (7mG) cap is added to it. The cap prevents digestion of the end by exonucleases, aids the transfer of the RNA out of the nucleus and plays an important role in the translation of the mRNA transcript. (Methylation, the addition of methyl groups to bases and sugars, prevents enzymatic digestion.) The poly-A sequence at the 3’ end, made up of adenylic acid residues and up to 250 bases long, protects the mRNA from premature degradation by exonucleases. First the 3’ end of the transcript is cleaved enzymatically about 10 to 35 ribonucleotides from the highly conserved sequence AAUAAA; polyadenylation then occurs, adding the adenylic acid residues one at a time. This poly-A tail is found at the end of most mRNA transcripts in eukaryotes, and both the 5’ cap and the poly-A tail are critical for the RNA to be further processed and transported to the cytoplasm. Splicing of the larger hnRNA transcript then occurs, to remove the introns (“in” for intervening) and splice together the remaining exons (“ex” for expressed) to complete the mature mRNA. hnRNA and mRNA are never found free in the cell; like DNA, they are bound by cations and proteins, and these complexes are termed ribonucleoproteins or RNPs. Because of this variability in sequence and structure, no structure has been determined for mRNA.
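The capping, cleavage and polyadenylation steps can be sketched in Python as simple string operations. This is a toy model under several assumptions: the function name `process_ends`, the text marker "m7G-" standing in for the cap, and the default offset and tail length are all hypothetical, and the cleavage site is assumed to lie downstream (3’) of the AAUAAA signal.

```python
POLY_A_SIGNAL = "AAUAAA"  # highly conserved polyadenylation signal


def process_ends(pre_mrna: str, offset: int = 20, tail_length: int = 200) -> str:
    """Cap the 5' end, cleave downstream of AAUAAA and add a poly-A tail.

    offset: assumed number of nucleotides between the end of the signal
            and the cleavage site (the text gives about 10 to 35).
    tail_length: assumed number of adenylic acid residues added.
    """
    if not 10 <= offset <= 35:
        raise ValueError("offset should be between 10 and 35 nucleotides")
    signal_start = pre_mrna.find(POLY_A_SIGNAL)
    if signal_start == -1:
        raise ValueError("no AAUAAA signal found")
    cleavage_site = signal_start + len(POLY_A_SIGNAL) + offset
    if cleavage_site > len(pre_mrna):
        raise ValueError("transcript too short to cleave at this offset")
    body = pre_mrna[:cleavage_site]            # discard everything 3' of the cut
    return "m7G-" + body + "A" * tail_length   # cap marker + body + poly-A tail


# Made-up transcript: a short body, the signal, then 30 trailing nucleotides.
precursor = "AUGGCU" + "AAUAAA" + "C" * 30
processed = process_ends(precursor, offset=20, tail_length=10)
print(processed)
```

In the example, 10 of the 30 trailing nucleotides are removed by the cleavage and ten adenylic acid residues are added in their place.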
The machinery that splices the hnRNA is known as the spliceosome. It is very large, up to 60S in mammals, and has a unique set of small nuclear RNAs (snRNAs). These are found only in the nucleus, are rich in uridine residues and are labelled U1 through to U6. They are often complexed with proteins to form small nuclear ribonucleoproteins, called snRNPs or snurps, and they bear nucleotide sequences complementary to the intron, for example to its 5’ end. In the vast majority of eukaryotes, base pairing between U1 – U6 and the intron occurs at the G/GU located at the 5’ end of the intron, at the branch point A located within the intron and at the AG/G at the 3’ end of the intron. This base pairing forms a lariat structure, part of the spliceosome; the intron is then excised between U2 and U6, and U5 is finally removed from between the exons, leaving mRNA.
This splicing mechanism represents a regulatory step during gene expression. Examples have been found where introns in pre-mRNA are spliced in differing ways to yield a varied collection of exons in the mRNA. This process of alternative splicing results in a group of related proteins called isoforms, giving variation from a single gene.
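How one gene can give rise to several isoforms can be sketched in Python by joining different subsets of its exons. The model is deliberately simplified and rests on assumptions: the introns are taken as already removed, the first and last exons are always kept, internal exons are either kept or skipped and never reordered, and the function names `splice` and `isoforms` and the exon sequences are hypothetical.

```python
from itertools import combinations
from typing import List


def splice(exons: List[str]) -> str:
    """Join the retained exons, in order, to give a mature mRNA."""
    return "".join(exons)


def isoforms(exons: List[str]) -> List[str]:
    """List every mRNA obtainable by skipping internal exons.

    Simplifying assumption: the first and last exons are always retained
    and the order of the exons never changes. Needs at least two exons.
    """
    first, *internal, last = exons
    results = []
    for k in range(len(internal) + 1):
        for kept in combinations(internal, k):
            results.append(splice([first, *kept, last]))
    return results


# Four made-up exons give four possible transcripts under this model.
for mrna in isoforms(["AUG", "GCU", "CCA", "UAA"]):
    print(mrna)
# AUGUAA
# AUGGCUUAA
# AUGCCAUAA
# AUGGCUCCAUAA
```

With n internal exons this model allows up to 2^n transcripts, which gives a sense of how much variation a single gene can produce through alternative splicing.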