IRES (Internal Ribosomal Entry Site) Identification in Regulation and Gene Expression.


1 TITLE:

Title: IRES (Internal Ribosomal Entry Site) Identification in Regulation and Gene Expression.

2 CONTENT:

2.1  Introduction

Advances in molecular biology and in research equipment have greatly accelerated the sequencing of large portions of the genomes of many species. Several bacterial genomes, as well as those of some simple eukaryotes (e.g., Saccharomyces cerevisiae, or baker's yeast) and more complex eukaryotes (C. elegans and Drosophila), have now been sequenced in full. Various online sites and databases have helped bioinformaticians accomplish this [1].

The speed with which the Human Genome Project was carried out is remarkable and hints at what is yet to come.

Well-known sequence databases, such as GenBank at NCBI and EMBL, have been growing at exponential rates, and the amount of information they hold is enormous. This deluge of information has necessitated the careful storage, organization and indexing of sequence information. Applying information science to biology in this way has produced the field called bioinformatics.
 

2.2 LANDMARK SEQUENCES COMPLETED  [2]

  • First complete DNA genome: bacteriophage φX174 DNA (1977) - 5,386 bases
  • Human mitochondrial DNA (1981) - 16,569 bases
  • Tobacco chloroplast DNA (1986) - 155,844 bases
  • First complete bacterial genome (H. influenzae) (1995) - 1.9 x 10^6 bases
  • Yeast genome (a eukaryote, ~1.5 x 10^7 bases) completed in 1996
  • Several archaebacteria
  • E. coli - 4 x 10^6 bases (1997 & 1998)
  • Several pathogenic bacterial genomes sequenced:
    • Helicobacter pylori (ulcers)
    • Treponema pallidum (syphilis)
    • Borrelia burgdorferi (Lyme disease)
    • Chlamydia trachomatis (trachoma - blindness)
    • Rickettsia prowazekii (epidemic typhus)
    • Mycobacterium tuberculosis (tuberculosis)
  • Nematode C. elegans (~1 x 10^8 bases) - December 1998
  • Drosophila (fruit fly) (2000)
  • Human genome (rough draft completed 5/00) - 3 x 10^9 bases

Bioinformatics is essentially the recording, annotation, storage, analysis, and searching/retrieval of nucleic acid sequences (genes and RNAs), protein sequences and structural information. The databases used to analyse this information are described later. This information can be analysed and interpreted in several ways [3]:

  • Finding the genes in the DNA sequences of the various organisms held in GenBank.
  • Predicting the structure and/or function of newly discovered proteins, and the structure of RNA sequences.
  • Detecting homologies and building molecular models by clustering protein sequences into families of related sequences and developing protein models.
  • Tracing evolutionary history by aligning similar proteins and generating phylogenetic trees to examine evolutionary relationships.
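The alignment step mentioned in the last item can be sketched with the Needleman-Wunsch global alignment algorithm. This is a minimal illustration, not the method used by any particular database tool: the function name and the simple match/mismatch/gap scores are assumptions chosen for readability, whereas real protein alignments normally use a substitution matrix.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score for two sequences.

    The scoring values are illustrative assumptions, not a standard scheme.
    """
    # Row 0: aligning an empty prefix of `a` with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with an empty prefix of `b` costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = prev[j] + gap
            # Align b[j-1] with a gap in a.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

Only two rows of the scoring table are kept in memory, which is enough when the score alone is needed; recovering the alignment itself would require storing the full table and tracing back through it.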

The process of evolution has produced DNA sequences that encode proteins with very specific functions. The three-dimensional structure of a protein can also be predicted using algorithms derived from our knowledge of physics and chemistry and, most importantly, from the analysis of other proteins with similar amino acid sequences.

A self-explanatory diagram of this process is given in [4].

2.3 USE OF ANALYSIS OF SEQUENCE DATA:

Sequence data are analysed mainly to predict the functions of newly identified genes and to estimate evolutionary distance in phylogeny reconstruction. Other uses include determining the active sites of enzymes, constructing novel mutations and characterizing alleles of genetic diseases, to name just a few. Sequence data facilitates:

  • Analysis of the organization of genes and genomes and their evolution
  • Prediction of protein sequence from DNA sequence, which in turn allows prediction of protein properties, structure, and function (proteins are rarely sequenced in their entirety today)
  • Identification of regulatory elements in genes or RNAs
  • Identification of mutations that lead to disease, etc.
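Predicting a protein sequence from a DNA sequence amounts to reading the coding strand three bases at a time and looking each codon up in the genetic code. The sketch below assumes the standard genetic code, a sequence that is already in the correct reading frame, and a hypothetical helper name `translate`; it reports unrecognised codons as "X".

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # "X" marks an unknown codon
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
```

A fuller gene-finding tool would scan all six reading frames and both strands; this example covers only the translation step.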

Bioinformatics is the field of science in which biology, computer science, and information technology merge into a single discipline. The ultimate goal of the field is to enable the discovery of new biological insights as well as to create a global perspective from which unifying principles in biology can be discerned.


There are three important sub-disciplines within bioinformatics involving computational biology:

  • Development of new algorithms and statistics with which to assess relationships among members of large data sets;
  • Analysis and interpretation of various types of data, including nucleotide and amino acid sequences, protein domains, and protein structures; and
  • Development and implementation of tools which enable efficient access to and management of different types of information.

3 DATABASE DESCRIPTION:

Entrez [5] is a search and retrieval system that integrates information from databases at NCBI. These databases include nucleotide sequences, protein sequences, macromolecular ...
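NCBI's databases can also be queried programmatically through the E-utilities web service. The sketch below only builds an ESearch request URL and does not contact the server; the helper name `esearch_url` and the example search term are assumptions made for illustration.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL for a database name and a query term."""
    params = {
        "db": db,          # database to search, e.g. "nucleotide" or "protein"
        "term": term,      # free-text query
        "retmax": retmax,  # maximum number of record IDs to return
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical query, chosen to match the topic of this essay.
    print(esearch_url("nucleotide", "internal ribosome entry site", retmax=5))
```

Fetching the resulting URL returns a list of record identifiers, which can then be passed to other E-utilities calls to retrieve the records themselves.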
