Technology and resources promoted by the Human Genome Project are starting to have profound impacts on biomedical research and promise to revolutionize the wider spectrum of biological research and clinical medicine. Increasingly detailed genome maps have aided researchers seeking genes associated with dozens of genetic conditions, including myotonic dystrophy, fragile X syndrome, neurofibromatosis types 1 and 2, inherited colon cancer, Alzheimer's disease, and familial breast cancer.
On the horizon is a new era of molecular medicine characterized less by treating symptoms and more by looking to the most fundamental causes of disease. Rapid and more specific diagnostic tests will make possible earlier treatment of countless maladies. Medical researchers also will be able to devise novel therapeutic regimens based on new classes of drugs, immunotherapy techniques, avoidance of environmental conditions that may trigger disease, and possible augmentation or even replacement of defective genes through gene therapy.
Microbial Genomics
- new energy sources (biofuels)
- environmental monitoring to detect pollutants
- protection from biological and chemical warfare
- safe, efficient toxic waste cleanup
- understanding disease vulnerabilities and revealing drug targets
In 1994, taking advantage of new capabilities developed by the genome project, DOE initiated the Microbial Genome Program to sequence the genomes of bacteria useful in energy production, environmental remediation, toxic waste reduction, and industrial processing.
Despite our reliance on the inhabitants of the microbial world, we know little of their number or their nature: estimates are that less than 0.01% of all microbes have been cultivated and characterized. Programs like the DOE Microbial Genome Program help lay a foundation for knowledge that will ultimately benefit human health and the environment. The economy will benefit from further industrial applications of microbial capabilities.
Information gleaned from the characterization of complete genomes in MGP will lead to insights into the development of such new energy-related biotechnologies as photosynthetic systems, microbial systems that function in extreme environments, and organisms that can metabolize readily available renewable resources and waste material with equal facility. Expected benefits also include development of diverse new products, processes, and test methods that will open the door to a cleaner environment. Biomanufacturing will use nontoxic chemicals and enzymes to reduce the cost and improve the efficiency of industrial processes. Already, microbial enzymes are being used to bleach paper pulp, stone wash denim, remove lipstick from glassware, break down starch in brewing, and coagulate milk protein for cheese production. In the health arena, microbial sequences may help researchers find new human genes and shed light on the disease-producing properties of pathogens.
Microbial genomics will also help pharmaceutical researchers gain a better understanding of how pathogenic microbes cause disease. Sequencing these microbes will help reveal vulnerabilities and identify new drug targets.
Gaining a deeper understanding of the microbial world also will provide insights into the strategies and limits of life on this planet. Data generated in this young program already have helped scientists identify the minimum number of genes necessary for life and confirm the existence of a third major kingdom of life. Additionally, the new genetic techniques now allow us to establish more precisely the diversity of microorganisms and identify those critical to maintaining or restoring the function and integrity of large and small ecosystems; this knowledge also can be useful in monitoring and predicting environmental change. Finally, studies on microbial communities provide models for understanding biological interactions and evolutionary history.
Risk Assessment
- assess health damage and risks caused by radiation exposure, including low-dose exposures
- assess health damage and risks caused by exposure to mutagenic chemicals and cancer-causing toxins
- reduce the likelihood of heritable mutations
Understanding the human genome will have an enormous impact on the ability to assess risks posed to individuals by exposure to toxic agents. Scientists know that genetic differences make some people more susceptible and others more resistant to such agents. Far more work must be done to determine the genetic basis of such variability. This knowledge will directly address DOE's long-term mission to understand the effects of low-level exposures to radiation and other energy-related agents, especially in terms of cancer risk.
Bioarchaeology, Anthropology, Evolution, and Human Migration
- study evolution through germline mutations in lineages
- study migration of different population groups based on female genetic inheritance
- study mutations on the Y chromosome to trace lineage and migration of males
- compare breakpoints in the evolution of mutations with ages of populations and historical events
Understanding genomics will help us understand human evolution and the common biology we share with all of life. Comparative genomics between humans and other organisms such as mice has already led to the identification of similar genes associated with diseases and traits. Further comparative studies will help determine the yet-unknown function of thousands of other genes.
Comparing the DNA sequences of entire genomes of different microbes will provide new insights about relationships among the three kingdoms of life: archaebacteria, eukaryotes, and prokaryotes.
DNA Forensics (Identification)
- identify potential suspects whose DNA may match evidence left at crime scenes
- exonerate persons wrongly accused of crimes
- identify crime and catastrophe victims
- establish paternity and other family relationships
- identify endangered and protected species as an aid to wildlife officials (could be used for prosecuting poachers)
- detect bacteria and other organisms that may pollute air, water, soil, and food
- match organ donors with recipients in transplant programs
- determine pedigree for seed or livestock breeds
- authenticate consumables such as caviar and wine
Any type of organism can be identified by examination of DNA sequences unique to that species. Identifying individuals is less precise at this time, although when DNA sequencing technologies progress further, direct characterization of very large DNA segments, and possibly even whole genomes, will become feasible and practical and will allow precise individual identification.
To identify individuals, forensic scientists scan about 10 DNA regions that vary from person to person and use the data to create a DNA profile of that individual (sometimes called a DNA fingerprint). There is an extremely small chance that another person has the same DNA profile for a particular set of regions.
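The "extremely small chance" of a coincidental match can be illustrated with the product rule: if the pattern seen at each region is shared by some fraction of the population, and the regions are treated as independent, the chance that an unrelated person matches at every region is the product of those fractions. The sketch below is illustrative only. The function name is hypothetical, and the per-region frequencies are made-up values, not real population data.

```python
from math import prod


def random_match_probability(region_frequencies):
    """Multiply per-region pattern frequencies (product rule).

    Assumes the regions are statistically independent, which is a
    simplifying assumption of this sketch.
    """
    for frequency in region_frequencies:
        if not 0.0 < frequency <= 1.0:
            raise ValueError("each frequency must be in the range (0, 1]")
    return prod(region_frequencies)


# Hypothetical input: 10 regions, each pattern shared by 1 person in 10.
frequencies = [0.1] * 10
rmp = random_match_probability(frequencies)
print(f"chance of a coincidental match: about 1 in {round(1 / rmp):,}")
```

Even with a fairly common pattern at every region, multiplying across about 10 regions drives the combined probability down very quickly, which is why a full profile is far more discriminating than any single region.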
Agriculture, Livestock Breeding, and Bioprocessing
- disease-, insect-, and drought-resistant crops
- healthier, more productive, disease-resistant farm animals
- more nutritious produce
- biopesticides
- edible vaccines incorporated into food products
- new environmental cleanup uses for plants like tobacco
Understanding plant and animal genomes will allow us to create stronger, more disease-resistant plants and animals, reducing the costs of agriculture and providing consumers with more nutritious, pesticide-free foods. Already growers are using bioengineered seeds to grow insect- and drought-resistant crops that require little or no pesticide. Farmers have been able to increase outputs and reduce waste because their crops and herds are healthier.
Alternate uses for crops such as tobacco have been found. One researcher has genetically engineered tobacco plants in his laboratory to produce a bacterial enzyme that breaks down explosives such as TNT and dinitroglycerin. Waste that would take centuries to break down in the soil can be cleaned up by simply growing these special plants in the polluted area.
By the Numbers
- The human genome contains 3164.7 million chemical nucleotide bases (A, C, T, and G).
- The average gene consists of 3000 bases, but sizes vary greatly, with the largest known human gene being dystrophin at 2.4 million bases.
- The total number of genes is estimated at 30,000 to 35,000, much lower than previous estimates of 80,000 to 140,000 that had been based on extrapolations from gene-rich areas as opposed to a composite of gene-rich and gene-poor areas.
- Almost all (99.9%) nucleotide bases are exactly the same in all people.
- The functions are unknown for over 50% of discovered genes.
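The figures above can be combined with some back-of-envelope arithmetic. The sketch below uses only the numbers quoted in the list. The variable names are illustrative, and the results are rough estimates rather than measured values.

```python
# Figures quoted in the list above.
GENOME_BASES = 3_164_700_000        # 3164.7 million bases
AVERAGE_GENE_BASES = 3_000          # average gene size
GENE_COUNT_LOW, GENE_COUNT_HIGH = 30_000, 35_000

# 99.9% of bases are the same in all people, so about 1 base in 1000 varies.
differing_bases = GENOME_BASES // 1000

# Rough span covered by genes if every gene had the average size.
gene_span_low = GENE_COUNT_LOW * AVERAGE_GENE_BASES
gene_span_high = GENE_COUNT_HIGH * AVERAGE_GENE_BASES
pct_low = 100 * gene_span_low / GENOME_BASES
pct_high = 100 * gene_span_high / GENOME_BASES

print(f"bases that differ between two people: about {differing_bases:,}")
print(f"span of average-sized genes: {gene_span_low:,} to {gene_span_high:,} bases")
print(f"share of the genome: {pct_low:.1f}% to {pct_high:.1f}%")
```

So a 0.1% difference still amounts to roughly three million bases, and the estimated genes at their average size cover only a small percentage of the genome.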
The Wheat from the Chaff
- Less than 2% of the genome codes for proteins.
- Repeated sequences that do not code for proteins ("junk DNA") make up at least 50% of the human genome.
- Repetitive sequences are thought to have no direct functions, but they shed light on chromosome structure and dynamics. Over time, these repeats reshape the genome by rearranging it, creating entirely new genes, and modifying and reshuffling existing genes.
- During the past 50 million years, a dramatic decrease seems to have occurred in the rate of accumulation of repeats in the human genome.
How It's Arranged
- The human genome's gene-dense "urban centers" are predominantly composed of the DNA building blocks G and C.
- In contrast, the gene-poor "deserts" are rich in the DNA building blocks A and T. GC- and AT-rich regions usually can be seen through a microscope as light and dark bands on chromosomes.
- Genes appear to be concentrated in random areas along the genome, with vast expanses of noncoding DNA between.
- Stretches of up to 30,000 C and G bases repeating over and over often occur adjacent to gene-rich areas, forming a barrier between the genes and the "junk DNA." These CpG islands are believed to help regulate gene activity.
- Chromosome 1 has the most genes (2968), and the Y chromosome has the fewest (231).
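The contrast between GC-rich and AT-rich regions can be made concrete by measuring the fraction of G and C bases in consecutive windows along a sequence. The following is a minimal sketch with hypothetical function names and a made-up toy sequence. Real analyses work on far longer sequences and much larger windows.

```python
def gc_fraction(sequence):
    """Return the fraction of A/C/G/T bases in `sequence` that are G or C."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        return 0.0
    return sum(base in "GC" for base in bases) / len(bases)


def windowed_gc(sequence, window, step):
    """Return (start, GC fraction) for each full window along the sequence."""
    return [
        (start, gc_fraction(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]


# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
toy_sequence = "GCGCGGCCGC" + "ATATTAATAT"
for start, fraction in windowed_gc(toy_sequence, window=10, step=10):
    print(f"window starting at {start}: {fraction:.0%} GC")
```

Plotting such window values along a chromosome is one simple way to see where GC-rich and AT-rich stretches fall.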
How the Human Compares with Other Organisms
- Unlike the human's seemingly random distribution of gene-rich areas, many other organisms' genomes are more uniform, with genes evenly spaced throughout.
- Humans have on average three times as many kinds of proteins as the fly or worm because of mRNA transcript "alternative splicing" and chemical modifications to the proteins. This process can yield different protein products from the same gene.
- Humans share most of the same protein families with worms, flies, and plants, but the number of gene family members has expanded in humans, especially in proteins involved in development and immunity.
- The human genome has a much greater portion (50%) of repeat sequences than the mustard weed (11%), the worm (7%), and the fly (3%).
- Although humans appear to have stopped accumulating repeated DNA over 50 million years ago, there seems to be no such decline in rodents. This may account for some of the fundamental differences between hominids and rodents, although gene estimates are similar in these species. Scientists have proposed many theories to explain evolutionary contrasts between humans and other organisms, including those of life span, litter sizes, inbreeding, and genetic drift.
Variations and Mutations
- Scientists have identified about 1.4 million locations where single-base DNA differences (SNPs) occur in humans. This information promises to revolutionize the processes of finding chromosomal locations for disease-associated sequences and tracing human history.
- The ratio of germline (sperm or egg cell) mutations is 2:1 in males vs females. Researchers point to several reasons for the higher mutation rate in the male germline, including the greater number of cell divisions required for sperm formation than for eggs.
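A single-base difference is simply a position where two aligned sequences carry different bases. The sketch below lists such positions for two short, made-up sequences of equal length. The function name is hypothetical, and a difference found this way is only a candidate: calling it a SNP would require seeing the variant across many individuals.

```python
def single_base_differences(reference, sample):
    """List (position, reference base, sample base) where the sequences differ.

    Positions are 1-based. The sequences are assumed to be already
    aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Made-up example sequences that differ at two positions.
reference = "ACGTTAGCCA"
sample = "ACGTCAGCTA"
for position, ref_base, sample_base in single_base_differences(reference, sample):
    print(f"position {position}: {ref_base} -> {sample_base}")
```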
Applications, Future Challenges
Deriving meaningful knowledge from the DNA sequence will define research through the coming decades to inform our understanding of biological systems. This enormous task will require the expertise and creativity of tens of thousands of scientists from varied disciplines in both the public and private sectors worldwide.
The draft sequence already is having an impact on finding genes associated with disease. Over 30 genes have been pinpointed and associated with breast cancer, muscle disease, deafness, and blindness. Additionally, finding the DNA sequences underlying such common diseases as cardiovascular disease, diabetes, arthritis, and cancers is being aided by the human variation maps (SNPs) generated in the HGP in cooperation with the private sector. These genes and SNPs provide focused targets for the development of effective new therapies.
One of the greatest impacts of having the sequence may well be in enabling an entirely new approach to biological research. In the past, researchers studied one or a few genes at a time. With whole-genome sequences and new high-throughput technologies, they can approach questions systematically and on a grand scale. They can study all the genes in a genome, for example, or all the transcripts in a particular tissue or organ or tumor, or how tens of thousands of genes and proteins work together in interconnected networks to orchestrate the chemistry of life.
The completion of the human DNA sequence in the spring of 2003 will coincide with the 50th anniversary of Watson and Crick's description of the fundamental structure of DNA. The analytical power arising from the reference DNA sequences of entire genomes and other genomics resources is anticipated to jump-start what has been predicted to be the "biology century."
Already revolutionizing biology, genome research provides a vital thrust to the increasing productivity and pervasiveness of the life sciences. Current and potential applications of genome research address national needs in molecular medicine, waste control and environmental cleanup, biotechnology, energy sources, and risk assessment.
In June 2000, international leaders of the Human Genome Project (HGP) confirmed that the rough draft of the human genome had been completed a year ahead of schedule. In February 2001, issues of Science and Nature contained the working draft sequence and analysis.
The draft sequence will provide a scaffold of sequence across about 90% of the human genome. Remaining gaps will be closed and accuracy improved to achieve a complete, high-quality DNA reference sequence by 2003.
Cosponsored by the U.S. Department of Energy (DOE) and National Institutes of Health (NIH), the project formally began in 1990 as a $3 billion, 15-year effort to find the estimated 30,000 or more human genes and determine the sequence of the 3 billion DNA base pairs. The initial plan, intended to guide research in FYs 1990-1995, was revised in 1993 due to unexpected progress, and the revised plan outlined goals through FY 1998. The subsequent plan (Science, 23 October 1998) was developed jointly by DOE and NIH. Some 18 countries participate in the worldwide effort, with significant contributions from the Sanger Center in the United Kingdom and research centers in Germany, France, and Japan.
Ethical, Legal, and Social Implications (ELSI)
- Analyze and address implications of identifying DNA sequence information for individuals, families, and communities.
- Facilitate safe and effective integration of genetic technologies.
- Facilitate education about genomics in nonclinical and research settings.
Rapid advances in genetics and applications present new and complex ethical and policy issues for individuals and society. ELSI programs that identify and address these implications have been an integral part of the U.S. HGP since its inception. These programs have resulted in a body of work that promotes education and helps guide the conduct of genetic research and the development of related health professional and public policies.
Continuing and new challenges include safeguarding the privacy of individuals and groups who contribute samples for large-scale sequence variation studies; anticipating how resulting data may affect concepts of race and ethnicity; identifying how genetic data could potentially be used in workplaces, schools, and courts; commercial uses; and the impact of genetic advances on concepts of humanity and personal responsibility.
Human DNA Sequence
- Finish the complete human genome sequence by the end of 2003.
- Finish one-third of the human DNA sequence by the end of 2001.
- Achieve coverage of at least 90% of the genome in a working draft based on mapped clones by the end of 2001.
- Make the sequence totally and freely accessible.
Sequencing Technology
- Continue to increase the throughput and reduce the cost of current sequencing technology.
- Support research on novel technologies that can lead to significant improvements in sequencing technology.
- Develop effective methods for the advanced development and introduction of new sequencing technologies into the sequencing process.
Human Genome Sequence Variation
- Develop technologies for rapid, large-scale identification and/or scoring of single nucleotide polymorphisms and other DNA sequence variants.
- Identify common variants in the coding regions of the majority of identified genes during this five-year period.
- Create a SNP map of at least 100,000 markers.
- Develop the intellectual foundations for studies of sequence variation.
- Create public resources of DNA samples and cell lines.
Functional Genomics Technology
- Generate sets of full-length cDNA clones and sequences that represent human genes and model organisms.
- Support research on methods for studying functions of nonprotein-coding sequences.
- Develop technology for comprehensive analysis of gene expression.
- Improve methods for genome-wide mutagenesis.
- Develop technology for large-scale protein analyses.
Comparative Genomics
- Complete the sequence of the roundworm C. elegans genome by 1998.
- Complete the sequence of the fruitfly Drosophila genome by 2002.
- Develop an integrated physical and genetic map for the mouse, generate additional mouse cDNA resources, and complete the sequence of the mouse genome by 2008.
- Identify other useful model organisms and support appropriate genomic studies.
Ethical, Legal, and Social Issues
- Examine issues surrounding the completion of the human DNA sequence and the study of human genetic variation.
- Examine issues raised by the integration of genetic technologies and information into health care and public health activities.
- Examine issues raised by the integration of knowledge about genomics and gene-environment interactions in non-clinical settings.
- Explore how new genetic knowledge may interact with a variety of philosophical, theological, and ethical perspectives.
- Explore how racial, ethnic, and socioeconomic factors affect the use, understanding, and interpretation of genetic information; the use of genetic services; and the development of policy.
Bioinformatics and Computational Biology
- Improve content and utility of databases.
- Develop better tools for data generation, capture, and annotation.
- Develop and improve tools and databases for comprehensive functional studies.
- Develop and improve tools for representing and analyzing sequence similarity and variation.
- Create mechanisms to support effective approaches for producing robust, exportable software that can be widely shared.
Training and Manpower
- Nurture the training of scientists skilled in genomics research.
- Encourage the establishment of academic career paths for genomic scientists.
- Increase the number of scholars who are knowledgeable in both genomic and genetic sciences and in ethics, law, or the social sciences.
October 1, 1998 to September 30, 2003
What are some of the pros and cons of gene testing?
Gene testing already has dramatically improved lives. Some tests are used to clarify a diagnosis and direct a physician toward appropriate treatments, while others allow families to avoid having children with devastating diseases or identify people at high risk for conditions that may be preventable. Aggressive monitoring for and removal of colon growths in those inheriting a gene for familial adenomatous polyposis, for example, has saved many lives. On the horizon is a gene test that will provide doctors with a simple diagnostic test for a common iron-storage disease, transforming it from a usually fatal condition to a treatable one.
Commercialized gene tests for adult-onset disorders such as Alzheimer's disease and some cancers are the subject of most of the debate over gene testing. These tests are targeted to healthy (presymptomatic) people who are identified as being at high risk because of a strong family medical history for the disorder. The tests give only a probability for developing the disorder. One of the most serious limitations of these susceptibility tests is the difficulty in interpreting a positive result because some people who carry a disease-associated mutation never develop the disease. Scientists believe that these mutations may work together with other, unknown mutations or with environmental factors to cause disease.
A limitation of all medical testing is the possibility for laboratory errors. These might be due to sample misidentification, contamination of the chemicals used for testing, or other factors.
Many in the medical establishment feel that uncertainties surrounding test interpretation, the current lack of available medical options for these diseases, the tests' potential for provoking anxiety, and risks for discrimination and social stigmatization could outweigh the benefits of testing.