Proteomics provides an introduction to the field, discussing its basic principles, the application of specific technologies and instrumentation, and example applications in human health and disease. With helpful study questions, this textbook gives new students and research scientists a solid, easy-to-grasp overview of the principles, guidelines, and especially the complex instrumentation operations in proteomics. Written by a leader in proteomics studies, Proteomics offers an expert perspective on the field and its future.
Number of pages: 339
Year of publication: 2011
Table of Contents
Title Page
Copyright
Dedication
Foreword
Preface
About the Author
1 Historical Perspectives
1.1 Introduction to Proteomics
1.2 Proteome and Proteomics
1.3 Genetics of Proteins
1.4 Molecular Biology of Genes and Proteins
1.5 Protein Chemistry Before Proteomics
2 Proteomics—relation to Genomics, Bioinformatics
2.1 Genomics
2.2 Bioinformatics and Computational Biology
3 Methodology for Separation and Identification of Proteins and their Interactions
3.1 Separation of Proteins Via a Multidimensional Approach
3.2 Determination of the Primary Structure of Proteins
3.3 Determination of the 3D Structure of a Protein
3.4 Determination of the Amount of Proteins
3.5 Structural and Functional Proteomics
4 Proteomics of Protein Modifications
4.1 Phosphorylation and Phosphoproteomics
4.2 Glycosylation and Glycoproteomics
4.3 Ubiquitination and Ubiquitinomics
4.4 Miscellaneous Modifications of Proteins
5 Proteomics of Protein–Protein Interactions/Interactomes
5.1 Protein–Protein Interactions (PPI) in Vivo
5.2 Analysis of Protein Interactions in Vitro
5.3 Analysis of Protein Interactions in Silico
5.4 Synthetic Genetic Methods to Determine Protein Interactions
5.5 Interactomes
5.6 Evolution and Conservation of Interactomes
5.7 Interactomes and the Complexity of Organisms: It is the Number of Interactomes that Matters in Understanding the Complexity of an Organism and not the Number of Genes
5.8 Interaction of Proteins with Small Molecules
6 Applications of Proteomics I: Proteomics, Human Disease, and Medicine
6.1 Diseasome
6.2 Medical Proteomics
6.3 Clinical Proteomics
6.4 Metaproteomics and Human Health
6.5 Proteomics in Biotechnology and Industry of Drug Production
6.6 Metaproteomics of Microbial Fermentation
6.7 Beef Industry
6.8 Bioterrorism and Biodefense
7 Proteomics—Future Developments
7.1 Technical Scope of Proteomics—Beyond Protein Identification
7.2 Scientific Scope of Proteomics—Control of Epigenesis
7.3 Medical Scope of Proteomics
7.4 Proteomics, Energy Production, and Bioremediation
7.5 Proteomics and Biodefense
Index
Cover: Proteomics of Metamorphosis in Insect - A Computer Projection.
Copyright © 2010 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data
Mishra, N. C. (Nawin C.)
Introduction to proteomics : principles and applications / Nawin Mishra.
p. ; cm.—(Methods of biochemical analysis ; 146)
Includes bibliographical references and index.
ISBN 978-0-471-75402-2 (cloth)
1. Proteomics—Textbooks. I. Title. II. Series: Methods of biochemical analysis ; v. 146.
[DNLM: 1. Proteomics. 2. Proteome—analysis. W1 ME9617 v. 146 2010 / QU 58.5M678i 2010]
QP551.M475 2010
572′.6—dc22
2009049260
This book is dedicated to the memory of Professor E. L. Tatum
and my parents, the mentors in my life, and to Purnima and Prakash.
Foreword
Proteomics provides a better understanding of cells by elucidating the structure, function, and interactions of proteins. The one gene–one enzyme concept of Beadle and Tatum provided an important tool for the analysis of proteins: creating a mutant protein and then comparing its properties with those of the wild-type protein. This method of Beadle and Tatum and the method of Edman degradation remained the standard tools for deciphering the structure and function of proteins until the coming of genomics and the high-throughput methods of mass spectrometry and bioinformatics. In this context, Introduction to Proteomics by Nawin Mishra, who was an associate of Tatum at a time when the structure and function of proteins were being elucidated in laboratories around the world, is an important book. It deals with all the basic and medical aspects of proteomics, including personalized medicine, and could serve as a valuable reference for all those interested in proteomics.
Günter Blobel
Laboratory of Cell Biology
The Howard Hughes Medical Institute
The Rockefeller University
1230 York Avenue
New York, NY 10065-6399
Preface
Proteomics is the study of all the proteins of a cell or an organism. It is the newly developed science for the study of proteins. It attempts to define the proteome, which is the entire protein content of an organism encoded by its genome; hence, the word is derived from protein and genome. Proteomics aims at describing the structure and function of the proteins of a cell at a large scale. This enables us to understand the structure and function of a cell and finally that of an organism. The science of proteomics has obvious applications to medicine through identification of proteins as marker(s) of a disease (i.e., diagnostics) or as targets of new drugs or as therapeutics (i.e., drugs) as well. Proteomics provides new tools for the understanding of proteins, which are the workhorse molecules of a cell that control all its biophysical and biochemical attributes. The one gene–one enzyme concept of Beadle and Tatum (1941) provided a unique tool for the study of proteins; this approach is being used every day, even to this date. Proteomics based on high-throughput technologies added a new dimension to the approach initiated by Beadle and Tatum. This book, therefore, examines proteomics beyond the one gene–one enzyme concept.
My research interest in genetics and the biochemistry of proteins goes back to the mid-1960s, when I began my association with the late Nobel Laureate Professor Edward L. Tatum at Rockefeller University as a postdoctoral fellow supported by the Jane Coffin Childs Memorial Fund for Medical Research. Beadle and Tatum together formulated the one-gene–one-enzyme concept in 1941. George Beadle, Edward L. Tatum, and Joshua Lederberg shared the 1958 Nobel Prize in Physiology or Medicine for their respective contributions to the development of the one-gene–one-enzyme concept in Neurospora and recombination in bacteria; Lederberg later became president of Rockefeller University. This theory of Beadle and Tatum established the conceptual scheme for the control of the structure and function of a protein by a gene.
At Rockefeller University, the laboratories of William Stein and Stanford Moore and that of Robert Bruce Merrifield were situated close to Tatum's laboratory. In their laboratories, the first large protein was sequenced and chemically synthesized. I remember having several discussions with these scientists about the structure and function of proteins. William Stein, Stanford Moore, and Gerry Edelman, all of whom were from Rockefeller University, and Christian Anfinsen of the National Institutes of Health (NIH) became Nobel Laureates in 1972. Later, Bruce Merrifield in 1984 and Günter Blobel in 1999, also from Rockefeller University, received Nobel Prizes, all of them for their contributions to protein chemistry, including the structure, function, synthesis, and intracellular transport of proteins. The goal of Stein and Moore at that time was to sequence more than 1000 proteins by the end of the 20th century. This goal was realized much faster with the science of genomics and with the application of mass spectrometry and other high-throughput technologies.
At Rockefeller University, I also had the opportunity to know Professor Frank H. Field, director of the mass spectrometry laboratory. Earlier, Dr. Field, in collaboration with Joe Franklin, had developed the first ionization technique for mass spectrometry. Dr. Field was helping Professor Tatum with the identification of chemical(s) emitted into the gas phase by a slow-growing morphological mutant of Neurospora. An exposure of this gaseous emission to the wild-type strain made it grow slowly like the mutant. This chemical, however, remained elusive to identification by mass spectrometry.
Soon after my arrival at Rockefeller University, I remember having a discussion with Professor Victor Najjar on the one-gene–one-enzyme theory. Dr. Najjar, then a professor at Vanderbilt University and an editor of Methods in Enzymology, was visiting Rockefeller University on sabbatical leave. During a discussion of my work with him, he became somewhat concerned after learning about the possible role of two genes in the control of an enzyme, phosphoglucomutase, involved in the morphogenesis of the fungus Neurospora, as my work indicated at that time. I believe this was perhaps because of his unfamiliarity with the literature in genetics, particularly that on the role of suppressor genes in controlling the structure of a protein encoded by another gene. He, therefore, thought that my findings contradicted the original one-gene–one-enzyme hypothesis. However, I convinced Dr. Najjar that such findings make a difference only in semantics and not in the conceptual scheme of the original one-gene–one-enzyme theory. I pointed out to him that these exceptions only strengthen the original one-gene–one-enzyme concept, just as observations such as partial dominance, co-dominance, and epistasis, which on the surface seem to conflict with the Mendelian rules of inheritance, actually lend support to the original ideas implicit in those rules.
Later that day, I told Professor Tatum about my conversation with Dr. Najjar on the one-gene–one-enzyme theory. Professor Tatum immediately pointed out that the one-gene–one-enzyme hypothesis had already been modified to a one-cistron (gene)–one-polypeptide hypothesis. I was aware of this concept and told Professor Tatum that I had already pointed out this modification to Professor Najjar. Professor Tatum also said that he expected additional modifications to this theory because of the looming complexity of our genetic material, as was being revealed by the nucleic acid hybridization experiments. He expressed to me that it was indeed a matter of semantics and that, so long as we understood what we were talking about, we lived within the limits of the conceptual scheme of the one-gene–one-enzyme hypothesis. Almost a decade later, Phillip Sharp of the Massachusetts Institute of Technology (MIT) revealed the split nature of the gene and received the Nobel Prize in 1993 for his work. Furthermore, the study of the structure of the immunoglobulin gene(s), which brought the Nobel Prize to Tonegawa, also of MIT, in 1987, presented an extreme exception to the one-gene–one-enzyme hypothesis. These findings affirmed Professor Tatum's expectation that the one-gene–one-enzyme theory would be modified in view of the complexity of our genetic material. Despite the changes to this theory, it is important to note that almost all genes in prokaryotes and more than 50% of genes in higher eukaryotes obey the dictum of the one-gene–one-enzyme theory. This theory still provides the basis for the creation of mutants and knockouts crucial for the study of a protein's structure and function and its role in controlling the phenotype of the organism. This theory is also the basis for the gene therapy approach to the treatment of human diseases.
I remember the events and the manner in which the field of protein chemistry progressed and then was later ignored with the coming of the genome projects and the science of genomics; it was finally revived and blossomed into the science of proteomics. The coming of genomics and the subsequent development of proteomics have completely changed our view regarding the philosophy of science and how we understand biology. Before genomics, we had a reductionist view of science, and the biology of an organism was thought to be understood in terms of the molecules only. We also used to do one thing at a time when deciphering one molecule after another. Now, we are trying to understand all things at the same time because of our ability for high-throughput analyses; we are no longer reductionists, rather we are holists trying to understand the biology in terms of the interactions of a large number of molecules at once. The science of proteomics has thus ushered in the coming of a new branch of science called systems biology to obtain the ultimate understanding of an organism within a particular environment. An understanding of the environment is important because it can bring about changes in the structure and function of genes and gene products.
I write this book on the science of proteomics with the goal of bringing out its conceptual development starting from one-gene–one-enzyme theory and leading to its instrumentation-based methodologies and applications in medicine and biotechnology and the fact that life is sustained by the interactions of proteins. I take special effort in describing the nature and operation of these complex instrumentations involved in proteomics in a language readily understandable to students with an exclusive background in biology. I also provide an emphasis on biological methods in elucidating certain aspects of proteomics, which has been ignored in earlier treatises on the subject of proteomics. This book is written in a manner comprehensible to emerging scientists, including undergraduate and graduate students as well as postdoctoral trainees.
The book is organized into seven chapters. Many of the references included at the end of the chapters are not cited in the text, to allow for the smooth flow of the main concepts and easy reading of the subject matter. I hope that my efforts are successful.
I believe no such text that particularly addresses the needs of the biologist exists at this time. In this book, an attempt is made to give a biologist's view of the subject to non-biologists equally well, particularly bringing to their attention how biologists approached certain problems, such as protein–protein interactions, in the absence of advanced technologies such as bioinformatics. I also believe that this text is a contribution to the emerging science of proteomics and to systems biology, and of course to scientists in these fields, leading to an appreciation of the developments in proteomics beyond the one-gene–one-enzyme concept of Beadle and Tatum, which provided the conceptual scheme and the tool for understanding proteins in the living system.
This book is being published on the occasion of the 52nd anniversary of the awarding of the Nobel Prize to Beadle and Tatum in 1958 to reflect the progress made in the understanding of proteins, which was started by the conceptualization of the one-gene–one-enzyme hypothesis that provided the tool for analysis of proteins.
I would like to thank many colleagues for their help with this work. I thank Professors Steve Threlkeld and J.J. Miller, both of McMaster University, for fueling my initial interest in genetics and Professor Stuart Brody of the University of California, San Diego (formerly at Rockefeller University), for my introduction to enzymology. In addition, I am grateful to Professor Philip Hanawalt of Stanford University and Professor Stuart Linn of the University of California, Berkeley, for their support of my continued interest in the genetical biochemistry of proteins. I would also like to thank Professor David Reisman at the University of South Carolina for reading the manuscript in its entirety and for his many helpful comments. I am also thankful to Professors Michael Felder and Sanjib Mishra, both at the University of South Carolina, Professor Narsingh Deo of the University of Central Florida, Professor David Gangemi of Clemson University, Professor Alexandru Almasan of the Cleveland Clinic, Dr. Narendra Singh of the U.S.C. Medical School, Professor R.P. Jha of Patna University, Professor K.M. Marimuthu of the Post Graduate School at Madras University, Professor Ramesh Maheshwari of the Indian Institute of Science, Prashant Jha, and Dr. Kanchan Kumari for their support of my endeavors and to Dr. Richard Vogt of the University of South Carolina for help with the cover picture.
This work would not have been possible without the encouragement and show of infinite patience from Dr. Darla Henderson of John Wiley and Sons, particularly during periods of multiple personal challenges. I also thank Anita Lekhwani, the Senior Acquisition Editor of John Wiley and Sons, for her immense interest in this work and for her enthusiastic support and assistance that eased the submission of this manuscript and made its publication possible. I am also thankful to Christine Moore, Rebekah Amos, Sheree Van Vreede, and Kellsee Chu of John Wiley & Sons for assistance with the manuscript that helped its timely publication. I am grateful to Dr. Kevin H. Lee of the University of Delaware for the two-dimensional gel picture, Darryl Leza of NHGRI, NIH, for the protein structure picture, and to John Alam, Clint Cook and Michelle J. Bridge of the Dept. of Biological Sciences at the University of South Carolina for the diagrams and for their assistance in preparation of the manuscript.
Finally, I thank my wife, Purnima, and our son, Prakash, for their continuous support and interest in this work. I dedicate this work to Purnima and Prakash and above all to the memory of the mentors in my life, my parents and Professor E.L. Tatum. I am solely responsible for any and all errors that may be found in this book.
About the Author
Nawin Mishra received his Ph.D. in genetics from McMaster University in 1967. His postdoctoral training was with the late Nobel Laureate Professor E. L. Tatum at Rockefeller University, supported by a postdoctoral fellowship from the Jane Coffin Childs Memorial Fund for Medical Research at Yale University. In 1973, he joined the molecular biology faculty of the University of South Carolina as an associate professor; he remained there as Distinguished Professor of Genetics until 2006. Dr. Mishra is still with the University of South Carolina as Emeritus Distinguished Professor of Genetics. He was a visiting professor at the Max Planck Institute of Molecular Biology in Heidelberg, Germany, in 1980 and at the Greenwood Genetics Center in 2004. He initiated the gene-transfer experiments in fungi while he was a member of the laboratory of Dr. E. L. Tatum at Rockefeller University (1967–1973). He has investigated various aspects of gene transfer, the organization of mDNA, and the biochemical genetic characterization of proteins in carbohydrate and DNA metabolism.
Dr. Mishra has been invited to present his work in Australia, Europe, Russia, China, Japan, Thailand, and India. He served as a Scientific Consultant to the Food and Agriculture Organization (FAO) of the United Nations in 1990 and in 1993. He also served as Chairman of the Program Committee of the Genetics Society of America and as a member of the review panel of the Human Genome Project of the U.S. Department of Energy. He has been a Fellow of the American Association for the Advancement of Science since his election in 1986 for his original contributions to the study of gene transfer in fungi. Dr. Mishra organized the Genetics Society of America annual meeting in 1978 and the First Fungal Genetics Congress in 1986; he has also written a book that was first published by CRC Press in 1995 and whose expanded version was published by John Wiley & Sons in 2002.
Chapter 1
Historical Perspectives
Biology becomes much more understandable in light of genetics (Ayala and Kiger 1984). This is even more true of the theory of evolution proposed by Darwin (1859). It seems the theory of evolution would have been placed on a solid foundation from the start if Darwin had been aware of the Mendelian rules of inheritance. There is some indication that Darwin received a copy of Mendel's publication, which remained unopened during his lifetime. It is believed that this is why Darwin failed to provide a firm basis on which selection works during the process of evolution.
Genetics has had several major breakthroughs during its development that have made biology a well-established discipline of science. Some of these breakthroughs are discussed here. The first major discovery was the rules of inheritance by Mendel (1866). This established the particulate nature of inheritance and the existence of genes, which control phenotypes. It also identified genes as the ultimate basis for the evolution of organisms and integrated the different branches of biology. In addition, Mendelian genetics transformed biology from a science based exclusively on observations to an experimental science in which ideas could be tested by performing experiments.
The second major breakthrough was made by Beadle and Tatum (1941) with their one-gene–one-enzyme hypothesis. This provided the biochemical basis for the mechanism of gene action and integrated chemistry into biology. It provided the tool for analyzing metabolic pathways and several complex systems, including the nervous system. It also provided an understanding of the genetic basis of diseases and their possible cures by chemical manipulations and ultimately by gene therapy.
The discovery of the structure of DNA by Watson and Crick (1953) marked the third major breakthrough in biology. The Watson–Crick DNA structure was especially meaningful in view of the findings that DNA is the chemical basis of inheritance (Avery et al. 1944; Hershey and Chase 1952). The Watson–Crick structure of DNA provided the molecular basis for understanding the mechanisms of the storage and transmission of genetic information and possible changes (mutations) therein. Mutation provided the source of variations that could be selected for during the process of Darwinian evolution. Thus, the DNA structure proposed by Watson and Crick made genetics not only necessary but also unavoidable in the understanding of Darwin's evolution by natural selection. In 1962, Watson, Crick, and Wilkins received the Nobel Prize for this landmark discovery of the DNA structure.
The development of the Watson–Crick structure of DNA led to the birth of molecular biology followed by the enunciation of the central dogma in biology. Molecular biology attempted to provide the molecular basis for everything in biology and biochemistry leading to the unity of life. Molecular biology perpetuated the reductionistic view of living systems: Reductionists attempt to understand a system by understanding its molecular components. Molecular biology also led to the development of a better understanding of diseases and their control by pharmaceuticals. The field of molecular biology ushered in by the Watson–Crick DNA structure led to the development of scores of Nobel Prize-winning concepts in biology, biochemistry, and medicine as discussed later in this book.
The coming of genomics marked the fourth major breakthrough in biology. Advances in genome sequencing and availability of human and several other genome sequences by 2001 provided the basis for the understanding of the uniqueness of humans in possessing certain distinctive DNA segments. Genomics also provides the basis for the understanding of variations among individuals as differences in DNA sequences. Furthermore, it provides molecular insight into the genetic basis for differences in our response to the same drug. The variation in individual DNA sequences is expected to provide the molecular understanding of our several complex traits, including behavior. DNA sequences also provide a better insight into the record of the evolutionary processes in an organism. Genomics is expected to provide a better understanding of a complex organism like humans after the elucidation of the roles of noncoding sequences (introns) of DNA. Understanding the roles of introns is currently a formidable task: It is believed that the elucidation of the roles of introns will add a new dimension to the understanding of biology.
The fifth breakthrough underway is the development of proteomics. This is bringing a better understanding of biochemical pathways and the roles of protein interactions. Above all, proteomics provides a clue to answering the big question of how a small number of genes can control several phenotypes in a complex organism like humans. A major conceptual scheme emerging from proteomics is that it is the number of interactions of proteins and not the number of proteins per se that is responsible for the myriad phenotypes in an organism.
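The arithmetic behind this idea can be sketched in a few lines of Python. This is only an illustration, not a result from the book: it counts every possible unordered pair of proteins, which is an upper bound rather than the number of interactions that actually occur, and the function name is hypothetical. The figure of 23,000 is the approximate human gene count quoted later in this chapter, used here under the simplifying assumption of one protein per gene.

```python
def possible_pairwise_interactions(n_proteins: int) -> int:
    """Count distinct unordered pairs among n proteins: n * (n - 1) / 2."""
    return n_proteins * (n_proteins - 1) // 2


# The number of possible pairs grows roughly with the square of the protein count.
for n in (10, 100, 1_000, 23_000):
    pairs = possible_pairwise_interactions(n)
    print(f"{n:>6} proteins -> {pairs:>12,} possible pairs")
```

Even under these crude assumptions, a modest number of proteins allows a far larger number of possible pairings, which is the sense in which interactions rather than gene counts could account for many phenotypes.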
The sixth breakthrough, which is in the making, involves the science of synthetic genetics, which would allow the creation of new organisms either by building entirely new genomes or by manipulating existing ones with the help of the techniques of molecular genetics, genomics, proteomics, and bioinformatics.
Advances in genomics and proteomics in conjunction with bioinformatics have made it possible to realize the dreams of the chemists of the 20th century. These chemists wanted to decipher the amino acid sequences of all proteins to understand their functions. Proteomics has made it possible to determine the amino acid sequence of any protein. In addition, future advances in genomics and proteomics are expected to bring several revolutions in medicine and will make personalized medicine a reality. Advances in proteomics are expected to integrate the reductionistic views of Watson and Crick into systems biology to show how molecular parts evolved and how they fit together to work as an organism. The latter is expected to provide the ultimate understanding of biology.
1.1 Introduction to Proteomics
The term “proteome” originates from the words protein and genome. It represents the entire collection of proteins encoded by the genome of an organism. The proteome, therefore, is defined as the total protein content of a cell or of an organism. Proteomics is the study of the structure, function, and interactions of the entire protein content of an organism. Proteins control the phenotype of a cell by determining its structure and, above all, by carrying out all functions in a cell. Defective proteins are major causes of diseases and thus serve as useful indicators for the diagnosis of a particular disease. In addition, proteins are the primary targets of most drugs and thus are the main basis for the development of new drugs. Therefore, proteomics is important for understanding the role of proteins in the cause and control of diseases and in the development of humans as well as of other organisms.
Proteins are encoded by DNA in most organisms and by RNA in some viruses. In all cases except RNA viruses, DNA is transcribed into RNA, which is then translated into a protein. In the case of RNA viruses, however, RNA is translated directly into proteins. Initially, it was thought that one gene makes one enzyme, which controls a phenotype. However, this view has undergone tremendous changes in the last several decades, mainly because of the discovery of the split nature of eukaryotic genes, which involves RNA splicing, the occurrence of RNA editing, and the phenomenon of RNA silencing. The split nature of the gene, RNA splicing, RNA editing, and RNA silencing are discussed later in this chapter.
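The flow of information described above, from DNA to RNA to protein, can be illustrated with a minimal Python sketch. It assumes the standard genetic code, takes the coding (sense) strand of DNA as input so that transcription reduces to replacing T with U, and starts translation at the first AUG. The function names and the example sequences are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA matching a DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG up to the first stop codon (one-letter codes)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# DNA -> RNA -> protein, as in most organisms.
print(translate(transcribe("ATGGCCTGGTAA")))
# RNA -> protein directly, as described above for RNA viruses.
print(translate("AUGAAAUAG"))
```

The sketch deliberately ignores splicing, editing, and silencing, which are exactly the complications that the rest of this section introduces.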
In eukaryotes, the coding sequences of a gene, called exons, are interrupted by noncoding stretches of nucleotides called introns. After removal of the introns, the exons of a gene are joined either continuously (referred to as cis-splicing) or discontinuously (referred to as alternative splicing), or exons of different genes are joined, which is referred to as trans-splicing. The different modes of splicing of exons and the posttranslational modifications of proteins are responsible for the abundance of proteins in eukaryotic organisms. In humans, there are approximately 23,000 genes and more than 500,000 proteins.
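How a single split gene can give rise to several transcripts can be sketched as follows. This is a simplified model, not a description of the splicing machinery: it assumes that the first and last exons are always retained, that exon order is preserved, and that any subset of the internal exons may be skipped. The exon labels and the function name are hypothetical.

```python
from itertools import combinations


def splice_isoforms(exons: list[str]) -> list[str]:
    """List the transcripts obtained by keeping any subset of internal exons.

    The first and last exons are always kept and exon order is preserved.
    The first isoform returned joins every exon (continuous splicing).
    """
    if len(exons) < 2:
        raise ValueError("need at least a first and a last exon")
    first, *internal, last = exons
    isoforms = []
    for n_kept in range(len(internal), -1, -1):
        for kept in combinations(internal, n_kept):
            isoforms.append("".join([first, *kept, last]))
    return isoforms


# A hypothetical gene with four exons yields 2**2 = 4 transcripts in this model.
for isoform in splice_isoforms(["E1", "E2", "E3", "E4"]):
    print(isoform)
```

Under these assumptions a gene with n exons can produce up to 2^(n-2) transcripts, which gives a rough sense of how splicing, together with posttranslational modification, lets the number of proteins far exceed the number of genes.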
The findings of suppressor genes and the split nature of genes may present apparent contradictions to the one-gene–one-enzyme hypothesis. However, with the coming of the central dogma in biology (Crick 1958, 1970; Watson 1965; Mattick 2003; Lewin 2004) and the elucidation of the genetic code (Leder and Nirenberg 1964; Khorana 1968), it is understandable how suppressor genes work. Thus, the mechanism of action of suppressor genes does not contradict the original ideas implicit in Beadle and Tatum's one-gene–one-enzyme concept, as it superficially appears to. In light of the central dogma, it is understandable that certain genes or DNA segments may code for different proteins or that the coding section of a protein in DNA is distributed across a huge expanse of DNA interrupted by noncoding sequences. It has become obvious that the one-gene–one-enzyme concept applies only to genes that encode one polypeptide and not to genes that have a split nature and can code for more than one protein. Thus, the one-gene–one-enzyme concept is limited by the nature of the gene itself, just as the Mendelian rules of inheritance are limited by the location of genes: they apply only to genes located in the nucleus and not to genes located elsewhere in the cell.
Obviously, what Beadle and Tatum suggested is not an axiom but a rule, and certain situations simply represent exceptions to their profound rule. It seems that nature, too, holds the British view that “exceptions prove the rule.” The history of science is full of such exceptions. The most glaring example involves the central dogma of molecular biology described by Francis Crick, the codiscoverer of the DNA structure. Crick (1958, 1970) surmised that sequential information in DNA is transferred to RNA and then from RNA to protein, and that the direction of this information transfer is fixed; in short, DNA makes RNA, which makes protein. However, it was later shown that RNA can be reverse transcribed into DNA and that, at times, messenger RNA (mRNA) is edited by the addition or removal of cytidine or uridine before its translation into protein, which suggests that the information in a DNA segment is not always translated directly into protein as implied by the central dogma. Howard Temin and David Baltimore received the Nobel Prize in 1975 for demonstrating this reverse transfer of information from RNA to DNA. Another glaring example involves enzymes. It was James Sumner of Cornell University who established that enzymes are proteins. Soon, enzymes became synonymous with proteins, until Sidney Altman of Yale University and Thomas Cech of the University of Colorado showed independently that certain enzymes are made of RNA and not protein. Sumner in 1946 and Altman and Cech in 1989 were awarded Nobel Prizes in chemistry for their contributions. Thus, it seems that biology, like any other branch of science, is replete with exceptions to the rules.
The Swedish scientist Berzelius (1838) named certain naturally occurring polymers proteins. The fact that enzymes are proteins was established by Sumner (1946). Later, Sanger (1958) established that proteins are made up of a sequence of amino acids. The fact that an enzyme and a substrate (or an antibody and an antigen) require a precise complementary fit in their structures, just like a hand in a glove, to interact with each other was established by Linus Pauling in the 1940s. In addition to Sumner (1946), both Pauling (1954) and Sanger (1958) received Nobel Prizes for their work in chemistry. Most proteins have enzymatic functions, but several of them, such as actin and fibrinoactin, are structural components of cells. Proteins are major constituents of muscle, cartilage, and bones. Proteins are also responsible for the mobility of muscle cells. Certain proteins serve as receptors for different molecules or work as immunoglobulins or antigens; others serve as allergens or participate in the transport of various molecules, such as oxygen or sex hormones. Many proteins are hormones, such as insulin or human growth hormone (HGH), which control important metabolic functions in humans and other organisms. The three-dimensional structure and chemical modifications of proteins are important for understanding their functions in these different capacities.
Garrod (1909) first described certain human disorders as inborn errors of metabolism and implied the genetic basis of these diseases. However, it was the genius of Beadle and Tatum (1941) that led to the establishment of the fact that a protein is encoded by a gene. Working with Neurospora, they showed that the synthesis of a substance in a metabolic pathway was impaired in a mutant. They showed that when the gene controlling the enzyme that catalyzes a biochemical reaction in a metabolic pathway was disabled, the mutant developed a nutritional requirement for the product of that reaction. Such mutants could not grow on a minimal medium; growth was possible only when the particular substance was added to the minimal medium. For example, a mutant with impaired synthesis of arginine could not grow on a minimal medium but grew when arginine was added to it. This method was also used to map biochemical pathways.
Beadle and Tatum (1941) called this conceptual scheme the one-gene–one-enzyme hypothesis. This hypothesis has been modified in various ways. However, despite several exceptions to this rule of one gene encoding one enzyme, the main tenets of the one-gene–one-enzyme hypothesis have remained the cornerstone of biology. This concept has been instrumental for the merger of chemistry with genetics and for the development of molecular biology. This theory provides the standard method to assign a function to a protein by creating a mutant and then showing which protein has a defective function or which function has been impaired in a particular protein. Because of this hypothesis, it was possible to analyze and study viral, microbial, plant, and animal genetics. This has been the basis for creating knockout mutations and for in vitro mutagenesis. This hypothesis has proven crucial for the analysis of any basic genetic mechanism, such as DNA replication, repair, and recombination, and for establishing the role of a protein in any metabolic pathway. Finally, this theory by Beadle and Tatum has led to advances in agriculture, animal husbandry, pharmaceutical sciences, and medicine. The one-gene–one-enzyme hypothesis has been the basis for the understanding and alleviation of human diseases and for the development of gene therapy.
The one-gene–one-enzyme hypothesis implied that a mutant must have an altered protein. Beadle and Tatum could not demonstrate the defective nature of the protein in their mutants because the technology was lacking at that time. However, this was demonstrated first at the biochemical level by Mitchell and Lein (1948; Mitchell et al. 1948) and by Yanofsky (1952, 2005a, 2005b) in tryptophan-requiring mutants of Neurospora that lacked tryptophan synthetase, the enzyme responsible for the synthesis of tryptophan. The concept was later demonstrated at the molecular level by Ingram (1957) for the hemoglobin of persons who suffer from sickle cell anemia. Ingram showed that the sixth amino acid, glutamic acid, found in the hemoglobin of a normal person is replaced by valine in the hemoglobin of a person with sickle cell anemia. This single change from glutamic acid to valine is the basis for the blood disorder in such a person. Later, many other mutants were shown to lack a protein altogether or to possess proteins with altered amino acid(s).
The one-gene–one-enzyme theory also implied a correspondence between the ordered positions of nucleotides in a gene and the positions of amino acids in the protein encoded by that gene. This colinearity in the structure of a gene and that of a protein was demonstrated independently by Yanofsky et al. (1964) and by Sarabhai et al. (1964), as discussed later in this chapter.
1.2 Proteome and Proteomics
1.2.1 Proteins as the Cell's Way of Accomplishing Specific Functions
The proteome is defined as the total set of proteins encoded by the genome of an organism. Proteomics is the science of identifying and characterizing the proteome of an organism.
The term “proteome” was first used by Marc Wilkins in 1994 (Wilkins 1996). An earlier effort to describe the total proteins of an organism was made independently by O'Farrell (1975) and by Klose (1975). They developed what is called two-dimensional (2D) gel electrophoresis, in which proteins are separated by electrophoresis in two dimensions at right angles to each other (O'Farrell 1975, Klose 1975). This method separated a complex mixture of more than 1100 proteins of Escherichia coli into distinct spots of individual components on the gel. Later, the science of proteomics was revolutionized by the application of mass spectrometry in conjunction with genomics for the separation and identification of proteins on a large scale.
The genome of an organism is static in the sense that it remains the same in all cell types all the time. In contrast, the proteome of an organism is dynamic, because it differs from one cell type to another and keeps changing even in the same cell type at the different stages of activity or different states of development. A change in the proteome is a reflection of differential activity of the genes dependent on the cell type to express the protein needed for a particular function. For example, blood cells predominantly express the hemoglobin gene to produce the hemoglobin protein required for the transport of oxygen, whereas pancreatic cells largely express the insulin gene, which produces the insulin peptide required for the entry of glucose molecules into cells.
Thus, the differential expression of genes is required for the production of different proteins because each protein controls a distinct function. The functions of many proteins are listed in Table 1.1. In addition, the protein profile of a cell can vary depending on the different kinds of modification of the same protein; such modifications may involve acetylation, phosphorylation, glycosylation, or association with lipid or carbohydrate molecules. These modifications occur as posttranslational events and alter the function of proteins. One example is the mitogen-activated protein (MAP) kinase cascade, which controls cell division: MAP kinase kinase kinase (MAPKKK) phosphorylates and activates MAP kinase kinase (MAPKK), which in turn phosphorylates and activates MAP kinase (MAPK). The role of protein modification in the control of cellular activity is discussed later in this book.
Table 1.1 Function of Different Proteins
Function                            Protein
1. Catalyst                         Enzymes (more than 90% of proteins); catalyze biochemical reactions in the cell
2. Transport                        Hemoglobin (carrier of oxygen); albumin (carrier of hormones)
3. Structure                        Cartilage/bone proteins
4. Cellular skeleton                Actin, fibrinoactin
5. Hormone                          Insulin, growth hormone
6. Antibody                         Immunoglobulins
7. Antigens and allergens           Bacterial and viral proteins
8. Mobility/muscle movement         Myosin
9. Receptors                        Receptor for cholesterol
10. Cell communication/signaling    Transduction proteins, junction proteins
1.2.2 Pregenomic Proteomics
The role of proteins as enzymes in controlling cellular activity was known long before their structure was elucidated. The conceptual breakthrough in deciphering the structure of a protein as a linear array of amino acids came from the enunciation of the one-gene–one-enzyme concept. This conceptual breakthrough was realized through certain technical advances, which included the development of machines for analyzing the amino acid composition of a protein and for determining the sequence of its amino acids. With the help of these machines, the structure of proteins was elucidated one protein at a time for several years. Later, the introduction of 2D gel electrophoresis and mass spectrometry made it possible to resolve the structure of several proteins simultaneously. This mass spectrometry-aided approach advanced further with the coming of genomics and bioinformatics. The methods of genomics deciphered the nucleotide sequence of DNA/genes in the chromosomes of various organisms. The methods of bioinformatics involve the use of computers and software programs for analyzing the bulk nucleotide sequence of an organism's DNA. Bioinformatics is also used for deducing the amino acid sequence of a protein from the sequence of nucleotides in a DNA molecule.
1.3 Genetics of Proteins
A genetic approach to understanding protein structure and function was dictated by the one-gene–one-enzyme hypothesis. This concept implied that the structure and function of a protein could be understood by comparing the protein obtained from wild-type organisms with that obtained from mutant organisms. In practice, this became a routine method for understanding the role of a protein in any metabolic or developmental pathway. Following this dictum, the hemoglobin molecules from normal humans and from sickle cell patients were compared. The hemoglobin of normal individuals was found to differ from that of sickle cell patients at the sixth amino acid. Normal individuals possessed glutamic acid at this position, whereas sickle cell patients possessed valine (Ingram 1956, 1957). Thus, a change in one amino acid completely altered the structure and metabolic role of hemoglobin (Figure 1.1).
Figure 1.1 A comparison of the N-terminal amino acid sequence in the beta chain of hemoglobin of normal and sickle cell patients.
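The wild-type versus mutant comparison described above amounts to aligning two sequences position by position and reporting where they differ. The following Python sketch illustrates this for the first eight residues of the hemoglobin beta chain. Only the substitution at position 6 is stated in the text; the remaining residues are the commonly cited N-terminal sequence and are included here as an assumption for illustration, as is the helper name `substitutions`.

```python
# First eight residues of the hemoglobin beta chain, three-letter codes.
# Assumption: residues other than position 6 come from the commonly cited
# sequence, not from the text above.
NORMAL = ["Val", "His", "Leu", "Thr", "Pro", "Glu", "Glu", "Lys"]
SICKLE = ["Val", "His", "Leu", "Thr", "Pro", "Val", "Glu", "Lys"]


def substitutions(wild_type, mutant):
    """Return (position, wild-type residue, mutant residue) for each difference.

    Positions are 1-based, as is conventional for protein sequences.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [
        (position, wt, mt)
        for position, (wt, mt) in enumerate(zip(wild_type, mutant), start=1)
        if wt != mt
    ]


print(substitutions(NORMAL, SICKLE))  # [(6, 'Glu', 'Val')]
```

A position-by-position comparison of this kind is sufficient only for substitutions; mutants with inserted or deleted residues would require a proper sequence alignment.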
1.3.1 One-Gene–One-Enzyme Theory
