Plant Genes, Genomes and Genetics provides a comprehensive treatment of all aspects of plant gene expression. Unique in explaining the subject from a plant perspective, it highlights the importance of key processes, many first discovered in plants, that impact how plants develop and interact with the environment. This text covers topics ranging from plant genome structure and the key control points in how genes are expressed, to the mechanisms by which proteins are generated and how their activities are controlled and altered by posttranslational modifications.
Written by a highly respected team of specialists in plant biology with extensive experience in teaching at undergraduate and graduate level, this textbook will be invaluable for students and instructors alike. Plant Genes, Genomes and Genetics also includes:
Aimed at upper level undergraduates and graduate students in plant biology, this text is equally suited for advanced agronomy and crop science students inclined to understand molecular aspects of organismal phenomena. It is also an invaluable starting point for professionals entering the field of plant biology.
Page count: 580
Publication year: 2015
Cover
Title Page
Copyright
Acknowledgements
Introduction
About the Companion Website
Part I: Plant Genomes and Genes
Chapter 1: Plant Genetic Material
1.1 DNA is the genetic material of all living organisms, including plants
1.2 The plant cell contains three independent genomes
1.3 A gene is a complete set of instructions for building an RNA molecule
1.4 Genes include coding sequences and regulatory sequences
1.5 Nuclear genome size in plants is variable but the numbers of protein-coding, non-transposable element genes are roughly the same
1.6 Genomic DNA is packaged in chromosomes
1.7 Summary
1.8 Problems
References
Chapter 2: The Shifting Genomic Landscape
2.1 The genomes of individual plants can differ in many ways
2.2 Differences in sequences between plants provide clues about gene function
2.3 SNPs and length mutations in simple sequence repeats are useful tools for genome mapping and marker assisted selection
2.4 Genome size and chromosome number are variable
2.5 Segments of DNA are often duplicated and can recombine
2.6 Some genes are copied nearby in the genome
2.7 Whole genome duplications are common in plants
2.8 Whole genome duplication has many effects on the genome and on gene function
2.9 Summary
2.10 Problems
Further reading
References
Chapter 3: Transposable Elements
3.1 Transposable elements are common in genomes of all organisms
3.2 Retrotransposons are mainly responsible for increases in genome size
3.3 DNA Transposons create small mutations when they insert and excise
3.4 Transposable elements move genes and change their regulation
3.5 How are transposable elements controlled?
3.6 Summary
3.7 Problems
References
Chapter 4: Chromatin, Centromeres and Telomeres
4.1 Chromosomes are made up of chromatin, a complex of DNA and protein
4.2 Telomeres make up the ends of chromosomes
4.3 The chromosome middles – centromeres
4.4 Summary
4.5 Problems
Further reading
References
Chapter 5: Genomes of Organelles
5.1 Plastids and mitochondria are descendants of free-living bacteria
5.2 Organellar genes have been transferred to the nuclear genome
5.3 Organellar genes sometimes include introns
5.4 Organellar mRNA is often edited
5.5 Mitochondrial genomes contain fewer genes than chloroplasts
5.6 Plant mitochondrial genomes are large and undergo frequent recombination
5.7 All plastid genomes in a cell are identical
5.8 Plastid genomes are similar among land plants but contain some structural rearrangements
5.9 Summary
5.10 Problems
Further reading
References
Part II: Transcribing Plant Genes
Chapter 6: RNA
6.1 RNA links components of the central dogma
6.2 Structure provides RNA with unique properties
6.3 RNA has multiple regulatory activities
6.4 Summary
6.5 Problems
References
Chapter 7: The plant RNA Polymerases
7.1 Transcription makes RNA from DNA
7.2 Varying numbers of RNA polymerases in the different kingdoms
7.3 RNA polymerase I transcribes rRNAs
7.4 RNA polymerase III recruitment to upstream and internal promoters
7.5 Plant-specific RNP-IV and RNP-V participate in transcriptional gene silencing
7.6 Organelles have their own set of RNA polymerases
7.7 Summary
7.8 Problems
References
Chapter 8: Making mRNAs – Control of transcription by RNA polymerase II
8.1 RNA polymerase II transcribes protein-coding genes
8.2 The structure of RNA polymerase II reveals how it functions
8.3 The core promoter
8.4 Initiation of transcription
8.5 The mediator complex
8.6 Transcription elongation: the role of RNP-II phosphorylation
8.7 RNP-II pausing and termination
8.8 Transcription re-initiation
8.9 Summary
8.10 Problems
References
Chapter 9: Transcription Factors Interpret cis-regulatory information
9.1 Information on when, where and how much a gene is expressed is codified by the gene's regulatory regions
9.2 Identifying regulatory regions requires the use of reporter genes
9.3 Gene regulatory regions have a modular structure
9.4 Enhancers: Cis-regulatory elements or modules that function at a distance
9.5 Transcription factors interpret the gene regulatory code
9.6 Transcription factors can be classified in families
9.7 How transcription factors bind DNA
9.8 Modular structure of transcription factors
9.9 Organization of transcription factors into gene regulatory grids and networks
9.10 Summary
9.11 Problems
References
Chapter 10: Control of Transcription Factor Activity
10.1 Transcription factor phosphorylation
10.2 Protein–protein interactions
10.3 Preventing transcription factors from access to the nucleus
10.4 Movement of transcription factors between cells
10.5 Summary
10.6 Problems
References
Chapter 11: Small RNAs
11.1 The phenomenon of cosuppression or gene silencing
11.2 Discovery of small RNAs
11.3 Pathways for miRNA formation and function
11.4 Plant siRNAs originate from different types of double-stranded RNAs
11.5 Intercellular and systemic movement of small RNAs
11.6 Role of miRNAs in plant physiology and development
11.7 Summary
11.8 Problems
References
Chapter 12: Chromatin and Gene Expression
12.1 Packing long DNA molecules in a small space: the function of chromatin
12.2 Heterochromatin and euchromatin
12.3 Histone modifications
12.4 Histone modifications affect gene expression
12.5 Introducing and removing histone marks: writers and erasers
12.6 ‘Readers’ recognize histone modifications
12.7 Nucleosome positioning
12.8 DNA methylation
12.9 RNA-directed DNA methylation
12.10 Control of flowering by histone modifications
12.11 Summary
12.12 Problems
References
Part III: From RNA to Proteins
Chapter 13: RNA Processing and Transport
13.1 RNA processing can be thought of as steps
13.2 RNA capping provides a distinctive 5′ end to mRNAs
13.3 Transcription termination consists of mRNA 3′-end formation and polyadenylation
13.4 RNA splicing is another major source of genetic variation
13.5 Export of mRNA from the nucleus is a gateway for regulating which mRNAs actually get translated
13.6 Summary
13.7 Problems
References
Chapter 14: Fate of RNA
14.1 Regulation of RNA continues upon export from nucleus
14.2 Mechanisms for RNA turnover
14.3 RNA surveillance mechanisms
14.4 RNA sorting
14.5 RNA movement
14.6 Summary
14.7 Problems
Further reading
References
Chapter 15: Translation of RNA
15.1 Translation: a key aspect of gene expression
15.2 Initiation
15.3 Elongation
15.4 Termination
15.5 Tools for studying the regulation of translation
15.6 Specific translational control mechanisms
15.7 Summary
15.8 Problems
Further reading
References
Chapter 16: Protein Folding and Transport
16.1 The pathway to a protein's function is a complicated matter
16.2 Protein folding and assembly
16.3 Protein targeting
16.4 Co-translational targeting
16.5 Post-translational targeting
16.6 Post-translational modifications regulating function
16.7 Summary
16.8 Problems
Further reading
References
Chapter 17: Protein Degradation
17.1 Two sides of gene expression – synthesis and degradation
17.2 Autophagy, senescence and programmed cell death
17.3 Protein-tagging mechanisms
17.4 The ubiquitin proteasome system rivals gene transcription
17.5 Summary
17.6 Problems
Further reading
Reference
Index
End User License Agreement
Erich Grotewold
The Ohio State University, USA
Joseph Chappell
University of Kentucky, USA
Elizabeth A. Kellogg
Donald Danforth Plant Science Center, St. Louis, Missouri, USA
This edition first published 2015 © 2015 by John Wiley & Sons, Ltd
Registered office: John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial offices: 9600 Garsington Road, Oxford, OX4 2DQ, UK
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
111 River Street, Hoboken, NJ 07030-5774, USA
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell.
The right of the author to be identified as the author of this work has been asserted in accordance with the UK Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author(s) have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Grotewold, Erich, author.
Plant genes, genomes, and genetics / Erich Grotewold, Joseph Chappell, Elizabeth Kellogg.
pages cm
Includes bibliographical references and index.
ISBN 978-1-119-99888-4 (cloth)— ISBN 978-1-119-99887-7 (pbk.) 1. Plant molecular genetics. 2. Plant gene expression.
3. Genomics. I. Chappell, Joseph, author. II. Kellogg, Elizabeth Anne, author. III. Title.
[DNLM: 1. Plants– genetics. 2. Genomics. 3. Plant Physiological Phenomena. 4. RNA, Plant– genetics. QK 981]
QK981.4.G76 2015
572.8′ – dc23
2014028955
A catalogue record for this book is available from the British Library.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Cover illustration by Debbie Maizels
In writing this book, we have enormously benefited from the advice and valuable comments from many colleagues, to whom we are truly indebted for their multiple suggestions and contributions. These colleagues include the following: Biao Ding (The Ohio State University, USA); Sherry Flint-Garcia (USDA-ARS, Columbia, Missouri, USA); Irene Gentzel (The Ohio State University, USA); Venkat Gopalan (The Ohio State University, USA); Art Hunt (University of Kentucky, USA); Rebecca Lamb (The Ohio State University, USA); Pal Maliga (Rutgers University, USA); Michael McMullen (USDA-ARS, Columbia, Missouri, USA); Craig Pikaard (Indiana University, USA); Mark Rausher (Duke University, USA); Keith Slotkin (The Ohio State University, USA); Jan Smalle (University of Kentucky, USA); David Somers (The Ohio State University, USA); Dan Voytas (University of Minnesota, USA); and Ling Yuan (University of Kentucky, USA). We also thank the anonymous reviewers who contributed with their experience in the classroom to improve the overall utility of this book for students and professors.
We also want to especially thank the current and past members of our research groups for their many important contributions to the growth of our knowledge of genes, genomes and genetics. Without their support, this book would not have been possible. Last but not least, we want to thank the funding agencies, particularly the National Science Foundation, the US Department of Agriculture, and the National Institutes of Health, for their continuous support of the research conducted in our laboratories.
One goal of this book is to highlight the aspects of molecular biology that are unique to plants, and that represent mechanisms that cannot be understood simply by studying animals, yeast or bacteria. We therefore need to spend some time discussing what we mean by the word “plant”, which, perhaps surprisingly, does not have a simple or universally accepted definition.
When most people think of a plant, they picture a tomato plant, a petunia, or corn. A scientist might think of Arabidopsis thaliana, the tiny weed that has been domesticated by molecular biologists. All these are examples of flowering plants (angiosperms), which are the dominant forms of land plants on Earth today. The flowering plants represent a large group that originated in the early Cretaceous (∼140 million years ago, although the exact date is subject to much current debate); the group has subsequently diversified to produce most trees, shrubs, and herbs. The flowering plants include more than 300 000 species; only a few thousand are cultivated, and surprisingly, only a few of these, fewer than twenty, produce the vast majority of the food for all of humanity.
The term “plant” is often used to mean “land plant”, a much larger group that includes the flowering plants, but also the gymnosperms, ferns, lycophytes, mosses, hornworts and liverworts. This large group is monophyletic, a term that refers to all being descendants of a common ancestor, and is often called the Embryophytes because all members produce embryos retained on the parent plant. A phylogeny of the Embryophyta is presented in Figure 1, which is assembled on the basis of the main characteristics that define the major groups of plants. Clades (or groups) within the land plants include the seed plants (flowering plants plus gymnosperms, distinguished by how they bear their seeds) and other vascular plants [ferns (pteridophytes) and lycophytes], in which the diploid sporophyte forms on the independent gametophyte, and dispersal occurs via spores. In contrast, the non-vascular plants (mosses, hornworts and liverworts) are distinguished not only by the absence of phloem and xylem vessels, but by having a dominant gametophytic (haploid) stage of life and only a short-lived sporophytic (diploid) stage.
Figure 1 Phylogeny of organisms that originated with the primary endosymbiosis, in which a eukaryote acquired a symbiotic cyanobacterium. Superimposed on the phylogeny is a Venn diagram of major groups. While the green plants (Viridiplantae) and streptophytes are sometimes called “plants”, in this book we will use the term “plant” to refer to the land plants, the group shaded in green. Subgroups within the land plants are also indicated.
Another possible definition of “plant” is the group known as the Streptophytes, which includes the land plants plus their immediate relatives, Chara and Coleochaete (both formerly considered green algae). The Streptophytes all share a peculiar method of cell division, the phragmoplast, and a unique structure of proteins to make cellulose (the sugar polymer that is the primary component of plant cell walls), the cellulose rosettes.
A third definition of “plant” corresponds to organisms that have chloroplasts and make chlorophyll a and b. These are known as the Viridiplantae (Latin for green plants). This group includes the Streptophytes (i.e., land plants plus Coleochaete and Chara) plus all the green algae. The latter group includes the well-studied single cell organism Chlamydomonas.
Finally, a fourth (and uncommon) definition of “plant” includes all organisms with chloroplasts that are the result of a primary endosymbiosis, that is, organisms that acquired their chloroplasts by directly acquiring a cyanobacterium (see Chapter 5). Members of this group are Viridiplantae, the red algae (Rhodophyta) and the glaucophytes. Some data suggest that the primary endosymbiosis occurred only once, in the common ancestor of the Viridiplantae, red algae and glaucophytes. Evidence is accumulating to suggest that indeed Viridiplantae, red algae and glaucophytes are all part of a monophyletic group, which is sometimes called the Archaeplastida. However, each primary endosymbiotic event could be independent, with the capture of a cyanobacterium occurring independently several times. In either case, origin of plastids from cyanobacteria has been extremely rare in the history of life.
The plastid-bearing organisms diverged from other eukaryote lineages, including animals plus fungi, at least 1 billion years ago (Knoll, 2003). Given this enormously long period of evolution, it is remarkable that there are any similarities at all in the cellular apparatus between animals (e.g., you), fungi (e.g., yeast), and any plants. There are many similarities of course, but we suggest here that they need to be demonstrated, not assumed. In other words, the fact that the transcriptional machinery is similar between animals and yeast does not necessarily mean that it will also be similar in plants. In addition, the term similarity does not mean identity. Processes in common could have arisen because of convergent forces, and really the metric for similarity has become conservation in the DNA encoding these functions.
In the past, the term “plant” was sometimes applied to all photosynthetic organisms. However, such a broad use of the term is now rejected. Many organisms that are able to undergo photosynthesis have gained that ability by acquiring a red alga along with its plastid. In other words, the plastid is a symbiont in the red alga and the red alga is the symbiont in another (previously non-photosynthetic) organism. Such symbioses are known as secondary endosymbioses to distinguish them from the primary endosymbioses of the Archaeplastida. In organisms with a secondary endosymbiont, the structure of the membranes around the symbiont shows that it was once a separate organism that was picked up by its host. Organisms with secondary endosymbioses include the Stramenopiles, the group that includes the brown algae (e.g., Fucus, a common seaweed) and golden brown algae (which occur mostly in freshwater), the dinoflagellates, and the kinetoplastids (e.g., Euglena, trypanosomes, and the apicomplexans, which include the organisms that cause malaria). Each of these groups is as different from plants as animals are, and as different from animals as plants are. In these organisms in particular one might expect to find novel genes, proteins and cellular mechanisms. If the term “plant” were applied to all photosynthetic organisms, the ones with the secondary endosymbioses are so diverse and so totally unrelated (other than all being eukaryotes) that the term would be effectively meaningless.
In summary, the term plant is used to apply to many sets of organisms, the smallest of which is the land plants and the largest is all photosynthetic organisms. Most commonly, however, “plant” refers either to the entire green plant lineage (Viridiplantae), or to the land plants. In common parlance its use is even more restricted to refer informally to flowering plants. In this textbook we will use the term to refer to land plants. Most of the data we present come from flowering plants, so in most cases, the reader can assume that we are extrapolating, generally without evidence, from the flowering plants to the gymnosperms, ferns, lycophytes, mosses, liverworts and hornworts. If we have data from species outside the land plants, we will cite that explicitly.
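The nested definitions summarized above can be sketched as sets, each one containing the previous one. This is only an illustrative toy model: the group names come from the text, but the variable and function names are hypothetical, and each group is reduced to the few members named in this introduction.

```python
# Toy model of the nested definitions of "plant" (illustrative only).
# Membership is limited to the taxa named in the introduction.
LAND_PLANTS = frozenset({
    "flowering plants", "gymnosperms", "ferns", "lycophytes",
    "mosses", "hornworts", "liverworts",
})
# Streptophytes: land plants plus their immediate relatives.
STREPTOPHYTES = LAND_PLANTS | {"Chara", "Coleochaete"}
# Viridiplantae: Streptophytes plus all the green algae.
VIRIDIPLANTAE = STREPTOPHYTES | {"green algae"}
# Archaeplastida: all lineages with a primary endosymbiosis.
ARCHAEPLASTIDA = VIRIDIPLANTAE | {"red algae", "glaucophytes"}

# Ordered from the narrowest definition to the broadest.
DEFINITIONS = [
    ("land plants", LAND_PLANTS),
    ("Streptophytes", STREPTOPHYTES),
    ("Viridiplantae", VIRIDIPLANTAE),
    ("Archaeplastida", ARCHAEPLASTIDA),
]


def narrowest_definition(taxon):
    """Return the name of the smallest listed group containing the taxon, or None."""
    for name, members in DEFINITIONS:
        if taxon in members:
            return name
    return None


def is_plant_in_this_book(taxon):
    """The book uses "plant" to mean land plant."""
    return taxon in LAND_PLANTS
```

Under this sketch, `narrowest_definition("Chara")` returns `"Streptophytes"`, while an organism with a secondary endosymbiosis, such as a brown alga, falls outside every listed group and returns `None`.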
The processes described in this book can in theory occur in any cell in the plant. However, some familiarity with basic plant morphology is assumed. Plant growth occurs from dedicated sets of stem cells, known as meristems. These are active throughout the life of the plant, so that development is continuous and modular. This is quite different from the situation in animals, in which the entire organism develops in a coordinated fashion and then ceases development entirely at maturity. If a human were to grow like a plant, the fingers, toes and the top of the head might keep growing throughout the life of the human.
Meristems are organized during embryonic development. In the seed plants these initially consist of two clusters of cells, the shoot apical meristem and the root apical meristem, at opposite ends of the plant. These are the basis of the bipolar embryo, which is only found in the seed plants. Meristems in non-seed bearing vascular plants (ferns, lycophytes) consist of only a few cells, and the root apical meristem in particular develops late and on one side of the embryonic axis.
A flowering plant has an obvious above-ground component, the shoot, and a below-ground component, generally the root (Figure 2). The apical meristem of the shoot produces leaves on its flanks. In the axil of each leaf, another meristem forms, the axillary meristem; this meristem is often dormant for a while but may grow out to form a branch. The root apical meristem forms the primary root. Lateral roots are not formed from the apical meristem, but rather are formed from meristems that arise de novo just outside the vascular tissue. In most eudicots, the primary root persists and forms a prominent below-ground structure (think of a carrot or a dandelion root), whereas in most monocots, the primary root only lives for a few months and is replaced by roots forming from the very base of the shoot, near ground level (think of onion or grass roots). The vascular tissue connects all parts of the plant, transporting water, nutrients and some hormones up from the roots into the leaves and meristems of the shoot. At the same time, carbohydrates and other hormones are transported both up and down from the leaves.
Figure 2 Structure of a flowering plant, indicating major organs, tissues and cell types. (a) Adapted from Taiz and Zeiger (1991). Reproduced with permission. (b) http://tpsbiology11student.wikispaces.com/Plants+-+Anatomy,+Growth,+and+Function. (c) Adapted from Taiz and Zeiger (1991). Reproduced with permission. (d) http://turfgrass.cas.psu.edu/education/turgeon/Modules/10_AnnualBluegrass/Annual_Bluegrass_Module/1.71%20root%20cross%20section.html. Reproduced with permission of A. J. Turgeon.
The basic tissues of the plant are obvious in cross sections of a leaf and a root (Figure 2). Unlike animals, which have an elaborate set of tissue types, plants have only three basic sorts of tissue: the epidermis, which covers all parts of the plant; the vascular tissue; and the ground tissue, which includes everything else. The epidermis of a leaf is generally made up of flat translucent cells and is covered with a waxy layer, the cuticle, which prevents drying. Within the epidermis are specialized holes known as stomata (literally “mouths”) that permit entry of CO2 for photosynthesis and escape of O2, the by-product of photosynthesis. The stomata also permit the escape of water vapor. As water escapes, it creates a gradient of water pressure that pulls water up through the vascular system from the roots and hydrates all the cells in the plant. If water is limited, however, the stomata close and prevent drying of the tissues. Stomatal opening and closing is caused by changes in the turgor pressure of the guard cells, which sit on either side of the opening. In addition, the leaf epidermis can also have hairs (also known as trichomes) and glands. Depending on the plant species, these can be unicellular (as in Arabidopsis) or multicellular (e.g., the glands that accumulate peppermint oil in Mentha piperita).
The epidermis of a root includes some cells that will develop long projections known as root hairs. Root hairs are thin-walled, and are central to the uptake of water and nutrients from the soil. In addition, they are the site of interaction with soil bacteria, such as Rhizobia, which interact and form symbioses with some species of plants. Cells that will form root hairs alternate with non-root-hair cells. The pattern of root-hair and non-root-hair cells varies between species but is quite stereotyped in the model plant Arabidopsis where the controls of patterning have been studied extensively. As the root pushes through the soil and grows in diameter, the epidermis is sloughed off and replaced by cells from inner layers of the root. Because of this process, root hairs are only present right behind the root apical meristem and are lost in older roots.
Vascular tissue is arranged in bundles of conducting cells that extend throughout the plant. The water conducting tissue is xylem, which consists mostly of cells that are dead at maturity; in the vascular bundle of a leaf, the xylem is generally on the top (adaxial) side. Because the water is pulled up the plant following a pressure gradient from the roots to the shoots, the xylem cell walls must be strong enough to withstand the tension on the water column and hence are generally lignified at maturity. The carbohydrate transport tissue is phloem; phloem cells are alive at maturity and in a leaf are generally found on the bottom (abaxial) side of the vascular bundle. Unlike water, which is pulled up the plant under tension, the phloem sap is pushed around the plant under pressure. Because many molecules are dissolved in phloem sap, it generally is hyperosmotic to the surrounding tissues and takes up water.
The cells inside the epidermis and outside the vascular tissue are part of the ground tissue. Depending on the organ and the stage of development, these cells may vary considerably throughout the plant. An example is shown in Figure 2 in the cross section of the leaf, where the ground tissue is known as mesophyll. (The word mesophyll is simply Greek for “middle of the leaf”; meso = middle and phyll = leaf.) In many angiosperm leaves, the upper mesophyll cells form long, closely packed rectangles; because of their appearance in cross section they are known as the palisade layer. The lower mesophyll cells, in contrast, are less tightly packed and more isodiametric and are known as the spongy mesophyll. The cells of the spongy mesophyll cease cell division before those of the palisade, and are pulled apart as the leaf expands, creating air spaces between them.
In the root, the vascular tissue forms a solid cylinder in the center. It is surrounded by a ring of cells, the endodermis, which regulates the flow of water into and out of the vasculature. In most roots, water can enter through the cytoplasm of cells such as root hairs, or can enter the gaps between the cells. It then flows through or around the cells in the cortex. The pathway through cells is known as the symplastic pathway, whereas the pathway around cells is known as the apoplastic pathway. When the water reaches the endodermis, however, the water (and any ions or other substances) must go through the cells, that is, it is forced into a symplastic pathway. The endodermal cells are held in a tight ring by a layer of suberin, the Casparian strip, which prevents water or anything else from going around the cells; in other words, the Casparian strip blocks the apoplastic pathway. By controlling transporters and the osmotic force inside the cells, the endodermis thus controls which substances move in and out.
Individual plant cells have many of the same structures as other eukaryotic cells (Figure 2). Like all living cells (including Bacteria and Archaea), the plant cell is surrounded by a plasma membrane, uses DNA as its genetic material, and synthesizes proteins with ribosomes. Like all other Eukarya, the plant cell has a nucleus with a nuclear membrane that is contiguous with the endoplasmic reticulum, and has mitochondria, peroxisomes, and Golgi apparatus. The endoplasmic reticulum may be smooth or rough depending on whether ribosomes are attached to it. The cytoskeleton is made up of microfilaments (formed by the protein actin), intermediate filaments, and microtubules (formed by the protein tubulin).
Other structures are not shared with animals, although they may occur in other eukaryotes. Unlike animal cells, the plant cell is enclosed in a wall made up largely of cellulose. The wall is often penetrated by specialized tunnels known as plasmodesmata; the plasma membrane is continuous through these tunnels and the diameter of the plasmodesmata is tightly regulated by proteins that reside in the membrane. The plant cell contains chloroplasts, descendants of symbiotic bacteria, which are the site of photosynthesis. In addition, many plant cells have a prominent vacuole surrounded by its own membrane, known as the tonoplast. In mesophyll cells the vacuole often fills so much of the center of the cell that the cytoplasm and organelles are pressed to the edges against the plasma membrane. Compared with such cells, the vacuole drawn in Figure 2 is unusually small.
This very brief introduction to plant structure and plant evolutionary history should provide a foundation for the rest of this book. In the chapters that follow, we will provide a view of biology focusing on aspects that characterize the many species of land plants. You have probably already read biology textbooks that focus on humans as a representative mammal, but recall that there are only about 5000 species of mammals, and among them humans have strikingly low genetic diversity. In contrast, there may be 70–100 times as many species of land plants, although exact numbers are unknown. Plants dominate our environment, provide food, clothing and shelter, and create the air we breathe. Without land plants, there would be no land animals and certainly no humans. Plants thus support all of life on land; here we present their genes, genomes, and genetics.
Knoll, A.H. (2003) Life on a Young Planet: The First Three Billion Years, Princeton University Press.
Like all living organisms, plants use deoxyribonucleic acid (DNA) as their genetic material. DNA is a polymer that consists of alternating sugars and phosphates with nitrogenous bases attached to the sugar moiety. More specifically, the nucleotide building block of DNA is a deoxyribose sugar with a phosphate group attached to carbon 5 (C-5) and a nitrogenous base to carbon 1 (C-1). Phosphodiester bonds connect the C-5 phosphate group of one nucleotide to the carbon 3 (C-3) of another, creating the alternating sugar–phosphate backbone of the DNA molecule. This means that one end of the chain is terminated by a C-5 phosphate, and is known as the 5′ end, whereas the other end is terminated by a C-3 hydroxyl, and is known as the 3′ end (Figure 1.1a). The idea that DNA molecules have a polarity is one that will be revisited over and over throughout this book.
Figure 1.1 Structure of the DNA molecule. (a) Alternating phosphate and deoxyribose groups make up the backbone of the DNA strand. The C-3 hydroxyl at the 3′ end is required for attaching new nucleotides to the chain, so the DNA strand is always extended from the 3′ end. The nitrogenous bases are attached to C-1 of the deoxyribose. (b) The double helix, formed from two antiparallel strands of DNA. (c) The four nitrogenous bases and the hydrogen bonds between them. (a) and (c) Reece et al. (2011). (b) Adapted from Raven et al. (2011)
Only four nitrogenous bases are used in a DNA molecule. Two of these, cytosine (C) and thymine (T), have a single aromatic ring consisting of four carbons and two nitrogens, and are classified as pyrimidines. The other two, adenine (A) and guanine (G), each have a double ring consisting of a pyrimidine ring fused to a 5-membered, heterocyclic ring, and are classified as purines. The nucleotides are joined into a linear molecule, a strand of DNA, whose nitrogenous bases interact with those on the other strand.
Two DNA polymers, or strands, together form the iconic double helix, a structure like a twisted ladder that has come to symbolize life and its historical continuity (Figure 1.1b). Even viruses, many of which have genomes of single-stranded nucleic acids, must eventually pass through a double-stranded stage to reproduce. The strands are held together by hydrogen bonds (H-bonds) between the nitrogenous bases, with two bonds between A and T, and three between G and C (Figure 1.1c). Since more H-bonds between bases hold them together more tightly, it is significantly easier to denature a DNA molecule with many A-T base pairs than one with many C-G base pairs. The pairing rules for DNA are largely inflexible: A forms H-bonds with T and G with C. The strands are arranged in antiparallel fashion, so that the 5′ end base of one strand pairs with the 3′ end base of the other, and vice versa.
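The pairing and antiparallel rules just described can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the sketch assumes a sequence containing only the four standard bases.

```python
# Watson-Crick pairing rules from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(strand: str) -> str:
    """Return the partner strand, written in the 5'->3' direction.

    Because the strands are antiparallel, the partner is read in the
    opposite direction, so the input is reversed before complementing.
    """
    return "".join(PAIRS[base] for base in reversed(strand.upper()))

def hydrogen_bonds(strand: str) -> int:
    """Count H-bonds in the duplex: 2 per A-T pair, 3 per G-C pair."""
    bonds_per_base = {"A": 2, "T": 2, "G": 3, "C": 3}
    return sum(bonds_per_base[base] for base in strand.upper())

top = "ATGGCA"                      # written 5'->3'
bottom = reverse_complement(top)    # partner strand, also written 5'->3'
print(top, bottom, hydrogen_bonds(top))
```

Applying `reverse_complement` twice returns the original strand, which is a quick check that the pairing table is symmetric.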
The structure of DNA is not unique to plants, but rather is shared among all three domains of life (Eukarya, Bacteria, and Archaea), as well as by viruses. The patterns of covalent bonds and H-bonds can thus be studied in any organism, and indeed much of what we know about DNA structure was originally worked out in bacteria, which are unicellular and prokaryotic (lacking a nucleus).
The four bases (A, C, G and T) are not present in equal amounts and can vary between genomes, parts of genomes, and species. For example, the A+T content of the chloroplast genome, an organellar genome discussed later in Chapter 5, is variable, but generally greater than 50% of the total. In contrast, nuclear genes of many grasses are enriched in G+C, a bias that is particularly noticeable in maize.
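Base composition of the kind discussed here is easy to compute from a sequence. The following Python sketch is illustrative only: the function names and the window sizes are assumptions, and characters other than A, C, G and T are simply ignored.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G+C among the A, C, G, T bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)

def windowed_gc(seq: str, window: int, step: int) -> list:
    """G+C fraction in successive windows, to compare parts of a genome."""
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]

# A toy sequence whose first half is A+T rich and second half G+C rich.
toy = "AATTGGCC"
print(gc_fraction(toy), windowed_gc(toy, window=4, step=4))
```

The A+T content is one minus the value returned by `gc_fraction`, so the same function serves for both kinds of bias mentioned above.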
In plants, the nucleotide bases may be modified by attachment of methyl (–CH3) groups to particular sites. A common position for DNA methylation is on C-5 of C (Figure 1.2), although adenine methylation is also possible, particularly in bacteria, Archaea, and unicellular eukaryotes. This common modification of the DNA is known to affect transcription, and will be discussed in more detail in Chapter 12. While methylation is also common in mammals, it is relatively rare in yeast and in the fruit fly (Drosophila melanogaster). Other insects, however, have extensive DNA methylation, as is the case of honeybees. Many aspects of biotechnology exploit the basic structure of DNA, as described in the box “Working with DNA.”
Figure 1.2 Structures of cytosine and 5-methylcytosine. http://en.wikipedia.org/wiki/File:Cytosine_chemical_structure.svg. By Engineer gena (Own work) [Public domain], via Wikimedia Commons
The polymerase chain reaction (PCR), a method used extensively in biotechnology for generating large numbers of similar or identical DNA fragments, relies on repeatedly increasing the temperature to separate the DNA strands and then decreasing it to allow primers to bind. It is thus important to know the temperature at which the strands of particular DNA molecules separate; this is known as the melting temperature, or Tm, and corresponds to the temperature at which half the DNA molecules are single-stranded (melted) and half are double-stranded. The Tm is controlled by several factors, but a major one is the fraction of G-C base pairs. Because G-C pairs are held together by three H-bonds (rather than two as in A-T pairs), breaking them requires more energy input, such as higher temperatures. A rough equation for the Tm of a short (<20 base pairs, bp) strand of DNA is:
Tm = 2(A+T) + 4(G+C)
where Tm is the melting temperature in degrees Centigrade, A+T is the total number of A-T base pairs and G+C is the total number of G-C base pairs.
This rough equation makes two assumptions. The first is that one strand of the DNA is bound to a membrane, as it would be for a Southern blot, and that the blot is being probed with a short oligonucleotide (a single-stranded DNA molecule, generally <100 nucleotides long; the Greek prefix oligo- means “few”). With one strand immobilized, the DNA melts at a lower temperature (about 8°C less) than it would in solution. The second assumption is that the concentration of salt (e.g., NaCl) is 0.9 M, and that there is no chemical in the solution that would interfere with the formation of H-bonds between the bases (such as formamide, HCONH2). The melting temperature of DNA increases with the log10 of the concentration of salt. This means that the higher the salt concentration, the more stable the DNA duplex. The Tm also decreases linearly with the concentration of formamide or other similar small molecules that interfere with DNA H-bond formation. Thus, a more complete equation is:
Tm = 81.5 + 16.6 log10(M) + 41(XG + XC) − 500/L − 0.62F
where M is the molar concentration of cations (Na+ in this case), XG and XC are the mole fractions of G and C in the DNA, L is the length of the DNA (actually just the shortest strand of DNA in the mix), and F is the concentration of formamide (in percent).
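The two melting-temperature estimates can be sketched in Python as follows. The coefficients used (2 and 4 for the short-oligonucleotide rule; 81.5, 16.6, 41, 500 and 0.62 for the fuller expression) are the commonly quoted values and are assumed here, with formamide given in percent. The function names are illustrative.

```python
import math

def tm_short_oligo(seq: str) -> float:
    """Rough Tm (deg C) for a short oligonucleotide: 2 per A/T, 4 per G/C."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc

def tm_salt_adjusted(seq: str, cation_molar: float,
                     formamide_percent: float = 0.0) -> float:
    """Fuller Tm estimate (deg C) with salt, length and formamide terms.

    Assumed form: 81.5 + 16.6*log10(M) + 41*(XG + XC) - 500/L - 0.62*F
    """
    s = seq.upper()
    length = len(s)
    if length == 0:
        raise ValueError("sequence must not be empty")
    gc_mole_fraction = (s.count("G") + s.count("C")) / length
    return (81.5
            + 16.6 * math.log10(cation_molar)
            + 41.0 * gc_mole_fraction
            - 500.0 / length
            - 0.62 * formamide_percent)

probe = "ATGCATGCATGCATGC"   # hypothetical 16-base probe, 50% G+C
print(tm_short_oligo(probe))
print(tm_salt_adjusted("AT" * 25 + "GC" * 25, cation_molar=0.9))
```

Raising the salt concentration raises the second estimate, and adding formamide lowers it, which matches the qualitative behavior described in the text.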
The strong negative charge of DNA created by the phosphate backbone (on the outside of the double helix, Figure 1.1) can also be used to sort DNA molecules by size using gel electrophoresis, one of the most common tools in molecular biology. DNA is placed in a well at one end of a gel matrix (commonly an agarose or polyacrylamide gel), and an electrical current is run through the gel. The negatively charged DNA will thus migrate toward the positive electrode, and because the gel acts as a molecular sieve, the rate of DNA migration is dependent upon its size (Figure 1.3a). Smaller DNA fragments run faster than larger ones; graphing the log10 of the molecular weight or fragment length against the distance traveled produces a line (Figure 1.3b).
Figure 1.3 Gel electrophoresis, a powerful tool for biology. (a) DNA, suspended in an aqueous solution, is taken from a tube and placed in slots in a gel. An electrical current is applied to the gel. Because DNA is negatively charged, it moves toward the positive electrode, with smaller fragments moving more rapidly than larger ones. (b) Table and graph of the size of DNA fragments versus the distance migrated on a representative gel. Note that this relationship is not linear, so that the vertical axis of the graph is logarithmic. (b) Adapted from http://depts.noctrl.edu/biology/resource/handbook.htm
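The log-linear relationship between fragment size and migration distance is what allows the size of an unknown fragment to be read from a gel. A minimal Python sketch follows, fitting a straight line to log10(size) versus distance for a set of size standards. The ladder values below are hypothetical, chosen only to make the arithmetic easy to follow, and the function names are illustrative.

```python
import math

def fit_ladder(sizes_bp, distances_mm):
    """Least-squares line through (distance, log10(size)) for the standards.

    Returns (slope, intercept) such that log10(size) = slope*distance + intercept.
    """
    logs = [math.log10(size) for size in sizes_bp]
    n = len(logs)
    mean_x = sum(distances_mm) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_size(distance_mm, slope, intercept):
    """Estimate fragment size (bp) from how far a band migrated."""
    return 10 ** (slope * distance_mm + intercept)

# Hypothetical standards: larger fragments migrate shorter distances.
ladder_sizes = [10000, 1000, 100]
ladder_distances = [10.0, 30.0, 50.0]
slope, intercept = fit_ladder(ladder_sizes, ladder_distances)
print(estimate_size(40.0, slope, intercept))
```

The slope is negative because smaller fragments travel farther. Real gels deviate from a straight line for very large or very small fragments, so estimates are best made within the range of the standards.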
Certain chemicals bind to DNA and fluoresce when illuminated with an appropriate light source. For example, ethidium bromide has been widely used because it will intercalate into the double helix of the DNA; once there it will fluoresce under UV light. Thus, a common method of locating DNA on an agarose gel is to soak the gel in ethidium bromide and then place it on a UV light source. The DNA will then appear as pinkish bands (Figure 1.4a). Unfortunately, ethidium bromide will intercalate into the DNA of anything, including that of the biologist working with it. Because it can absorb light energy, it can damage the DNA; it is thus mutagenic. Its use is becoming less common because of its toxicity.
Figure 1.4 Chemicals that can be used to make DNA visible. (a) Ethidium bromide: (i) structure of ethidium bromide; (ii) an agarose gel stained with ethidium bromide and viewed with UV light; and (iii) ethidium bromide inserted into a DNA molecule. (ai) http://www.sigmaaldrich.com, (aii) http://www.hamiltoncompany.com/products/syringes/c/893/. Reproduced with permission from Hamilton Company, (aiii) http://en.wikipedia.org/wiki/File:DNA_intercalation2.jpg. Image created by Karol Langner, via Wikimedia Commons. (b) Propidium iodide, from http://www.sigmaaldrich.com
Another fluorescent dye is propidium iodide, which, like ethidium bromide, intercalates into the DNA double helix (Figure 1.4b). Propidium iodide binds to DNA in a quantitative way, with one propidium iodide molecule per 4 or 5 bp; thus more DNA equals more propidium iodide binding, which equals more fluorescence. This direct relationship is used to estimate the genome size, which is the amount of DNA in the nucleus of a cell. Estimates of genome size using propidium iodide fluorescence are relatively rapid, so we have data on the genome size of many organisms, including flowering plants.
Methylated cytosines can be detected by a variety of methods. Restriction enzymes will cut DNA at particular sequences, but some will not cut DNA at 5-methylcytosine, so that the methyl groups effectively protect the DNA from cleavage at these sites. Such methylation-sensitive restriction enzymes often have the same cut site sequence as methylation-insensitive enzymes. The restriction enzymes can thus be used sequentially. A restriction site that is cut with the methylation-insensitive enzyme and not with the methylation-sensitive one must be methylated. More recently, methods have been developed to find all the methylated cytosines in a genome. One such method is bisulfite sequencing, in which a genome is treated with sodium bisulfite, which converts non-methylated cytosines to uracils (Figure 1.5a). When the DNA is amplified by PCR the uracils become thymines. By comparing the sequence of the genome (or genomic region) before and after the bisulfite treatment, the cytosines that were methylated can be identified. The methylated cytosines are not affected by the bisulfite treatment, and thus remain the same (Figure 1.5b).
Figure 1.5 Bisulfite sequencing. (a) Chemical reactions necessary to remove the amino group from cytosine and convert it to uracil. This reaction will not occur if the cytosine is methylated. http://www.methylogix.com/genetics/bisulfite.shtml.htm. (b) Inferring the presence of 5-methylcytosine in a sequence of DNA. The DNA is treated with bisulfite and all non-methylated cytosines (C) are converted to uracil (U) (top line of sequences) whereas methylated cytosines are unchanged (bottom line of sequences). DNA is then amplified by PCR, which converts U to thymine (T), and the resulting sequence compared with the original. Any base that is C in the original but T in the amplified product is inferred to be non-methylated; conversely a base that is C in both the original and the amplified product must have been methylated. Redrawn from Hayatsu et al. (2008). Reproduced with permission of John Wiley & Sons, Inc.
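The inference described in the caption can be written as a short Python sketch. It considers a single strand only and assumes complete conversion and error-free sequencing, which real experiments do not guarantee. The function names and the example sequence are hypothetical.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Simulate bisulfite treatment followed by PCR on one strand.

    Non-methylated C is converted to U and then read as T after PCR;
    methylated C (positions given as 0-based indices) is left unchanged.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)

def infer_methylated(original: str, converted: str) -> list:
    """Positions that are C both before and after treatment were methylated."""
    if len(original) != len(converted):
        raise ValueError("sequences must be the same length")
    return [
        index
        for index, (before, after) in enumerate(zip(original.upper(), converted.upper()))
        if before == "C" and after == "C"
    ]

original = "ACGTCCGA"
treated = bisulfite_convert(original, methylated_positions={4})
print(treated, infer_methylated(original, treated))
```

Positions that read C in the original but T after treatment are inferred to be non-methylated, which is the complement of the list returned by `infer_methylated`.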
Methylation-sensitive restriction enzymes are often used to generate genomic libraries, which are collections of DNA fragments cloned into autonomously replicating vectors such as a plasmid, phage (bacterial virus) or bacterial artificial chromosome (BAC).
The DNA in plant cells is found in the nucleus, the mitochondria and the chloroplasts. The latter two organelles are descendants of bacteria that were captured by a eukaryotic cell and have become endosymbionts; because many of the ancestral bacterial genes were transferred to the nucleus, the organelles can never revert to being free-living bacteria. Such microbial symbioses occur commonly. For example, cyanobacteria occur in the cells of some ferns, and in the stems of cycads. They co-occur with fungi to form lichens, although some lichens form from the symbiosis of a green alga and a fungus. There is even one recent example of a slug that has acquired plastids (Rumpho et al., 2008). The extent of gene transfer between members of the symbiosis varies greatly in these examples, and hence the ability of the bacterial endosymbionts to function independently varies accordingly.
Plant mitochondria and chloroplasts have circular genomes, similar to those in Bacteria and Archaea (Figure 1.6). Their translational machinery (ribosomal RNA and proteins) is more similar to that in their bacterial ancestors than to the eukaryotic ribosomes encoded by the nuclear genome. The organellar genomes will be discussed in more detail in Chapter 5.
Figure 1.6 The chloroplast genome exemplified by that of rice. Genes are indicated by colored boxes. Note that there is a set of genes, from rps15 to rpl23, that appear twice but in inverse order. These regions are known as the inverted repeat. Redrawn from Brown, 2002
There is ample evidence from multiple plant species that the broad organization of plant genomes is similar to that of most eukaryotes. In all eukaryotes, including plants, the DNA of the nucleus is organized into linear chromosomes. The ends of the chromosomes are marked by distinctive structures, the telomeres, which have characteristic DNA sequences and have important roles to play in DNA replication. These will be discussed in more detail in Chapter 4 (Figure 1.7).
Figure 1.7 A chromosome, shown at metaphase, with the sister chromatids joined at the centromere. The two chromatids are identical in sequence, but different sorts of sequences are highlighted on each one for clarity. Schmidt and Heslop-Harrison (1998). Reproduced with permission of Elsevier
The other major landmark of eukaryotic chromosomes is the centromere. When chromosomes are viewed through a microscope, centromeres appear as constrictions that divide each chromosome into two segments – the chromosome arms. Centromeres in plants, as in all other eukaryotes, provide the point of attachment for the spindle apparatus during cell division. Centromeres will be discussed in more detail in Chapter 4 (Figure 1.7).
DNA is a coded set of instructions for making RNA. The fate of the RNA is hugely varied and is the subject of Part 2 of this book. In general, RNA may function as an independent regulatory molecule (e.g., micro RNA), or as a piece of cellular machinery (e.g., transfer RNA, ribosomal RNA), or as a set of instructions for making a protein (e.g., messenger RNA, or mRNA), or some combination of these. In other words, RNA is a central molecule in the life of a cell, and DNA is simply a storage mechanism and blueprint for preserving and expressing the information in the form of RNAs. The genome is thus the full set of RNA-producing instructions, along with the information on when (in developmental or ecological time) and where (in what tissue) to use a particular set of instructions.
The RNAs that make proteins serve two distinct masters. First, and most familiar, are the mRNAs that make proteins to carry out all the cellular functions of the plant, including enzymes, structural proteins, receptors, and transcription factors. The other mRNAs serve the transposable elements, which are mobile pieces of DNA that move around the genome. In any given genome, there are thousands of transposable elements, each of which produces mRNAs that encode proteins that participate in transposon movement (transposition).
With this view of the genome, we may consider a gene as a complete set of instructions for making a particular RNA, including the non-coding DNA sequences that provide information on when and where the coding sequences should be transcribed. Under this very broad definition, the total number of genes in any given genome is not known with any accuracy. Most commonly, when gene numbers are reported in the literature, the number includes only the mRNA-producing, protein-coding genes, excluding all those produced by the transposable elements. A few of these numbers are shown in Table 1.1, and are about the same order of magnitude (2–4 × 10^4) between different plants. However, these account for only a tiny fraction of the genome; they are vastly outnumbered by the protein-coding genes from the transposable elements and the genes encoding non-messenger RNAs. Even though they constitute only a minority of the genes in the genome, the mRNA-producing, protein-coding, non-transposable element sequences are usually the ones simply called “genes” in the literature. We will follow this common usage here unless we specify otherwise.
Table 1.1 Genome sizes of selected plants for which genome sequences are available. Note that these are heavily biased toward plants with small genomes, which may affect our ability to generalize about plant genome structure and function. Note also that genome size does not correlate with any clear evolutionary relationships. For instance, Arabidopsis, a common experimental angiosperm species, has about the same genome size as Chlamydomonas, a green alga similar to algae that existed hundreds of millions of years prior to Arabidopsis.

| Species | Classification | Genome size (Mbp) | Estimated number of protein-coding genes |
| --- | --- | --- | --- |
| Selaginella moellendorffii (Banks et al., 2011) | Lycophyte | 106 | 22 285 |
| Chlamydomonas reinhardtii (Merchant et al., 2007) | Green alga | 121 | 15 143 |
| Arabidopsis thaliana (Arabidopsis Genome Initiative, 2000) | Angiosperm | 125 | 27 025 |
| Oryza sativa (Goff et al., 2002; Yu et al., 2002) | Angiosperm | 420 | 39 045 |
| Physcomitrella patens (Rensing et al., 2008) | Moss | 480 | 35 938 |
| Populus trichocarpa (Tuskan et al., 2006) | Angiosperm | 485 | 41 335 |
| Vitis vinifera (Velasco et al., 2007) | Angiosperm | 505 | 29 585 |
| Sorghum bicolor (Paterson et al., 2009) | Angiosperm | 800 | 33 032 |
| Zea mays (Schnable et al., 2009) | Angiosperm | 2300 | 39 475 |
Plant nuclear genes are similar in general structure to those of other eukaryotes. The overall architecture of a gene consists of two general components, the regulatory region and the coding or structural region of the gene (Figure 1.8). The regulatory region is responsible for controlling when a gene is transcribed into RNA. The regulatory region does not appear in the resulting mRNA, but directs the transcription machinery to start RNA biosynthesis (transcription) at a particular position, often but not necessarily, 3′ to the regulatory region. RNA transcription proceeds to the end of the gene generating a large precursor mRNA which will be processed in several ways. The portions of the gene that ultimately end up in the mature mRNA are known as the exons, whereas the portions of the RNA that get spliced out during processing are known as the introns, or intervening RNAs. In protein-coding genes, upstream (5′) of the coding sequence is a region that is transcribed into mRNA, but not translated to protein. This is referred to as the 5′ untranslated region (5′ UTR). A similar untranslated region occurs downstream of the last coding portion of the sequence and is known as the 3′ UTR. UTRs are part of the exons because they are present in the mature mRNA, even though they are not translated into proteins. The UTR regions of mRNAs are known to play roles in initiation of the translation process and stability of the mRNA. After transcription, the introns are spliced out of the messenger RNA, a 5′ cap is added to the 5′ UTR and a polyadenine (polyA) tail is added to the 3′ UTR. These RNA processing steps will be described in more detail in Chapter 13.
Figure 1.8 Steps in the formation of messenger RNA from DNA. The DNA for a single gene (a protein-coding sequence and all its control elements) is shown at the top. Note that the protein-coding sequence itself is a relatively small portion of the entire gene. The primary RNA transcript (pre-mRNA) also includes sequences upstream and downstream of the part that will be translated. The mature mRNA is formed by removing introns and adding a cap and a tail. Adapted from Campbell et al. (2005). While the structure shown is very common, the regulatory elements may occasionally be quite far away from the gene. Also the pre-mRNA may sometimes be spliced in several different ways, a process known as alternative splicing, which is discussed in more detail in Chapter 13
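The processing steps in Figure 1.8 can be mimicked on a toy sequence. The Python sketch below is a simplification under stated assumptions: the sequence and exon coordinates are invented, the 5′ cap is omitted because it is a modified nucleotide rather than an ordinary base, and the poly(A) tail length is arbitrary.

```python
def transcribe(coding_strand_dna: str) -> str:
    """Write the RNA copy of the coding strand (T replaced by U)."""
    return coding_strand_dna.upper().replace("T", "U")

def splice(pre_mrna: str, exons: list) -> str:
    """Join the exons, given as (start, end) half-open index pairs."""
    return "".join(pre_mrna[start:end] for start, end in exons)

def mature_mrna(pre_mrna: str, exons: list, poly_a_length: int = 20) -> str:
    """Remove introns and add a poly(A) tail; the 5' cap is not modeled."""
    return splice(pre_mrna, exons) + "A" * poly_a_length

# Invented transcript: exon 1, one intron, exon 2.
exon1, intron, exon2 = "GGATGAAA", "GTAAGTTTCAG", "CCCTAAGG"
pre = transcribe(exon1 + intron + exon2)
exon_coords = [
    (0, len(exon1)),
    (len(exon1) + len(intron), len(exon1) + len(intron) + len(exon2)),
]
print(mature_mrna(pre, exon_coords, poly_a_length=5))
```

Supplying a different list of exon coordinates for the same pre-mRNA gives a different mature mRNA, which is the essence of the alternative splicing mentioned in the caption.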
In Bacteria, Archaea, and their mitochondrial and chloroplast descendants, most DNA is made up of sequences that encode RNA or proteins, and the coding sequences are barely separated from each other. In contrast, in the nuclear genome of eukaryotes, DNA encoding RNAs may account for only ∼5% of the genome.
Individual plants and plant species differ in the amount of DNA in their genomes. DNA amounts are measured either as picograms (pg) of DNA per cell, or in numbers of base pairs (bp). Genome size is variable, with the largest genomes reported from the monocot Paris japonica (Melanthiaceae) (152.23 pg), and the smallest from the eudicot Genlisea margaretae (Lentibulariaceae) (0.063 pg) (Table 1.1).
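Because genome sizes are reported in both picograms and base pairs, it is useful to be able to convert between them. The Python sketch below assumes the widely used conversion of 1 pg ≈ 978 Mbp, a factor that is not given in the text; the function names are illustrative.

```python
# Assumed conversion factor (not from the text): 1 pg of DNA ~ 978 Mbp.
MBP_PER_PG = 978.0

def pg_to_mbp(picograms: float) -> float:
    """Convert a DNA amount in picograms to megabase pairs."""
    return picograms * MBP_PER_PG

def mbp_to_pg(megabase_pairs: float) -> float:
    """Convert a genome size in megabase pairs to picograms."""
    return megabase_pairs / MBP_PER_PG

# The two extremes quoted in the text, expressed in Mbp.
for name, pg in [("Paris japonica", 152.23), ("Genlisea margaretae", 0.063)]:
    print(f"{name}: {pg} pg is about {pg_to_mbp(pg):,.0f} Mbp")
```

On this assumption the two extremes differ by more than three orders of magnitude, from tens of Mbp to well over 100 000 Mbp.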
Despite the variation in genome size, the number of protein-coding genes is surprisingly constant; in land plants it varies from approximately 22 300 for the lycophyte Selaginella (Banks et al., 2011) to approximately 39 500 for the angiosperm crop, maize (Schnable et al., 2009). The difference in genome size thus cannot be accounted for by the genes themselves, but rather has to do with the size of the space between the protein-coding genes.
As can be seen from Table 1.1, the density of protein-coding, non-TE (non-transposable element) genes in the genome must be remarkably different among species. For example, while Oryza sativa (rice) has about 39 000 protein-coding, non-TE genes spread across 420 Mbp of DNA, sorghum has about 33 000 in 800 Mbp of DNA, and maize has about 39 000 genes in a 2300 Mbp genome. Clearly the difference must be the size of the “spaces” between the genes.
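The comparison of gene density can be made explicit with a short calculation. The Python sketch below uses the genome sizes and gene counts listed in Table 1.1 for the three grasses; the function names are illustrative, and the average spacing is a crude figure that ignores the length of the genes themselves.

```python
# (genome size in Mbp, protein-coding genes), values as listed in Table 1.1
GRASSES = {
    "Oryza sativa": (420, 39045),
    "Sorghum bicolor": (800, 33032),
    "Zea mays": (2300, 39475),
}

def genes_per_mbp(genome_mbp: float, gene_count: int) -> float:
    """Average number of protein-coding genes per megabase pair."""
    return gene_count / genome_mbp

def mean_spacing_kbp(genome_mbp: float, gene_count: int) -> float:
    """Average genome length per gene, in kilobase pairs."""
    return 1000.0 * genome_mbp / gene_count

for species, (size, genes) in GRASSES.items():
    print(f"{species}: {genes_per_mbp(size, genes):.1f} genes/Mbp, "
          f"one gene per {mean_spacing_kbp(size, genes):.1f} kbp on average")
```

Rice and maize have nearly the same number of genes, yet the density in maize is several-fold lower, which is the point made in the paragraph above.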
The word “spaces” is in quotes because in fact there are plenty of genes and regulatory sequences outside the protein-coding genes (Figure 1.7). Over the last decade it has become clear that this DNA (sometimes formerly dismissed as “junk”) is a dynamic and active part of the genome and is as important for organismal function as the protein-coding fraction. Many of the discoveries about non-protein-coding genes have been made in plants.
Much of the space between the protein-coding non-TE genes is a complex mixture of repetitive sequences and transposable elements (Figure 1.9a). As noted above, transposable elements are mobile components of the genome with a propensity for creating repetitive bits of DNA sequence. The nature and dynamics of the transposable elements are discussed in the next chapter.
Figure 1.9 Types of sequences in the nuclear genome. (a) Genome sizes for flowering plants for which genome sequences are available. Pie charts show the relative amounts of DNA from transposable elements (TE-DNA) versus non-transposable elements (non-TE DNA). The linear relationship between the log of the genome size and the amount of transposable elements in the graph on the right shows that most of the difference in size is due to the difference in TE content. Tenaillon et al. (2010). Reproduced with permission of Elsevier. (b) DNA in the nucleus includes not only nuclear genes but also sequences from other sources such as organelles, viruses or transgenes inserted by humans. Nuclear DNA can then be classified into various categories. (b) Adapted from Heslop-Harrison and Schmidt (2007)
Other repetitive sequences are known as satellite DNA, which falls into several size classes (Figure 1.9
