Computational and high-throughput methods, such as genomics, proteomics, and transcriptomics, known collectively as "-omics," have been used to study plant biology for well over a decade now. As these technologies mature, plant and crop scientists have started using these methods to improve crop varieties. Omics in Plant Breeding provides a timely, practical, and accessible introduction to key omics-based methods, ranging from metabolomics to phenomics, and their application in plant breeding. Covering a single methodology within each chapter, this book provides thorough coverage that ensures a strong understanding of each methodology both in its application to, and improvement of, plant breeding. Accessible to advanced students, researchers, and professionals, Omics in Plant Breeding will be an essential entry point into this innovative and exciting field.
* A valuable overview of high-throughput, genomics-based technologies and their applications to plant breeding
* Each chapter explores a single methodology, allowing for detailed and thorough coverage
* Coverage ranges from well-established methodologies, such as genomics and proteomics, to emerging technologies, including phenomics and physionomics
Aluízio Borém is a Professor of Plant Breeding at the University of Viçosa in Brazil. Roberto Fritsche-Neto is a Professor of Genetics and Plant Breeding at the University of São Paulo in Brazil.
Page count: 406
Year of publication: 2014
Cover
Title Page
Copyright
List of Contributors
Foreword
Chapter 1: Omics: Opening up the “Black Box” of the Phenotype
The Post-Genomics Era
The Omics in Plant Breeding
Genomics, Precision Genomics, and RNA Interference
Transcriptomics and Proteomics
Metabolomics and Physiognomics
Phenomics
Bioinformatics
Prospects
References
Chapter 2: Genomics
The Rise of Genomics
DNA Sequencing
Development of Sequence-based Markers
Genome Wide Selection (GWS)
Structural and Comparative Genomics
References
Chapter 3: Transcriptomics
Methods of Studying the Transcriptome
Applications of Transcriptomics Approaches for Crop Breeding
Conclusions and Future Prospects
Acknowledgements
References
Chapter 4: Proteomics
History
Different Methods for the Extraction of Total Proteins
Subcellular Proteomics
Post-Translational Modifications
Quantitative Proteomics
Perspectives
References
Chapter 5: Metabolomics
Introduction
Metabolomic and Biochemical Molecules
Technologies for Metabolomics
Metabolomic Database Analysis
Metabolomics Applications
Metabolomics-assisted Plant Breeding
Associative Genome Mapping and mQTL Profiles
Large-scale Phenotyping Using Metabolomics
Conclusion and Outlook
References
Chapter 6: Physionomics
Introduction
Early Studies on Plant Physiology and the Discovery of Photosynthesis
Biochemical Approaches to Plant Physiology and the Discovery of Plant Hormones
Genetic Approaches to Plant Physiology and the Discovery of Hormone Signal Transduction Pathways
Alternative Genetic Models for Omics Approaches in Plant Physiology
“Physionomics” as an Integrator of Various Omics for Functional Studies and Plant Breeding
Acknowledgements
References
Chapter 7: Phenomics
Introduction
Examples of Large-scale Phenotyping
Important Aspects for Phenomics Implementation
Main Breeding Applications
Final Considerations
References
Chapter 8: Electrophoresis, Chromatography, and Mass Spectrometry
Introduction
Two-dimensional Electrophoresis (2DE)
Chromatography
Mass Spectrometry
Data Analysis
References
Chapter 9: Bioinformatics
Introduction
The “Omics” Megadata and Bioinformatics
Hardware for Modern Bioinformatics
Software for Genomic Sequencing
Software for Contig Assembling
Assembly Using the Graph Theory
New Approaches in Bioinformatics for DNA and RNA Sequencing
Databases, Identification of Homologous Sequences and Functional Annotation
Annotation of a Complete Genome
Computational System with Chained Tasks Manager (Workflow)
Applications for Studies in Plants
Final Considerations
References
Chapter 10: Precision Genetic Engineering
Introduction
Zinc Finger Nucleases (ZFNs)
Transcription Activator-like Effector Nucleases (TALENs)
Meganucleases (LHEs: LAGLIDADG Homing Endonucleases)
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)
Implications and Perspectives of the use of PGE in Plant Breeding
References
Chapter 11: RNA Interference
Introduction
Discovery of RNAi
Mechanism of RNA Interference
Applications in Plant Breeding: Naturally Occurring Gene Silencing and Modification by Genetic Engineering
Resistance to Viruses
Host-induced Gene Silencing
Insect and Disease Control
Improving Nutritional Values
Secondary Metabolites
Perspectives
References
Index
End User License Agreement
Edited by
Aluízio Borém
University of Viçosa, Viçosa, MG, Brazil
Roberto Fritsche-Neto
University of São Paulo/ESALQ, Piracicaba, SP, Brazil
This edition first published 2014 © 2014 by John Wiley & Sons, Inc.
Editorial offices: 1606 Golden Aspen Drive, Suites 103 and 104, Ames, Iowa 50010, USA
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
9600 Garsington Road, Oxford, OX4 2DQ, UK
For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com/wiley-blackwell.
Authorization to photocopy items for internal or personal use, or the internal or personal use of specific clients, is granted by Blackwell Publishing, provided that the base fee is paid directly to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. For those organizations that have been granted a photocopy license by CCC, a separate system of payments has been arranged. The fee codes for users of the Transactional Reporting Service are ISBN-13: 978-1-118-82099-5/2014.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names and product names used in this book are trade names, service marks, trademarks or registered trademarks of their respective owners. The publisher is not associated with any product or vendor mentioned in this book.
Limit of Liability/Disclaimer of Warranty: While the publisher and author(s) have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data has been applied for
ISBN 978-1-118-82099-5 (paperback)
A catalogue record for this book is available from the British Library.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.
Cover images: iStock © pawel.gaul, iStock © Vladimirovic, iStock © emyerson
Werner Camargos Antunes
Department of Biology, Maringá State University/UEM, Maringá, PR, Brazil
Francisco J.L. Aragão
Embrapa Genetic Resources and Biotechnology, Brasília, DF, Brazil
Aluízio Borém
Department of Crop Science, Federal University of Viçosa, Viçosa, MG, Brazil
Ilara Gabriela F. Budzinski
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Lucimara Chiari
Embrapa Beef Cattle, Campo Grande, MS, Brazil
Joshua N. Cobb
DuPont Pioneer, Johnston, IA, USA
Fernando Cotinguiba
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Valdir Diola (in memoriam)
Department of Genetics, Rural Federal University of Rio de Janeiro/UFRRJ, Seropédica, RJ, Brazil
Roberto Fritsche-Neto
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Simone Guidetti-Gonzalez
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Abdulrazak B. Ibrahim
Embrapa Genetic Resources and Biotechnology, Brasília, DF, Brazil; Department of Biochemistry, Ahmadu Bello University, Zaria, Kaduna, Nigeria; and Department of Cell Biology, University of Brasilia, DF, Brazil
Frederico Almeida de Jesus
Department of Biological Sciences, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Carlos Alberto Labate
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Mônica T. Veneziano Labate
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Marcos Antonio Machado
Department of Biotechnology, Center for Citriculture Sylvio Moreira, Agronomical Institute of Campinas, Cordeirópolis, SP, Brazil
Valéria S. Mafra
Department of Biotechnology, Center for Citriculture Sylvio Moreira, Agronomical Institute of Campinas, Cordeirópolis, SP, Brazil
Luciano Carlos da Maia
Department of Crop Science/Eliseu Maciel School of Agronomy-FAEM, Federal University of Pelotas, Pelotas, RS, Brazil
Naciele Marini
Department of Crop Science/Eliseu Maciel School of Agronomy-FAEM, Federal University of Pelotas, Pelotas, RS, Brazil
Felipe G. Marques
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Danilo de Menezes Daloso
Department of Plant Biology, Federal University of Viçosa, Viçosa, MG, Brazil; and Max-Planck-Institute for Molecular Plant Physiology, Potsdam-Golm, Germany
Hugo Bruno Correa Molinari
Laboratory of Genetics and Biotechnology, Embrapa Agroenergy, Brasília, DF, Brazil
Fabrício E. Moraes
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Ivan Miletovic Mozol
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Thiago J. Nakayama
Department of Crop Science, Federal University of Viçosa, Viçosa, MG, Brazil
Alexandre Lima Nepomuceno
Embrapa Soybean, Londrina, PR, Brazil
Antônio Costa de Oliveira
Department of Crop Science/Eliseu Maciel School of Agronomy-FAEM, Federal University of Pelotas, Pelotas, RS, Brazil
J. Miguel Ortega
Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, MG, Brazil
Lázaro Eustáquio Pereira Peres
Department of Biological Sciences, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Thaís Regiani
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Maria Juliana Calderan Rodrigues
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Carolina Munari Rodrigues
Department of Biotechnology, Center for Citriculture Sylvio Moreira, Agronomical Institute of Campinas, Cordeirópolis, SP, Brazil
Daniel da Rosa Farias
Department of Crop Science/Eliseu Maciel School of Agronomy-FAEM, Federal University of Pelotas, Pelotas, RS, Brazil
Janaina de Santana Borges
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Fabrício R. Santos
Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, MG, Brazil
Danielle Izilda R. da Silva
Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
Maria Laine P. Tinoco
Embrapa Genetic Resources and Biotechnology, Brasília, DF, Brazil
Agustin Zsögön
Department of Biological Sciences, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
The application of omics in plant breeding offers outstanding opportunities to contribute to the well-being of mankind. These opportunities come about when new varieties of food, feed, fiber, and fuel crops are developed that increase productivity and confidence in the product. Such varieties have become available through traditional breeding and the use of biotechnology and they are being grown on both large and small farms. Ultimately any improved performance benefits society as a whole. Furthermore, there are good prospects for the future through the increasing opportunities associated with plant breeding, especially from the new science of omics. Many traits in the major crops, such as resistance to disease and insects, deserve more attention, and in small acreage crops plant breeding programs merit greater consideration.
This book was written to provide a broad, integrated treatment of subjects such as genomics, proteomics, and metabolomics, and it relies heavily on information gleaned by the authors throughout their research careers. The fundamental principles of genetics and the background information needed for plant breeding programs are emphasized.
The intention is that the book will be used by new and advanced students, as well as serving as a reference book for those interested in the independent study of omics. Instructors are encouraged to select specific chapters to meet classroom needs depending on the desired level of teaching and the time available. Readers will also benefit from the list of references that accompany each chapter.
Aluízio Borém
Viçosa, MG, Brazil
and
Roberto Fritsche-Neto
Piracicaba, SP, Brazil
Editors
Roberto Fritsche-Neto (a) and Aluízio Borém (b)
(a) Department of Genetics, University of São Paulo/ESALQ, Piracicaba, SP, Brazil
(b) Department of Crop Science, Federal University of Viçosa, Viçosa, MG, Brazil
Since agriculture is believed to have begun, in approximately 10 000 BC, people have consciously or instinctively selected plants with improved characteristics for cultivation in subsequent generations. However, there is disagreement as to when plant breeding became a science. One view holds that it became a science only after the rediscovery of Mendel's laws in 1900, although some scientists disagree. What is clear is that in the mid-19th century the monk Gregor Mendel, working in Brno (now in the Czech Republic), uncovered the secrets of heredity, thus giving rise to genetics, the fundamental science of plant breeding.
In the first half of the 20th century, scientists added a few more pieces to the puzzle of this new science by concluding that something inside the cells was responsible for heredity. This hypothesis generated answers and, in turn, new hypotheses, leading to the continuing accumulation of knowledge and progress in the field. For example, the double helix structure of DNA was elucidated in 1953 (Table 1.1). Twenty years later, in 1973, the first genetic engineering experiment opened the doors of molecular biology to scientists. The first transgenic plant, in which a bacterial gene was inserted stably into a plant genome, was produced in 1983. Based on these advances, futuristic predictions about the contribution of biotechnology were published in the media, both by laypeople and scientists themselves, creating great expectations for its applications. Euphoria set the tone in the scientific community. Many companies, both large and small, were created, encouraged by the prevailing enthusiasm of the time (Borém and Miranda, 2013).
Table 1.1 Chronology of major advances in genetics and biotechnology relevant to plant breeding. Adapted from Borém and Fritsche-Neto (2013).
Year | Historical landmark
1809–1882 | Charles Darwin develops the theory of natural selection: those individuals most adapted to their environment are selected, survive, and produce more offspring.
1865 | Gregor Mendel establishes the first statistical methodologies applicable to plant breeding, giving rise to the “era of genetics,” with his studies on the traits of pea seeds.
1910 | Thomas Morgan, studying the effects of genetic recombination in D. melanogaster, demonstrates that genetic factors (genes) are located on chromosomes.
1941 | George Beadle and Edward Tatum demonstrate that a gene produces a protein.
1944 | Barbara McClintock elucidates the process of genetic recombination by studying satellite chromosomes and genetic crossing-over related to linkage groups in chromosomes 8 and 9 of maize.
1953 | James Watson and Francis Crick, using X-ray diffraction, propose the double helix structure of the DNA molecule.
1957 | Hunter and Markert develop biochemical markers based on the expression of enzymes (isoenzymes).
1969 | Herbert Boyer discovers restriction enzymes, opening new perspectives for DNA fingerprinting and the cloning of specific regions.
1972 | Recombinant DNA technology begins with the first cloning of a DNA fragment.
1973 | Stanley Cohen and Herbert Boyer perform the first genetic engineering experiment on a microorganism, the bacterium Escherichia coli. The result was considered to be the first genetically modified organism (GMO).
1975 | Sanger develops DNA sequencing by the enzymatic method; in 1984, the method was improved, and the first automatic sequencers were built in the 1980s.
1977 | Maxam and Gilbert develop DNA sequencing by chemical degradation.
1980 | Botstein et al. develop the RFLP (Restriction Fragment Length Polymorphism) technique for genotypic selection.
1983 | The first transgenic plant is produced, a variety of tobacco into which a group of Belgian scientists introduced kanamycin antibiotic resistance genes.
1985 | Genentech becomes the first biotech company to launch its own biopharmaceutical product, human insulin produced in cultures of E. coli transformed with a functional human gene.
1985 | The first plant with a resistance gene against Lepidoptera is produced.
1986 | The first field trial of transgenic plants is conducted in Ghent, Belgium.
1987 | The first plant tolerant to a herbicide, glyphosate, is created.
1987 | Mullis and Faloona identify thermostable Taq DNA polymerase enzyme, which enabled the automation of PCR.
1988 | The first transgenic cereal crop, Bt maize, is developed.
1990 | Rafalski et al. (1990) develop the first genotyping technique using PCR, RAPD (Random Amplified Polymorphic DNA).
1990 | New tools for sequence alignment are created at the NCBI (National Center for Biotechnology Information, www.ncbi.gov): BLAST (Basic Local Alignment Search Tool).
1994 | The first permit is issued for the commercial cultivation and consumption of a GMO, the Flavr Savr tomato.
1997 | The first plant containing a human gene, the human protein C-producing tobacco, is produced.
2000 | The first complete sequencing of a prokaryotic organism, the bacterium E. coli, is conducted.
2003 | The first eukaryotic genome sequence, that of the human, is released by two major independent research groups in the United States.
2005 | Large-scale sequencing (NGS, Next Generation Sequencing) is used as a tool to unravel whole genomes quickly.
2011 | Second- and third-generation large-scale sequencing systems are developed; eukaryotic genomes are sequenced in just a few hours.
2012 | Technologies that control the temporal and spatial expression of genes are used in genetic transformation and the exclusion of auxiliary genes.
2014… | Large-scale sequencing, macro- and microsynteny, associative mapping, molecular markers for genomic selection, QTL (quantitative trait loci) cloning, and large-scale phenotyping are widely used; the use of “omics” and of specific, multiple GMO phenotypic traits is expanded, and bioinformatics is used intensively.
Many earlier predictions have now become reality (Table 1.1), leading to the consensus that each year the benefits of biotechnology will have a greater impact on breeding programs. Consequently, new companies have been established to take advantage of innovative, highly promising business opportunities.
In the late 20th and early 21st centuries, genome sequencing studies developed rapidly. Gene sequences are now available for entire organisms, including humans. After these DNA base sequences are determined, it is necessary to organize them and identify the coding regions and their functions in the organism.
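As a rough illustration of what "identifying the coding regions" can mean computationally, the following minimal sketch scans a DNA string for open reading frames. It is a toy under stated assumptions, not a method described in this book: it looks only at the three forward reading frames, assumes the standard ATG start codon and the three standard stop codons, and uses a hypothetical `min_codons` threshold. Real gene prediction also handles the reverse strand, introns, and statistical signals.

```python
# Toy open-reading-frame (ORF) scan on the forward strand only.
# Assumptions (not from the source text): ATG start codon, the three
# standard stop codons, and a hypothetical minimum-length threshold.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, orf) tuples for ATG...stop stretches.

    start is the 0-based index of the A in ATG; end is exclusive and
    includes the stop codon. min_codons counts the start and stop codons.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i + 3 - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None
    return orfs


if __name__ == "__main__":
    for start, end, orf in find_orfs("CCATGAAATTTGGGTAACC"):
        print(start, end, orf)
```

Once candidate coding regions are located in this way, assigning them a function is the separate, harder problem discussed next.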
In this context, with a huge range of sequences being deposited in databases, geneticists are faced with a challenge as great as that which propelled the “genomics era”: correlating structure with function. This challenge has given rise to functional genomics, the science of the “era of omics.”
Omics is the neologism used to refer to the fields of biotechnology with the suffix omics: genomics, proteomics, transcriptomics, metabolomics, and physiognomics, among others. These new tools are helping to develop superior cultivars for food production or even allowing plants to function as biofactories. The focal point for the 21st century will be the technological development of large-scale molecular studies and their integration into systems biology. These studies aim to understand the relationship between the genome of an organism and its phenotype, that is, to open up the “black box” that contains the path between codons and yield or resistance to biotic or abiotic stresses (Figure 1.1). Thus, systems biology is a science whose objectives are to discover, understand, model, and design the dynamic relationships between the biological molecules that make up living beings to unravel the mechanisms controlling these parts.
Figure 1.1 Systems biology: from genome to phenotype.
In recent years, genetics and omics tools have revolutionized plant breeding, greatly increasing the available knowledge of the genetic factors responsible for complex traits and developing a large amount of resources (molecular markers and high-density maps) that can be used in the selection of superior genotypes. Among the existing omics tools, global transcriptome, proteome, and metabolome profiles created using EST, SAGE, microarray, and, more recently, RNA-seq libraries have been the most commonly used techniques to investigate the molecular basis of the responses of plants, tissues/organs or developmental stages to experimental conditions (Kumpatla et al., 2012). However, regardless of the omics used, the aid of bioinformatics is required for the analysis and interpretation of the data obtained.
Given the importance of these fields, subsequent chapters will discuss the various tools currently in use, or with great potential for future use, in plant breeding. The roles of these fields, the relationships between them and their corresponding biological processes (as well as their presentation in this book) can be visualized by the “trail” of omics, as shown in Figure 1.2.
Figure 1.2 Trail of omics and the relationships among the fields and their corresponding biological processes.
The initial draft of the genome of the first plant to be sequenced (Arabidopsis thaliana) took approximately ten years to produce. Today, with next-generation sequencing (NGS) technologies (e.g., Oxford Nanopore, PacBio RS, Ion Torrent, and Ion Proton, among others) and powerful bioinformatics and computational modeling programs, genomes can be sequenced, assembled, and related to the phenotypic traits specific to each genotype within a few weeks. This capability, combined with the drastic reduction in the cost of sequencing, has generated an ever-increasing volume of data, enabling the comprehensive study of genomes and the development of informative molecular markers.
All of this information has inspired the development of new strategies for genetic engineering. However, until recently, the available genetic engineering tools could only introduce changes into larger blocks of DNA sequences, which could subsequently be inserted only at random in the genome of a target species. Recent advances in this field have made it possible to obtain new variations from site-directed modifications, including specific mutations, insertions, and substitutions of genes and/or blocks of genes, making genetic engineering a precise and powerful alternative for the development of new cultivars.
These modifications to specific DNA sequences are initiated by generating a double-strand break (DSB) in the target DNA. Engineered nucleases are designed to recognize a specific site in the target genome and catalyze the creation of the DSB, enabling the desired DNA modifications to occur at or close to the break site.
To access specific sites, three classes of enzymes have been genetically modified or constructed: zinc finger nucleases (ZFNs) (Figure 1.3), transcription activator-like effector nucleases (TALENs) (Figure 1.4), and meganucleases, also known as LAGLIDADG homing endonucleases (LHEs).
Figure 1.3 Zinc finger nucleases (ZFNs).
Figure 1.4 Transcription activator-like effector nucleases (TALENs).
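The idea of a site-directed break, as opposed to a random insertion, can be sketched with plain string operations. The function below is purely illustrative: the name `cut_at_target`, the `cut_offset` parameter, and the example sequences are hypothetical, and it models only "find the recognition site, break the sequence there," not the protein-DNA recognition or the repair that follows.

```python
def cut_at_target(genome, target, cut_offset):
    """Split genome at the first occurrence of the recognition site.

    cut_offset is a hypothetical position of the break inside the site
    (0 = immediately before the site, len(target) = immediately after).
    Returns (left, right) fragments, or None if the site is absent.
    """
    genome = genome.upper()
    target = target.upper()
    if not 0 <= cut_offset <= len(target):
        raise ValueError("cut_offset must lie within the target site")
    site = genome.find(target)
    if site == -1:
        return None  # no recognition site, so no break is made
    cut = site + cut_offset
    return genome[:cut], genome[cut:]


if __name__ == "__main__":
    print(cut_at_target("AAACCCGGGTTT", "CCCGGG", 3))
```

The point of the sketch is the contrast with random integration: the break position is determined by the sequence being recognized, so a longer recognition site makes an unintended match elsewhere in the genome less likely.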
Another widely used technique is post-transcriptional gene silencing (PTGS), or RNA interference (RNAi). This technique has assisted the development of transgenic plants capable of suppressing the expression of endogenous genes and foreign nucleic acids (Aragão and Figueiredo, 2008).
Knowledge about the mechanisms involved in RNA-mediated gene silencing has been important in the understanding of the biological function of genes, the interaction between organisms, and the development of new cultivars, among other applications.
The RNAi pathway begins with the presence of double-stranded RNA (dsRNA) in the cytoplasm, which may vary in origin and size (Figure 1.5). These dsRNAs are cleaved by the Dicer enzyme, a member of the RNase III nuclease family. After the processing of the dsRNA, small interfering RNAs (siRNAs) are formed, which are then integrated into an RNA-induced silencing complex (RISC). The RISC is responsible for the cleavage of a specific mRNA target sequence.
Figure 1.5 Pathways of gene silencing in plant cells. (Source: Based on Souza et al., 2007).
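The sequence of steps described above (dsRNA, Dicer cleavage into siRNAs, RISC-guided cleavage of a matching mRNA) can be mimicked with a small string-based toy. Several things here are assumptions rather than statements from the text: the default fragment length of 21 nucleotides, the requirement of perfect complementarity, and all function names. The sketch ignores strand selection, overhangs, and every biochemical detail.

```python
# Toy model of the RNAi steps: dice a guide strand into siRNAs, then
# find where each siRNA would pair with a target mRNA.
# Assumptions: 21-nt fragments by default, perfect complementarity only.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U)."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def dice(guide_strand, size=21):
    """Chop a strand into consecutive fragments of length size.

    A trailing remainder shorter than size is dropped.
    """
    strand = guide_strand.upper()
    return [strand[i:i + size] for i in range(0, len(strand) - size + 1, size)]


def find_cleavage_sites(mrna, sirnas):
    """Return sorted 0-based start positions of sites in mrna that are
    perfectly complementary to any of the given siRNAs."""
    mrna = mrna.upper()
    sites = set()
    for sirna in sirnas:
        target = reverse_complement(sirna)
        pos = mrna.find(target)
        while pos != -1:
            sites.add(pos)
            pos = mrna.find(target, pos + 1)
    return sorted(sites)


if __name__ == "__main__":
    mrna = "AAGGCUUACGGA"
    antisense = reverse_complement("GGCUUACG")  # strand pairing with the mRNA
    sirnas = dice(antisense, size=4)            # tiny size for readability
    print(sirnas, find_cleavage_sites(mrna, sirnas))
```

An mRNA with no complementary stretch returns an empty list, which corresponds to the specificity of silencing: only transcripts matching the siRNA population are targeted.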
Transcriptomics is the study of the transcriptome, defined as the set of transcripts (RNAs), including messenger RNAs (mRNAs) and non-coding RNAs (ncRNAs), produced by a given cell, tissue or organism (Morozova et al., 2009).
A single organism can have multiple transcriptomes: different tissues, organs, and developmental stages of the same individual may have different transcriptomes, and different environmental stimuli may also induce differences. Transcriptomics is currently one of the main platforms for studying an organism's biology. Methods for the differential expression analysis of transcripts have spread to almost every field of biological study, from genetics and biochemistry to ecology and evolution (Kliebestein, 2012). As a result, numerous genes, alleles, and alternative splice variants have been identified in various organisms.
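To make "differential expression analysis" concrete, the sketch below compares transcript counts from two conditions using a log2 fold change. It is a deliberately simplified assumption-laden example: the gene names and counts are invented, the pseudocount of 1 and the threshold of 1.0 are arbitrary illustrative choices, and real analyses use biological replicates, normalization, and statistical testing rather than a single ratio.

```python
import math


def log2_fold_change(control, treated, pseudocount=1.0):
    """Return {gene: log2((treated + pc) / (control + pc))}.

    Only genes present in both dictionaries are compared. The pseudocount
    avoids division by zero for transcripts with no reads.
    """
    return {
        gene: math.log2(
            (treated[gene] + pseudocount) / (control[gene] + pseudocount)
        )
        for gene in control
        if gene in treated
    }


def differentially_expressed(fold_changes, threshold=1.0):
    """Return sorted gene names whose absolute log2 fold change meets the threshold."""
    return sorted(g for g, v in fold_changes.items() if abs(v) >= threshold)


if __name__ == "__main__":
    # Hypothetical read counts for three genes under two conditions.
    control = {"geneA": 7, "geneB": 15, "geneC": 3}
    stressed = {"geneA": 31, "geneB": 15, "geneC": 0}
    lfc = log2_fold_change(control, stressed)
    print(lfc)
    print(differentially_expressed(lfc))
```

A positive value indicates higher expression in the second condition and a negative value lower expression, so the same table can flag both induced and repressed transcripts.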
In the same way, proteomics is the study of the proteome, which includes the entire set of proteins expressed by the genome of a cell, tissue or organism. However, this study can be directed only to those proteins that are expressed differentially under specific conditions (Meireles, 2007). Thus, proteomics involves the functional analysis of gene products, including the large-scale identification, localization, and compartmentalization of proteins, in addition to the study and construction of protein interaction networks (Aebersold and Mann, 2003).
Proteomics searches for a holistic view of an individual by understanding its response after a stimulus, with the end goal of predicting some biological event. This field has developed primarily through the separation of proteins by two-dimensional gel electrophoresis and chromatographic techniques (Eberlin, 2005).
Along with the advancement of research in the fields of genomics and proteomics, another area has gained prominence since the year 2000: metabolomics. This science seeks to identify the metabolites involved in the different biological processes related to the genotypic and phenotypic characteristics of a particular individual.
Plants metabolize more than 200 000 different molecules involved in the structure, assembly, and maintenance of tissues and organs, as well as in the physiological processes related to growth, development, and reproduction. Metabolic pathways are complex and interconnected, and they are, to some degree, dependent on and regulated by their own products or substrates, as well as by their genetic components and different levels of gene regulation.
This observation shows the great capacity for modulation or plasticity of the physiological response networks of plants under the same hierarchical control (DNA). Through the combined and simultaneous analysis of more than one regulatory level, such as the association of molecular markers and metabolic comparisons, a complex set of data can be generated; integrating such data is the task of physiognomics (called physionomics in Chapter 6). This science, in turn, generates systemic models that aim to understand and predict plant responses to certain stimuli and/or environmental conditions.
The field of phenomics employs a series of “high-throughput” techniques to enhance and automate the ability of scientists to accurately evaluate phenotypes, as well as to eventually reduce the determinants of phenotype to genes, transcripts, proteins, and metabolites (Tisné et al., 2013).
The phenome of an organism is dynamic and uncertain, representing a set of complex responses to endogenous and exogenous multidimensional signals that have been integrated during both the evolutionary process and the developmental history of the individual. This phenotypic information can be understood as a set of continuous data that change during the development of the species, the population, and the individual in response to different environmental conditions.
The emphasis of phenomics is phenotyping in an accurate (able to effectively measure characteristics and/or performance), precise (little variance associated with repeated measurements), and relevant manner within acceptable costs. This focus is important because phenotyping is currently the main limiting factor in genetic analysis. Unlike genotyping, which is highly automated and essentially uniform across different organisms, phenotyping is still a manual, organism-specific activity that is labor intensive and is also very sensitive to environmental variation.
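The distinction drawn above between an accurate and a precise measurement can be expressed numerically. In this minimal sketch, accuracy is summarized as bias (mean of repeated measurements minus a known reference value) and precision as the sample variance of those repeats. The function name and the example values are hypothetical, and a trusted reference value is assumed to be available, which is often not the case in the field.

```python
from statistics import mean, variance


def accuracy_and_precision(measurements, reference):
    """Return (bias, spread) for repeated measurements of one trait.

    bias   = mean(measurements) - reference (closer to 0 is more accurate)
    spread = sample variance of the repeats (smaller is more precise)
    """
    bias = mean(measurements) - reference
    spread = variance(measurements)
    return bias, spread


if __name__ == "__main__":
    # Hypothetical repeated plant-height readings (cm) against a 10.0 cm reference.
    print(accuracy_and_precision([10.0, 12.0, 14.0], reference=10.0))
```

A phenotyping method can score well on one measure and poorly on the other: tightly clustered readings that are all offset from the reference are precise but not accurate.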
The following are examples of phenomics approaches: (i) the use of digital cameras to take zenithal images for the automatic analysis of leaf area and rosette growth and the measurement of the characteristics of tissues, organs or individuals (Tisné et al., 2013); (ii) the use of infrared cameras to visualize temperature gradients, which can indicate the degree of energy dissipation (Munns et al., 2010) and have implications for responsiveness to drought stress and photosynthetic rate; (iii) the use of images generated by fluorescence detectors to identify the differential responses of populations of seedlings, fruits or seeds to a stressor (Jansen et al., 2009); (iv) the use of noninvasive methods to visualize subterranean systems (Nagel et al., 2012); and (v) the use of LIDAR (Light Detection and Ranging) technology to measure growth rate through differences between small distances measured using a laser (Hosoi and Omasa, 2009).
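Approach (i) above, estimating leaf area from zenithal images, reduces in its simplest form to counting plant pixels in a segmented image and scaling by the area each pixel represents. The sketch below assumes the segmentation has already been done and is given as a binary mask. The function names, the `mm_per_pixel` calibration value, and the example numbers are all hypothetical. The relative growth rate uses the standard formula (ln A2 - ln A1) / (t2 - t1).

```python
import math


def projected_leaf_area(mask, mm_per_pixel):
    """Return projected area in mm^2 from a binary mask.

    mask is a list of rows; truthy entries mark plant pixels.
    mm_per_pixel is the side length of one pixel on the imaged plane.
    """
    plant_pixels = sum(1 for row in mask for pixel in row if pixel)
    return plant_pixels * mm_per_pixel ** 2


def relative_growth_rate(area_start, area_end, days):
    """Return the relative growth rate per day between two area measurements."""
    if area_start <= 0 or area_end <= 0 or days <= 0:
        raise ValueError("areas and days must be positive")
    return (math.log(area_end) - math.log(area_start)) / days


if __name__ == "__main__":
    # Hypothetical 3 x 3 segmented image with four plant pixels.
    mask = [
        [0, 1, 0],
        [1, 1, 0],
        [0, 1, 0],
    ]
    print(projected_leaf_area(mask, mm_per_pixel=0.5))
    print(relative_growth_rate(1.0, 4.0, days=2))
```

Because the computation is a pixel count, it can be repeated automatically on every image in a time series, which is what makes this kind of phenotyping scale to large populations.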
All these instruments generate objective digital data that can be transmitted to remote servers, many of which are connected to the Internet, for storage and further analysis, which is also often automated. The prospects for these technologies are very promising for breeding programs, which increasingly evaluate greater numbers of individuals.
The exponential increase in the volume of both molecular and phenotypic data requires increased computational capacity for its storage, processing, and analysis. To this end, numerous computers and analytical tools have been developed to address the massive volume of data originating from genomics, proteomics, metagenomics, and metabolomics, among other omics.
Biological data are relatively complex compared with those from other scientific fields, given their diversity and their interrelationships. All this information can only be organized, analyzed, and interpreted with the support of bioinformatics.
Bioinformatics can be defined as the field that covers all aspects of the acquisition, processing, storage, distribution, analysis, and interpretation of biological information. A number of tools that aid in understanding the biological significance of omics data have been developed through the combination of procedures and techniques from mathematics, statistics, and computer science. In addition, the creation of databases with previously processed information will accelerate research in other biological fields, such as medicine, biotechnology, and agronomy.
Plant breeding is an art, science, and business that is little more than a century old. Using methods developed mainly in the 20th century, breeders have developed agronomically superior cultivars. Because of the constantly increasing challenge of agricultural food production, plant breeding must evolve and use new knowledge. Therefore, the omics will gradually assume greater relevance and be incorporated into the routines of breeding programs, making them more accurate, fast, and efficient. Although the challenges are great, the prospects are even greater.
Aebersold, R.H.; Mann, M. 2003. Mass spectrometry-based proteomics. Nature, 422(6928): 198–207.
Aragão, F.J.L.; Figueiredo, S.A. 2008. RNA interference as a tool for plant biochemical and physiological studies. In: Rivera-Domínguez, M., Rosalba-Troncoso, R., and Tiznado-Hernández, M.E. (eds.) A Transgenic Approach in Plant Biochemistry and Physiology. Kerala, India: Research Signpost, pp. 17–50.
Borém, A.; Fritsche-Neto, R. 2013. Biotecnologia aplicada ao melhoramento de plantas. 6th edn. Visconde do Rio Branco: Suprema Publishers, 336 pp.
Borém, A.; Miranda, G.V. 2013. Melhoramento de plantas. 6th edn. Viçosa: UFV Publishers, 523 pp.
Eberlin, M. 2005. A proteômica e os novos paradigmas. Reportagem Jornal da Unicamp, Universidade Estadual de Campinas. UNICAMP. p. 3. 14–27 November, 2005.
Hosoi, F.; Omasa, K. 2009. Detecting seasonal change of broad-leaved woody canopy leaf area density profile using 3D portable LIDAR imaging. Functional Plant Biology, 36(11): 998–1005.
Jansen, M.; Gilmer, F.; Biskup, B.; et al. 2009. Simultaneous phenotyping of leaf growth and chlorophyll fluorescence via GROWSCREEN FLUORO allows detection of stress tolerance in Arabidopsis thaliana and other rosette plants. Functional Plant Biology, 36(11): 902–914.
Kliebenstein, D.J. 2012. Exploring the shallow end; estimating information content in transcriptomics studies. Frontiers in Plant Science, 3: 213.
Kumpatla, S.P.; Buyyarapu, R.; Abdurakhmonov, I.Y.; Mammadov, J.A. 2012. Genomics-assisted plant breeding in the 21st century: Technological advances and progress. In: Abdurakhmonov, I. (ed.). Plant Breeding. Rijeka, Croatia: InTech. pp. 131–184.
Meireles, K.G.X. Aplicações da Proteômica na Pesquisa Vegetal. Campo Grande, MS: Embrapa. Document 165. ISSN 1983-974X. September, 2007.
Morozova, O.; Hirst, M.; Marra, M.A. 2009. Applications of new sequencing technologies for transcriptome analysis. Annual Review of Genomics and Human Genetics, 10: 135–151.
Munns, R.; James, R.A.; Sirault, X.R.; et al. 2010. New phenotyping methods for screening wheat and barley for beneficial responses to water deficit. Journal of Experimental Botany, 61(13): 3499–3507.
Nagel, K.A.; Putz, A.; Gilmer, F.; et al. 2012. GROWSCREEN-Rhizo is a novel phenotyping robot enabling simultaneous measurements of root and shoot growth for plants grown in soil-filled rhizotrons. Functional Plant Biology, 39(11): 891–904.
Rafalski, J.A.; Tingey, S.V.; Williams, J.G.K. 1991. RAPD markers, a new technology for genetic mapping and plant breeding. AgBiotech News and Information, 3: 645–648.
Souza, A.J.; Mendes, B.M.J.; Filho, F.A.A.M. 2007. Gene silencing: concepts, applications, and perspectives in wood plants. Scientia Agricola (Piracicaba, Braz.), 64: 645–656.
Tisné, S.; Serrand, Y.; Bach, L.; et al. 2013. Phenoscope: an automated large-scale phenotyping platform offering high spatial homogeneity. The Plant Journal, 74(3): 534–544.
Antônio Costa de Oliveira, Luciano Carlos da Maia, Daniel da Rosa Farias and Naciele Marini
Department of Crop Science/Eliseu Maciel School of Agronomy-FAEM, Federal University of Pelotas, Pelotas, RS, Brazil
Despite the constant technological advances of modern societies, evident in, for example, the use of synthetic oil-based commodities, many products in general domestic use still come from plants. Since the earliest civilizations, humans have tried to adapt plants to their needs. Improvement techniques have evolved significantly since the 19th century, when directed crosses were developed. These were followed early in the 20th century by the rediscovery of the genetic principles established by Mendel, then by the green revolution in the 1960s (Borlaug, 1983; Allard, 1971), and finally by the gene revolution of the 1990s–2000s, as discussed in Bologna (Tuberosa, personal communication). Thus, throughout the 20th century, advances in genetics allowed systematic initiatives to increase the production of food and vegetable fibers, in addition to other edible and inedible byproducts. However, there is still a great need to improve yields in order to cope with rising food demand without increasing the area of cultivated land. Breeders have maintained this balance between food supply and food demand to date, but population increases still present a challenge (Food and Agricultural Organization of the United Nations (FAO), 2010). For practical purposes, the focus of this chapter will be restricted to the description and use of genomics in plant breeding.
Plant genomics consists of the development of large-scale analyses of structural and functional features of genomes, allowing the discovery of evolutionary and functional dynamics in plants.
DNA sequencing was first performed in the 1970s with two different techniques: the Sanger chemistry, which uses dideoxy chain termination (Sanger, Nicklen, and Coulson, 1977), and chemical degradation (Maxam and Gilbert, 1977). These pioneering efforts brought about spectacular results but at a slow pace, since the sequencing was performed manually and fewer than 200 bases were produced during the reading of four lanes. An automatic sequencer using the Sanger chemistry was released in the 1980s (Connell et al., 1987), allowing more significant advances with a high throughput. DNA sequencing represents a powerful tool for the identification of a wide range of biological phenomena through the collection of large sets of data samples. This strategy starts with DNA extraction, which can be at the whole genome level (shotgun sequencing) or based on smaller fractions cloned in BACs (Bacterial Artificial Chromosomes) or YACs (Yeast Artificial Chromosomes). These large vectors are subsequently fragmented into smaller plasmid/cosmid libraries by means of enzyme digestion or other fractionation methods. The ends of the smaller clones are then sequenced and read in the sequencing platforms. The data are generated in FASTA format, and bioinformatic algorithms are used to reconstruct the original DNA molecule (chromosome). This task is hierarchical, as it goes from smaller to larger sequences, and is performed with the aid of assembly software, which first builds contigs (from overlapping sequence reads) and then scaffolds (from the ordering of contigs) (Figure 2.1).
Figure 2.1 Basic scheme, representing the process of plant DNA sequencing.
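The idea of building a contig from overlapping reads can be sketched with a toy greedy merger. This is only an illustration under strong assumptions (error-free reads, a single strand, no repeats); it is not the algorithm used by any particular assembly software, and the function names and reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    seqs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return seqs  # no overlaps left: each remaining sequence is a contig
        merged = seqs[best_i] + seqs[best_j][best_k:]
        seqs = [s for idx, s in enumerate(seqs) if idx not in (best_i, best_j)]
        seqs.append(merged)

# Three hypothetical reads sampled from the sequence ATGGCGTACGTTAGC
reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(reads))
```

Real genomes contain repeats and sequencing errors, which is why production assemblers rely on more elaborate graph-based methods and on additional information to order contigs into scaffolds.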
The first genome sequencing projects were performed using automatic Sanger sequencing chemistry (Figure 2.2). In the period between 1995 and 2005, the genomes of Arabidopsis (Arabidopsis Genome Initiative, 2000) and rice (International Rice Genome Sequencing Project, 2005) were completed.
Figure 2.2 Sanger automatic sequencing. The process starts with a PCR in which, besides DNA, primers, and deoxynucleotides, dideoxyribonucleotides (ddNTPs) are added. These are labeled with fluorescent markers and lack a hydroxyl group at the 3′-position. Thus, every time a ddNTP is added to the chain, its extension is interrupted and, after numerous cycles, many fragments of different sizes will be generated, each terminated by a ddNTP. These fragments are later subjected to electrophoresis and separated by size. The fragments pass through a laser and the fluorescence is detected; this signal is then transformed into a chromatogram (a graph with peaks of different colors), each colored peak being attributed to a different base. The sequencer comes with software that converts chromatograms into FASTA format sequences.
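The logic of the read-out described in Figure 2.2 can be sketched as follows: every fragment is represented only by its length and the label of its terminating ddNTP, and sorting by length (the role played by electrophoresis) recovers the sequence. This is a simplified model that assumes exactly one fragment per position and no signal noise; the function names are hypothetical.

```python
import random

def terminated_fragments(new_strand, seed=0):
    """One (length, terminal_base) fragment per position of the synthesized strand."""
    fragments = [(i + 1, base) for i, base in enumerate(new_strand)]
    random.Random(seed).shuffle(fragments)  # fragments are mixed in the reaction tube
    return fragments

def read_sequence(fragments):
    """Sort fragments by size and read the label of each terminal base in order."""
    return "".join(base for _, base in sorted(fragments))

def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record with a fixed line width."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines)

fragments = terminated_fragments("GATTACA")
print(to_fasta("read1", read_sequence(fragments)))
```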
Since 2005, a new generation of technologies has emerged (Varshney et al., 2009), allowing the generation of an even larger amount of data per sample and per reaction and revolutionizing the genomics landscape (Table 2.1). These technologies incorporate advances made in nanobiology and robotics and have contributed to one of the goals of the scientific community, that is, to sequence any human genome for under US$1000 (Thudi et al., 2012).
Table 2.1 Sanger and NGS sequencing platforms.
| Year | Technology | Read length (bases) | Bases per run |
|------|------------|---------------------|---------------|
| 1977 | Sanger | 1000 | 100 Kb |
| 2005 | 454 (Life Science/Roche Diagnostics) | 500 | 500 Mb |
| 2005 | ABI SOLID (Life Technologies) | 50 | 30 Gb |
| 2007 | Illumina Genome Analyser (Solexa) | 150 | 300 Gb |
| 2010 | Helicos (Helicos Biosciences) | 55 | 35 Gb |
| 2010 | Ion Torrent (Life Technologies) | 200 | 1 Gb |
| 2010 | SMRT (Pacific Biosciences) | 2000 | 100 Mb |
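The figures in Table 2.1 can be related to one another with simple arithmetic: the approximate number of reads per run is the number of bases per run divided by the read length, and the average fold coverage of a genome is the number of bases per run divided by the genome size. The sketch below uses three rows of the table; the 400 Mb genome size is a hypothetical value chosen only for illustration.

```python
UNIT = {"Kb": 1e3, "Mb": 1e6, "Gb": 1e9}

# (read length in bases, bases per run) taken from Table 2.1
PLATFORMS = {
    "Sanger": (1000, "100 Kb"),
    "454": (500, "500 Mb"),
    "Illumina": (150, "300 Gb"),
}

def bases(text):
    """Convert a string such as '500 Mb' into a number of bases."""
    value, unit = text.split()
    return float(value) * UNIT[unit]

def reads_per_run(read_length, bases_per_run):
    """Approximate number of reads produced in one run."""
    return bases(bases_per_run) / read_length

def coverage(bases_per_run, genome_size):
    """Average number of times each genome position is sequenced in one run."""
    return bases(bases_per_run) / genome_size

GENOME_SIZE = 400e6  # hypothetical genome of 400 Mb
for name, (length, output) in PLATFORMS.items():
    print(name, f"{reads_per_run(length, output):.0f} reads,",
          f"{coverage(output, GENOME_SIZE):.4g}x coverage per run")
```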
All these technologies perform DNA sequencing on platforms capable of generating outputs of several million bases in a single run. Among the NGS platforms (Figure 2.3), two are in use worldwide: the Roche 454 FLX and Solexa/Illumina.
Figure 2.3 NGS clone amplification: (a) emulsion PCR, used by 454, Polonator, and ABI SOLID platforms; (b) bridge amplification used by Illumina; (c) single molecule sequencing used by Helicos; and (d) real time sequencing, used by SMRT. (Source: Adapted from Metzker, 2009; Shendure and Ji, 2008).
The strategy used by 454, Polonator, and ABI SOLID is based on emulsion PCR (Dressman et al., 2003). The DNA fragments are ligated to adaptors, which bind to magnetic microbeads through pairing between the adaptors and complementary oligonucleotide sequences present on the bead surface. Each bead pairs with only one type of fragment. The microbeads are individually captured in oil droplets, where the emulsion PCR occurs. Thousands of copies of the target fragment are produced, and the microbeads carrying the target sequences are subsequently captured in wells where pyrosequencing occurs. Each base is identified by means of a specific chemiluminescence emission.
In the Illumina platform, adaptors are ligated to both ends of a DNA fragment. The DNA molecules are then attached to a solid matrix containing oligonucleotides complementary to the adaptors. The free adaptor end of each fragment pairs with a complementary oligonucleotide on the matrix, and a bridge structure is formed. Next, a PCR reaction is performed, with the addition of labeled nucleotides. After a series of cycles of annealing, bridge formation, amplification, and denaturation, clusters of identical molecules are formed attached to the matrix, and the nucleotides are read sequentially through the signal emitted under laser excitation.
The single molecule sequencing of the Helicos platform starts with the addition of a polyA tail to the DNA fragments. These fragments are then hybridized to a solid matrix that carries thousands of attached polyT fragments. In each cycle, a fluorescently labeled nucleotide complementary to the template DNA is added; a camera then captures this fluorescence and an image is produced for the nucleotide reading.
Real time sequencing, performed by SMRT, relies on individual DNA polymerase molecules attached to the bottom of a structure known as a Zero Mode Waveguide (ZMW), in which each base added to the growing strand is read. These nucleotides carry a fluorescent label, and the readings are taken by detecting the fluorescence signal of each incorporated nucleotide (Metzker, 2009; Shendure and Ji, 2008).
Owing to reductions in cost and an increase in sequencing capacity, the new platforms are efficient for routine use in sequencing and resequencing of individual genomes, for detecting variations between target and reference genomes (Service, 2006). In the last few years, NGS technologies have rapidly evolved, with the potential to accelerate the discoveries in biological and medical research, stimulating gene variation studies, understanding of biotic and abiotic stress responses and evolutionary studies (Shendure and Ji, 2008). Resequencing whole genomes is an essential tool for the characterization of genetic variations in many contexts (Bentley, 2006). These sequencing platforms are being used in large plant genome sequencing projects such as IOMAP – International Oryza Map Genome Initiative, which aims to sequence all species from the Oryza genus, the “1000 Plant Genomes Project” (www.onekp.com/), the “1001 Arabidopsis Genome Project” (www.1001genomes.org/), and the “1000 Plant and Animal Genome Project” (www.1d1.genomics.cn/). Additionally, the “Genome 10K Project” was created to sequence and assemble 10 000 vertebrate genomes, including at least one from each genus (www.genome10k.org/).
However, the new sequencing platforms generate short reads, which are smaller than those generated by traditional Sanger sequencing, and the assembly of these reads into contigs has become one of the largest challenges for these technologies (Figure 2.4). Therefore, bioinformatics has developed into a key tool for biological, genetic, and plant breeding studies.
Figure 2.4 Reads generated by NGS. (a) Single-end reads, where the reading is performed only on one end (indicated by the blue arrow). (b) Paired-end reads, where the sequencing is performed on both fragment ends, in opposite directions (blue arrows). (c) In mate-pair reads, biotin-labeled nucleotides are linked to both ends of the fragment, which is circularized and cut into small sequences that are selected by biotin-based rescue and moved to sequence reading (single by 454 and paired by Illumina and SOLID). (Source: Adapted from Hamilton and Buell, 2012).
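The difference between single-end and paired-end reads can be sketched with a toy fragment. The second read of a pair comes from the opposite strand, so it corresponds to the reverse complement of the far end of the fragment. The fragment and read length are hypothetical, and sequencing errors are ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def single_end(fragment, read_len):
    """Read only one end of the fragment."""
    return fragment[:read_len]

def paired_end(fragment, read_len):
    """Read both ends of the fragment, in opposite directions."""
    forward = fragment[:read_len]
    reverse = reverse_complement(fragment)[:read_len]
    return forward, reverse

fragment = "ATGGCGTACGTTAGC"  # hypothetical library fragment
print(single_end(fragment, 5))
print(paired_end(fragment, 5))
```

Because the two reads of a pair are known to come from the same fragment, their expected distance and orientation give the assembler extra information for placing reads and ordering contigs.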
During the last two decades, the major advances in genome technology have led to an increase in the amount of biological information generated by the scientific community. Thus, storage of biological data in public databases is becoming more common every year, with an exponential growth in size (Figure 2.5).
Figure 2.5 Genome size of cultivated species. Published data (gray) and four economically important species (orange).
The treatment of genetic information began to increase in scale from the 1980s when the first markers were generated (Botstein et al., 1980). With the development of automated sequencing technologies, new markers were made available with larger outputs. Therefore, bioinformatic tools became part of genomic research and a routine procedure in genomics and breeding laboratories.
Currently, the combination of novel genomics insights with traditional breeding methods is seen as essential to achieving progress in agriculture. Genome access has become a fundamental resource in the biological sciences and, although the rate of sequencing in plants is slower than in microbial and mammalian systems, genomics has been widely applied in agronomy, biochemistry, forest sciences, genetics, horticulture, plant pathology, and systematics (Hamilton and Buell, 2012). The term molecular breeding was therefore coined to describe the integration of classic breeding and molecular biology, for example, through the use of molecular markers (Xu, Li, and Thomson, 2012).
Besides the contribution of molecular markers to plant breeding advances, the increased availability of data from sequenced genomic regions and transcriptome analyses has also played an important role (Varshney et al., 2009). Complete or partial genomes are available for many species, such as Arabidopsis (Arabidopsis Genome Initiative, 2000), rice (International Rice Genome Sequencing Project, 2005), poplar (Tuskan et al., 2006), grape (Jaillon et al., 2007), papaya (Ming et al., 2008), sorghum (Paterson et al., 2009), Brachypodium (Vogel et al., 2010), and cocoa (Argout et al., 2011).
The understanding of reference genomes, which are available for several plant species, and the high-throughput nature of sequencing projects have provided new opportunities for plant breeding, such as comparative genomics. These tools became very useful thanks to the new experimental and bioinformatics approaches for data treatment. The future of crop improvement will be centered on comparisons of individual plant genomes, and some of the best opportunities may lie in using combinations of new genetic mapping strategies and evolutionary analyses to direct and optimize the discovery and use of genetic variation (Morrell, Buckler, and Ross-Ibarra, 2012).
Molecular markers are defined as genetic markers that are based on a pool of techniques that can detect DNA polymorphism in a single locus or in the whole genome (Oliveira, 1998; Varshney, Graner, and Sorrells, 2005) and can be detected by Polymerase Chain Reaction (PCR) methods or other procedures that combine the use of restriction enzymes and hybridization techniques between complementary DNA sequences.
Among these molecular markers, Restriction Fragment Length Polymorphism (RFLP) was the first DNA marker to be used in plant breeding (Figure 2.6).
Figure 2.6 Scheme representing RFLP. The DNA is extracted and fragmented, and the fragments are separated by electrophoresis in agarose gels, where a continuous smear is seen. The fragments are denatured and transferred to nylon or nitrocellulose membranes, followed by hybridization with a radioactive or fluorescent probe containing a single-strand target fragment. The signal is then detected by autoradiography, in which the membrane hybridized with the labeled probe is exposed to an X-ray film.
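The principle behind RFLP, namely that a change in a restriction site alters the fragment lengths, can be sketched with an in silico digestion. The sketch assumes the EcoRI recognition site GAATTC, cut after the first G, and two invented alleles that differ by a single base; the probe hybridization step is left out.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Cut seq at every occurrence of site and return the fragment lengths."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)  # position of the cut within the site
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]

# Hypothetical alleles: the second site is lost in allele_b (GAATTC -> GAGTTC)
allele_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
allele_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

print(digest(allele_a))  # three fragments
print(digest(allele_b))  # two fragments: one site is missing
```

On a gel, the two alleles would therefore show bands of different sizes, which is the polymorphism scored by the marker.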
After RFLP, a number of PCR-based markers were generated (Figure 2.7), as well as the Amplified Fragment Length Polymorphism (AFLP) technique. In this technique, the DNA is fragmented with two restriction enzymes, one rare and one frequent cutter, followed by the ligation of adaptors with known sequences to both fragment ends. A selective amplification of fragments is then performed using primers complementary to the adaptor sequences.
Figure 2.7 Molecular marker depictions: (a) AFLP; (b) SSR; (c) SNP; (d) ISSR; and (e) DArT.
The Simple Sequence Repeat (SSR) or microsatellite technique consists in amplifying regions that contain simple tandem repeats, with the aid of primers complementary to the unique regions that flank the repeat region.
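A minimal sketch of how tandem repeats might be located in a sequence, so that primers can be designed on the unique flanking regions, is given below. The thresholds (motifs of 1 to 6 bases, at least four repeats) and the example sequence are assumptions made for illustration, not values taken from the text.

```python
import re

def find_ssrs(seq, min_repeats=4, max_unit=6):
    """Return (start, motif, repeat_count) for each perfect tandem repeat found."""
    # The lazy quantifier makes the regex try the shortest motif first
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits

seq = "TTGACACACACACAGGT"  # hypothetical locus containing an (AC)n repeat
for start, motif, repeats in find_ssrs(seq):
    end = start + len(motif) * repeats
    print(f"({motif}){repeats} at {start}; flanks: {seq[:start]} ... {seq[end:]}")
```

Alleles that differ in the number of repeats yield amplification products of different lengths between the same pair of flanking primers.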
