This comprehensive reference book discusses the convergent and next-generation technologies for product-derived applications relevant to agriculture, pharmaceuticals, nutraceuticals, and the environment.
The field of modern biotechnology is a multidisciplinary and groundbreaking area of biology that includes several cutting-edge methods and is expanding due to developments in forensics and molecular modeling. Bioinformatics is a full-fledged multidisciplinary field that combines biology with advances in computer and information technology. Numerous applications of bioinformatics (primarily in the areas of gene and protein identification, structural and functional prediction, drug development and design, folding of genes and proteins and their complexity, vaccine design, and organism identification) have contributed to the advancement of biotechnology. Biotechnology is also essential to crop improvement in agriculture because it allows genes to be transferred across plants to improve traits such as disease resistance and yield. It also plays a broad role in healthcare, including genetic testing, gene therapy, pharmacogenomics, and drug development. The book further covers bioremediation and biodegradation, which use microbial technologies to clean up environmental contamination, as well as waste management technologies and the conversion of organic waste to biofuels. Bioinformatics plays a critical role in analyzing different types of data created by high-throughput research methods (such as genomic, transcriptomic, and proteomic datasets) that are useful in addressing various problems related to disease management, clean environment, alternative energy sources, agricultural productivity, and more.
Audience
The book will interest biotechnology researchers and bioinformatics professionals working in the areas of applied biotechnology, bioengineering, biomedical sciences, microbiology, agriculture and environmental sciences.
Page count: 733
Year of publication: 2024
Cover
Table of Contents
Series Page
Title Page
Copyright Page
Preface
Part I: AGRICULTURE
1 Next-Generation Sequencing in Vegetable Crops
1.1 Introduction
1.2 Next-Generation Sequencing Approach in Genomics
1.3 NGS Approach in Single-Nucleotide Polymorphic Markers Development
1.4 Next-Generation Sequencing Approach in Trait-Specific Breeding
1.5 Next-Generation Sequencing Approach in Metagenomics
1.6 Next-Generation Sequencing Approach in Transcriptomics
1.7 Next-Generation Sequencing Approach in Exome and Captured Sequencing
1.8 Applications of Exome and Captured Sequencing in Crop Research
1.9 Conclusion and Future Prospects
References
2 Application of Bioinformatics Tools in Rice Genomics Research
2.1 Introduction
2.2 Role of Genomics in Rice Research
2.3 Model Plant for Genomic Research: Rice
2.4 High-Throughput Sequencing
2.5 Genome-Wide Association Study (GWAS)
2.6 Bioinformatics Approach to Study Stress Conditions in Rice
2.7 Application of Bioinformatics Tools in Advanced Rice Genomics Research
2.8 Current Challenges of Bioinformatics Tools for Rice Genomics Research
2.9 Conclusion
Conflict of Interest
References
3 Computer-Aided Vaccine Design: Applications in Agriculture
3.1 Introduction
3.2 Agriculturally Important Animals
3.3 Diseases Affecting Animal Health in Agriculture
3.4 Vaccination in Agriculture
3.5 Vaccine
3.6 Intervention of Computer in Vaccine Designing
3.7 In Silico Vaccine Designing: Agricultural Applications
3.8 Conclusion and Future Prospects
References
4 Genomics to Phenomics: A Paradigm Shift in Crop Science Research
4.1 Introduction
4.2 Genomics in Crop Improvement
4.3 Advances in Genomics-Assisted Breeding
4.4 Phenotyping
4.5 Phenomics
4.6 Phenomics Approaches in Crop Improvement
4.7 Conclusion
References
Part II: PHARMACEUTICAL RESEARCH
5 Molecular Modeling and Drug Development
5.1 Introduction
5.2 Structure-Based Drug Design
5.3 Docking
5.4 Ligand-Based Drug Design
5.5 Pharmacophore
5.6 QSAR
5.7 Virtual Screening
5.8 Pharmacophore-Based VS
5.9 Similarity-Based VS
5.10 Homology Modeling and Protein Folding
5.11 In Silico Pharmacokinetics
5.12 Conclusion
References
6 Comparative Study on Tannase Sequence and Structure of Lactiplantibacillus: An In Silico Protein Variability Analysis and Its Impact on Microbial Speciation
6.1 Introduction
6.2 Materials and Methods
6.3 Results and Discussion
6.4 Conclusion
References
7 Probiotics: A Novel Natural Therapy for Oral Health
7.1 Introduction
7.2 Background
7.3 Mechanism in Oral Diseases Prevention by Probiotics
7.4 Probiotic Formulation
7.5 Prevention and Oral Health Management
7.6 Concluding Remarks
7.7 Future Aspects
References
8 The Preventative and Curative Functions of Probiotics: A Paradigm of Food as Drug Revolution
8.1 Introduction
8.2 Criteria for Choosing Probiotics and the Bare Minimum Needed
8.3 Action Mechanism of Probiotics
8.4 Probiotics in the Clinical Practice: A Growing Trend
8.5 Potential Preventative Roles of Probiotics
8.6 Therapeutic Use of Probiotics
8.7 Recent Advancement in Probiotics
8.8 Conclusion and Recommendation
Acknowledgments
References
9 Probiotics in the Prevention and Treatment of Psoriasis
9.1 Introduction
9.2 Interruption of the Microbiome: A Pathogenic Effect in Psoriasis
9.3 Therapeutic Effect of Probiotics for Psoriasis
9.4 Conclusion
References
10 A Gateway to Multi-Omics-Based Clinical Research
10.1 Introduction
10.2 Importance of Multi-Omics
10.3 Genomics and Relevant Clinical Studies Along with Its Tools and Methods
10.4 Proteomics and Relevant Clinical Studies Along with Its Tools and Methods
10.5 Sample Type and Acquisition
10.6 Various Data Acquisition Methods for Proteomics Data Include the Following
10.7 Techniques Used in Clinical Proteomics
10.8 Analysis Tools in Clinical Proteomics
10.9 Metabolomics and Relevant Clinical Studies Along with Its Tools and Methods
10.10 Different Types of Metabolomics
10.11 Techniques and Tools Used in Metabolomics
10.12 Metabolite Databases
10.13 Data Analysis Tools and Software
10.14 Application of Metabolomics in Clinical Studies
10.15 Conclusion
References
11 Inherent Observation of Mucosal Non-Specific Immune Parameters in Indian Major Carps
11.1 Introduction
11.2 Materials and Methods
11.3 Results and Discussion
11.4 Conclusion
References
Part III: ENVIRONMENT
12 Eco-Friendly Approaches for Converting Organic Waste to Bioenergy for Sustainable Development
12.1 Introduction
12.2 Organic Waste in the Bioenergy Generation
12.3 Categories and Characteristics of Organic Waste
12.4 Organic Waste Based on Origin
12.5 Organic Waste Based on the State of Matter
12.6 Organic Waste Based on the Level of Production
12.7 Characteristics of Organic Waste
12.8 Greenhouse Gases (GHGs)
12.9 Benefits of Organic Waste
12.10 Current and Prospective Use of Organic Waste
12.11 Sustainable Bioenergy and Biofuels from Organic Waste
12.12 Conversion of Organic Waste into Bioenergy and High-Valued Products
12.13 Biofuels from Organic Waste: Biochemical and Thermochemical Processes
12.14 Fermentation
12.15 Anaerobic Digestion
12.16 Combustion
12.17 Pyrolysis
12.18 Gasification
12.19 Biorefinery Concept Based on Organic Waste for Clean Energy Management
12.20 Success and Challenges of Organic Waste for Bioenergy
12.21 Conclusion and Recommendations
References
13 Utilization of Food Waste for Bioenergy Production
13.1 Introduction
13.2 Potential of Food Waste for Bioenergy Production
13.3 Bioenergy from Food Waste
13.4 Conclusion
References
14 Photosynthetic Microalgal Microbial Fuel Cell (PMMFC): A Novel Strategy for Wastewater Treatment and Bioenergy Generation
14.1 Introduction
14.2 Microbial Fuel Cell
14.3 Types of PMFC
14.4 Role of Algae in PMFC
14.5 Conclusion
References
15 Self-Cleaning Aquarium: The Microbial Biofilm Approach for Ammonia Bioremediation
15.1 Current Scenario of Fresh Water Scarcity and Impact of Aquaculture
15.2 Existing Technologies for Aquaculture Effluent Treatment for Environmental Sustenance
15.3 The Novel Rapid Biofilm Reactor-Based Ammonia Removing System
15.4 The Case Study of the Self-Cleaning Aquarium
15.5 Conclusion and Future Application
Acknowledgments
References
16 Metagenomics Unveiled: Deciphering Microbial Responses to Climate Change
16.1 Introduction
16.2 Climate Change and Its Impact on the Environment and Microbiome
16.3 Metagenomics as a Tool for Climate Change Research
16.4 Microbial Adaptation to Climate Change
16.5 Feedback Loops and Climate Change
16.6 Metagenomics in Climate Change Mitigation
16.7 Case Studies and Research Findings
16.8 Metagenomic Climate Model Frame
16.9 Challenges and Future Directions
16.10 Conclusion
Acknowledgments
Author Contributions
Conflict of Interest
References
17 Biosensor: A Tool for Assessment of Soil Pollutants
17.1 Introduction
17.2 Working Principles
17.3 Types of Biosensors
17.4 Application of Biosensors
17.5 Advantages, Disadvantages, and Adoption of Biosensors
17.6 Ethical Considerations and Future Challenges
17.7 Conclusion
References
18 Transcriptome-Guided Characterization of Molecular Resources in Mussels
18.1 Introduction
18.2 Species of Mussels Sequenced at the Transcriptome Level
18.3 Transcriptome Pipeline for Mussel Molecular Resources
18.4 Mussel Transcriptome Assembly and Annotation
18.5 Conclusions and Future Perspectives
Acknowledgments
References
Index
End User License Agreement
Chapter 1
Table 1.1 Genomics studies on different vegetable crops using sequencing techn...
Table 1.2 List of markers used for studying different cultivars of vegetable c...
Table 1.3 Different NGS techniques used for targeted breeding of vegetable cro...
Chapter 2
Table 2.1 Widely used bioinformatics databases and resources for rice genomics...
Table 2.2 A list of application of advanced CRISPR/Cas tools in rice.
Table 2.3 Popular CRISPR guide RNA designing tools in rice.
Table 2.4 The servers used to identify the Acr proteins and their specificatio...
Table 2.5 List of in silico tools available to find most probable offtarget si...
Chapter 3
Table 3.1 Major tools for sequence-based epitope prediction.
Table 3.2 Major tools for structure-based epitope prediction.
Chapter 4
Table 4.1 Applications of high-throughput phenotyping integrated with GWAS in ...
Chapter 6
Table 6.1 Name of target organisms with their UniProt ID and similarity of tan...
Table 6.2 Amino acid distribution of tannase from Lactiplantibacillus sp. in t...
Table 6.3 Intra-protein interaction in tannase from species of Lactiplantibaci...
Chapter 7
Table 7.1 Various oral probiotic supplements have medicinal benefits. Sourced ...
Chapter 8
Table 8.1 Potential anti-cancer effects of probiotics.
Chapter 9
Table 9.1 Evidence of beneficial probiotic interventions in treating psoriasis...
Table 9.2 Clinical trial of probiotics for the treatment of psoriasis (https:/...
Chapter 11
Table 11.1 Lysozyme activities of epidermal mucus extracts of L. rohita, C. ca...
Table 11.2 Alkaline phosphatase activities of epidermal mucus extracts of L. r...
Table 11.3 Protease activities of epidermal mucus extracts of L. rohita, C. ca...
Chapter 12
Table 12.1 Different sources of organic waste (food waste) and their potential...
Table 12.2 Organic waste as a potential source of bioenergy.
Table 12.3 Organic wastes as a potential source of other high-valued products.
Chapter 13
Table 13.1 Source of food waste for generating energy.
Table 13.2 Production of bioenergy from food wastes using different methodolog...
Chapter 15
Table 15.1 Average concentration of ammonia in the bioreactors in mg/L.
Table 15.2 Statistical validation of the data (p value) showing significant va...
Table 15.3 Protocol for COD measurement.
Table 15.4 Average concentration of ammonia in the bioreactors with 2-fold dil...
Table 15.5 Statistical validation of the data showing significant variation wi...
Table 15.6 Chemical oxygen demand (COD) in mg/L at different time point in the...
Table 15.7 Percentage reduction of COD(%) at different time point in the three...
Table 15.8 Water quality before and after 3 months of treatment in the commerc...
Chapter 16
Table 16.1 Glimpses of some recently studied metagenomics sequencing approache...
Chapter 17
Table 17.1 Commonly used biosensors for heavy metal detection.
Table 17.2 Advantages and disadvantages of biosensors while detecting soil pol...
Chapter 18
Table 18.1 Representation of mussel genomes in the public domain.
Table 18.2 A methodological summary of Next-Generation Sequencing (NGS) studie...
Table 18.3 Comparison of mussel transcriptome assembly and annotation.
Table 18.4 Summary of mussel transcriptomics in the last 5 years (2019–2023)* ...
Chapter 1
Figure 1.1 Timeline of next-generation sequencing technology.
Chapter 3
Figure 3.1 The general steps involved in vaccine designing.
Figure 3.2 Basic steps to develop a vaccine through computer-aided vaccine des...
Chapter 6
Figure 6.1 3D structure of template tannase protein, i.e., Lactiplantibacillus...
Figure 6.2 Amino acid abundance in tannase of Lactiplantibacillus plantarum (r...
Figure 6.3 Ramachandran plot showing amino acid distribution of tannase belong...
Figure 6.4 Ramachandran plot showing amino acid distribution of tannase belong...
Figure 6.5 (a) RMSD and (b) RMSF plots of tannase of Lactiplantibacillus plant...
Figure 6.6 (a) Rg and (b) SASA of tannase of Lactiplantibacillus plantarum (re...
Figure 6.7 Hydrogen bonds in tannase of Lactiplantibacillus plantarum, L. pent...
Chapter 7
Figure 7.1 Mechanistic actions of probiotics against dental caries.
Chapter 8
Figure 8.1 Schematic depiction of the numerous functions of probiotics.
Figure 8.2 Changes in the biology of cancer cells were triggered by Lactobacil...
Figure 8.3 Intestinal inflammation can be triggered by bacteria in the gut and...
Figure 8.4 A schematic diagram exhibiting role of probiotics for hypertension ...
Figure 8.5 The potential mode of action of the probiotic for the treatment of ...
Figure 8.6 A schematic depicting the therapeutic potential of genetically engi...
Figure 8.7 Probiotics’ novel mechanisms of action in antioxidation. Figure adop...
Chapter 9
Figure 9.1 Gut–skin axis microbiota.
Chapter 10
Figure 10.1 The relationship between numerous clinical samples and various dom...
Figure 10.2 This figure depicts top-down and bottom-up approaches in proteomic...
Figure 10.3 Describing about processes required to perform metabolic analysis ...
Chapter 11
Figure 11.1 SDS-PAGE analysis of mucus extracts of C. catla, L. rohita, and C. ...
Chapter 12
Figure 12.1 Organic waste in the production of different biofuels for a health...
Figure 12.2 Sources of organic waste based on origin.
Chapter 13
Figure 13.1 Schematic diagram of biochemical process for the production of bio...
Figure 13.2 Biofuel production from industrial, agriculture and home food wast...
Chapter 14
Figure 14.1 Schematic diagram of MFC.
Figure 14.2 Schematic diagram of photosynthetic microbial fuel cell.
Figure 14.3 (a) Dual-chambered PMFC, (b) tubular PMFC, (c) single-chambered ai...
Figure 14.4 Mechanism of carbon sequestration.
Chapter 15
Figure 15.1 Pictures of aquarium with raschig rings. Pictures in the two rows ...
Chapter 16
Figure 16.1 Pictorial representations of causes for climate change and its con...
Figure 16.2 Schematic diagram of climate change mitigation strategies through ...
Figure 16.3 A conceptual model frame to understand microbiome functioning and ...
Chapter 17
Figure 17.1 Flowchart showing the mechanism of the working principle of biosen...
Figure 17.2 Type of biosensors commonly used to detect contaminants in soil en...
Chapter 18
Figure 18.1 An overview of the transcriptome initiatives involving mussels. Il...
Scrivener Publishing
100 Cummings Center, Suite 541J
Beverly, MA 01915-6106
Publishers at Scrivener
Martin Scrivener
Phillip Carmical
Edited by
Hrudayanath Thatoi
Center for Industrial Biotechnology Research, Siksha ‘O’ Anusandhan University, Bhubaneswar, Odisha, India
Sonali Mohapatra
Dept. of Biological Systems Engineering, University of Wisconsin, Madison, USA
Swagat Kumar Das
Dept. of Biotechnology, Odisha University of Technology and Research, Bhubaneswar, Odisha, India
and
Sukanta Kumar Pradhan
Dept. of Bioinformatics, Odisha University of Agriculture and Technology, Bhubaneswar, Odisha, India
This edition first published 2025 by John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA and Scrivener Publishing LLC, 100 Cummings Center, Suite 541J, Beverly, MA 01915, USA
© 2025 Scrivener Publishing LLC
For more information about Scrivener publications please visit www.scrivenerpublishing.com.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
Wiley Global Headquarters
111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials, or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read.
Library of Congress Cataloging-in-Publication Data
ISBN 978-1-119-89640-1
Front cover images courtesy of Wikimedia Commons
Cover design by Russell Richardson
Preface
The field of modern biotechnology is a multidisciplinary, groundbreaking, and advantageous area of biology that includes several cutting-edge methods. The discipline of biotechnology is expanding at a rapid, previously unseen pace due to developments in forensics, molecular modeling, clinical healthcare, pharmaceuticals, agriculture, environmental bioremediation, renewable energy, and many other areas. In biological science, bioinformatics is an advanced field of research that has grown to an unprecedented height.
Presently, bioinformatics, a full-fledged multidisciplinary field that combines biology with advances in computer and information technology, has made great progress in its applications to biotechnology and the biological sciences. Numerous applications of bioinformatics (primarily in the areas of gene and protein identification, structural and functional prediction, drug development and design, folding of genes and proteins and their complexity, vaccine design, and organism identification) have contributed to the advancement of biotechnology. Bioinformatics plays a critical role in analyzing different types of data created by high-throughput research methods (such as genomic, transcriptomic, and proteomic datasets) that will be useful in addressing various problems related to disease management, clean environment, alternative energy sources, agricultural productivity, and more.
Biotechnology is essential to crop improvement in agriculture because it allows genes to transfer across plants to increase traits such as disease resistance and yield. Biotechnology plays a broad role in healthcare, including genetic testing, gene therapy, pharmacogenomics, and drug development. Bioremediation and biodegradation, using microbial technologies to clean up environmental contamination, are critical in the current scenario. In this context, bioinformatics employs tools to analyze and mechanize different degradation pathways that help in microbial applications.
This book is comprised of eighteen unique chapters, each written by renowned researchers in the fields of microbiology, bioinformatics, agriculture, food, pharmaceuticals, and bioremediation. It provides an in-depth discussion on emerging topics while concentrating on the most recent research about the applications of biotechnology and bioinformatics in agriculture, nutraceuticals, pharmaceuticals, and the environment. General readers, scholars, biotechnology researchers, and bioinformatics professionals working in the areas of applied biotechnology, bioengineering, biomedical sciences, microbiology, and environmental sciences will come away from this book with a wider understanding of recent innovations, tools, techniques, and applications.
The editors express their gratitude to the esteemed writers for providing chapters in this book that showcase their superb work and extensive expertise in their respective fields of research. Finally, the editors thank Martin Scrivener and Scrivener Publishing for their assistance and publication of this book.
Hrudayanath Thatoi
Sonali Mohapatra
Swagat Kumar Das
Sukanta Kumar Pradhan
1 Next-Generation Sequencing in Vegetable Crops
Meenu Kumari1*, Tanya Barpanda2, Meghana Devireddy3, Ankit Kumar Sinha3, R. S. Pan1 and A. K. Singh1
1ICAR-Research Complex for Eastern Region, RS, Ranchi, India
2Orissa University of Agriculture & Technology, Odisha, India
3ICAR-Indian Agricultural Research Institute, New Delhi, India
The last few decades have witnessed revolutionary advances in DNA sequencing technologies across all biological disciplines, at a fraction of the cost of traditional sequencing. There are two methods of crop breeding: the conventional approach, through hybridization followed by selection, and marker-assisted breeding (MAB), also referred to as marker-assisted selection (MAS). Limitations of the conventional approach, such as long periods of selection to fix a trait in the breeding population, environmental effects, and low efficiency for complex and less heritable traits, lead breeders to choose MAB. MAB requires different molecular markers, chosen according to the available information on the linkage of traits with markers. However, MAB is efficient mainly for traits governed by a small number of quantitative trait loci (QTLs); for complex traits such as yield, quality, and responses to biotic and abiotic stress, which are governed by a large number of minor QTLs, it is not effective. The next-generation sequencing (NGS) approach opened an era of data science in which millions of bases are sequenced in one run, greatly reducing the time and cost of sequencing. In this chapter, we describe the status, recent development, and application of NGS in vegetable crops for their utilization in practical improvement approaches.
Keywords: MAB, QTL, GBS, transcriptomics, NGS
1.1 Introduction
The next-generation sequencing (NGS) approach has brought exemplary development to the life sciences and, with the power of high-throughput technology, shifted sequencing studies from “model organism” to “every organism”. NGS techniques became commercially available in 2005, and since then the field has developed at an astonishing rate, much like an evolutionary process, to address important and unexplored questions of plant systems. These approaches can be broadly divided into three main categories: sequencing by synthesis, sequencing by ligation, and single-molecule sequencing. Emerging technologies have also made sequencing possible without library preparation, for example by using quantum dot (qdot)-derived fluorescence resonance energy transfer (FRET) to detect the incorporation of fluorescently labeled nucleotides. Another approach is nanopore sequencing, in which the chemical or electronic properties of bases (DNA or RNA) are analyzed directly as they pass through nanopores. The evolution of NGS alongside these technological advances is depicted in Figure 1.1.
Figure 1.1 Timeline of next-generation sequencing technology.
With the advancement of high-throughput sequencing platforms, sequence data have been generated for a large and growing number of plant species, and understanding of plant systems at the nucleotide level has therefore been enriched in horticultural crops such as fruits, vegetables, spices, and plantation crops.
Among vegetables, the tomato was a pioneer in identifying the genetic basis of quantitative traits and in the map-based cloning of genes and quantitative trait loci (QTLs) [1]. Through the creation of extensive molecular marker libraries [2], genetic and physical maps [3], and mapping populations [4], tomato has served as a model plant for improvement and inheritance studies since the early 1990s. The Tomato Genome Consortium started the project called “SOL-100”, which involves NGS-based sequencing of 100 different species of the Solanaceae family and relating their sequences to the reference genome (solgenomics.net).
Completed genome: Solanum lycopersicum; Solanum tuberosum; Capsicum annuum; Solanum melongena; Solanum lycopersicoides; Solanum pennellii; Solanum pimpinellifolium; Solanum lycopersicum var. cerasiforme; Iochroma cyaneum
Draft genome: Nicotiana attenuata; Nicotiana benthamiana; Nicotiana tabacum; Petunia axillaris; Petunia inflata; Solanum chilense; Coffea humblotiana
Projects based on resequencing: Solanum lycopersicum inbreds; 150 Tomato Genome Resequencing Project; BGI Tomato 360 genomes; Varitome Project
The tomato cultivar “Heinz 1706” is estimated to have a genome size of approximately 900 Mb with a simple structure composed of two main components: pericentromeric heterochromatin with repetitive sequences, occupying about 75% of the genome, and distal euchromatin comprising the remaining 25% (220 Mb). It was initially sequenced through the BAC-by-BAC approach, which yielded 117 Mb of precisely sequenced euchromatic regions [5]. This cultivar was chosen mainly because a well-characterized HindIII BAC library was available for it at that time [6]. In 2008, to accelerate progress, 30,800 BAC clones were selected and pooled, and shotgun sequencing was carried out using the Sanger method (The Tomato Genome Consortium, 2012). The sequences were assembled into contigs that covered 540 Mb of the genome. In 2009, the emerging NGS platforms allowed the sequencing consortium to plan for whole-genome sequencing, whereas the project had earlier been confined to the euchromatic regions. Different NGS technologies, such as 454/Roche GS FLX, Illumina Genome Analyzer, and SOLiD sequencing, were used. A de novo assembly of “Heinz 1706” was then built, drawing on the Sanger data. Assemblies were generated by independent programs, Newbler and CABOG, and merged later. Read mapping and base error correction resulted in more accurate data, with one base-calling error per 29.4 kb and one indel error per 6.4 kb [7, 8]. The resulting high-quality scaffolds were connected using two BAC-based physical maps and anchored using a high-density genetic map [9], introgression line mapping, and genome-wide BAC-FISH (fluorescence in situ hybridization). The final assembly of the tomato genome comprised 760 Mb in 91 scaffolds. Most of the gaps identified after aligning the scaffolds to the 12 chromosomes were confined to pericentromeric regions [7].
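To put the quoted accuracy figures in perspective, the short sketch below converts "one error per N bases" into an expected number of errors over the 760 Mb assembly. It is only a back-of-the-envelope illustration: the function names are hypothetical, and the calculation assumes errors are spread uniformly along the assembly, which the source does not state.

```python
# Figures quoted in the text; the uniform-error assumption is ours.
ASSEMBLY_SIZE_BP = 760_000_000     # final assembly size, 760 Mb
BASES_PER_SUBSTITUTION = 29_400    # one base-calling error per 29.4 kb
BASES_PER_INDEL = 6_400            # one indel error per 6.4 kb


def expected_errors(assembly_bp: int, bases_per_error: int) -> float:
    """Expected error count if errors were uniformly distributed."""
    return assembly_bp / bases_per_error


def per_base_rate(bases_per_error: int) -> float:
    """Per-base error probability implied by 'one error per N bases'."""
    return 1.0 / bases_per_error


substitution_errors = expected_errors(ASSEMBLY_SIZE_BP, BASES_PER_SUBSTITUTION)
indel_errors = expected_errors(ASSEMBLY_SIZE_BP, BASES_PER_INDEL)

print(f"Expected base-calling errors: about {substitution_errors:,.0f}")
print(f"Expected indel errors:        about {indel_errors:,.0f}")
print(f"Per-base substitution rate:   {per_base_rate(BASES_PER_SUBSTITUTION):.2e}")
```

Under these assumptions, even a high-quality assembly of this size would still be expected to carry tens of thousands of residual base-level errors, which is one reason later long-read reassemblies of the same cultivar remain valuable.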
The consortium also sequenced the LA1589 accession of wild S. pimpinellifolium to explore its diversity relative to the “Heinz 1706” reference genome. This was done with Illumina technology using the whole-genome shotgun approach. An assembly of 739 Mb was obtained which, when compared with the reference genome, showed a nucleotide divergence of only 0.6%, indicating a high level of similarity between the two species (The Tomato Genome Consortium, 2012). The first resequencing in tomato was attempted on four S. lycopersicum and four S. l. var. cerasiforme genotypes on the Illumina GAIIx platform, which generated a total of 4 million unique SNPs, 1,686 putative copy-number variations (CNVs), and almost 128,000 InDels [10]. These variations can be utilized for QTL and gene mapping. Long-read technologies brought further improvements in high-quality reference genomes, including those of the S. pimpinellifolium accessions LA2093, LA1589 [11], and LA1670 [12]; S. l. var. cerasiforme acc. LA1673 [12]; S. l. cv. Moneyberg [13]; S. l. cv. Heinz 1706 [14]; and S. lycopersicoides acc. LA2951 [15]. A total of 100 tomato accessions were sequenced using nanopore technology to detect variations, and de novo assemblies were released for 14 reference genomes [16]. A total of 238,490 structural variants (mostly insertions and deletions) were discovered. Functional analysis linked these variants to three major traits that are domestication and improvement targets: smoky flavor (not preferred by consumers), sb1 (branching pattern), and fw3.2 (a major QTL for fruit mass).
Capsicum annuum cv. CM334 was sequenced at high genome coverage and is considered the reference genome sequence for pepper [17]. The genome size was estimated to be 3.48 Gb, which is very large compared with other crops of the same family. The larger genome of pepper is thought to result from long terminal repeat (LTR) retrotransposons. In addition, transposable elements make up 76.4% of the CM334 genome. This study also provided deeper insights into the capsaicinoid biosynthesis pathway responsible for pepper pungency. For a better understanding of evolution and domestication, the cultivar Zunla-1 and the wild Chiltepin (C. annuum var. glabriusculum) were sequenced using Illumina technology through the whole-genome shotgun approach [18]. The authors reported 1,104 miRNA target genes associated with the capsaicinoid biosynthesis pathway, indicating a role for miRNAs in its regulation. Comparison of resequencing data from 20 genomes indicated that domestication happened through artificial selection. Phylogenetic analysis comparing the genomes of tomato, potato, and Arabidopsis with that of pepper led to the discovery of pepper-specific duplications in 13 gene families. A heterozygous F1 from a wide cross was sequenced and assembled to assess the ability to derive both haplotypes [19]. The pungency gene at the PUN1 locus was found to carry large insertions/deletions, which facilitates marker-assisted selection for pepper improvement.
Because potato is a highly heterozygous polyploid, it is much more difficult to sequence than a diploid: for any given gene in a genotype, up to four different alleles per locus may be present. Genotyping systems must therefore differentiate among alleles and quantify the copy number of each allele. To overcome this problem, a doubled monoploid was used for sequencing. A high-quality draft genome of 844 Mb was sequenced from this unique doubled monoploid using a combination of Sanger, Roche/454, and Illumina methods and assembled de novo; it is considered the potato reference genome [20]. This work provided insight into the genome evolution of eudicots. Additionally, 3.67 million SNPs and 275 gene-specific (presence/absence) alterations were identified, leading to the conclusion that homozygous alleles in the doubled monoploid are the reason for its reduced vigor. A total of 15,235 genes were found to be expressed in developing tubers, giving valuable insights into the evolution of tuber development. The evolutionary innovation of tuberization is confined to the Petota section of Solanum. Although tomato is a closely related species, it did not acquire this trait. These insights call for further study of the evolutionary pathway to gain a deeper understanding of this genus.
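The allele copy-number problem described above can be illustrated with a small calculation. The following is a minimal sketch, assuming a biallelic SNP and unbiased read sampling; the function name and the read counts are hypothetical and are not taken from the cited study. Practical genotype callers for polyploids rely on probabilistic models rather than simple rounding.

```python
def estimate_dosage(ref_reads: int, alt_reads: int, ploidy: int = 4) -> int:
    """Estimate the copy number (dosage) of the alternate allele at one SNP.

    Assumes a biallelic site and unbiased read sampling (a simplification).
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    alt_fraction = alt_reads / total
    # Snap the observed fraction to the nearest of 0/ploidy ... ploidy/ploidy.
    return int(alt_fraction * ploidy + 0.5)


# Hypothetical (reference, alternate) read counts at three sites
# of a tetraploid genotype.
for ref, alt in [(30, 10), (21, 19), (0, 40)]:
    print(ref, alt, estimate_dosage(ref, alt))
```

In a doubled monoploid every locus is homozygous, so only dosages of 0 or the full ploidy are expected, which is what makes such material easier to assemble.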
Eggplant stands out among the Solanaceae because it is phylogenetically unique in being indigenous to the Old World, whereas other crops of this family such as tomato, chili pepper, and potato originated in South America. The first draft genome of eggplant cv. Nakate-Shinkuro was built using an Illumina HiSeq 2000 sequencer, resulting in 33,873 scaffolds that covered 74% of the whole genome [21]. The reference genome of the inbred line “67/3” was also sequenced with Illumina technology and assembled de novo [22]. The inbred line HQ-1315 was sequenced using a combination of Illumina, Nanopore, and 10x Genomics technologies and assembled with the help of Hi-C to produce a high-quality reference genome of ∼1.17 Gb [23]. The large genome size is associated with long terminal repeat (LTR) retrotransposons, which make up 70.09% of the whole genome.
Okra is one of the major vegetable crops of the Malvaceae family. The plastid and mitochondrial genomes of okra were sequenced using a combination of Illumina and Nanopore NGS [24]. The study found that the plastid genome is somewhat more conserved, whereas the mitochondrial genome has subgenomic configurations. The authors also observed extensive transfer of sequences between the organelles, for instance, the presence of plastid genes (psaA, rps7, and psbJ) in the mitochondrial genome.
The Brassica genome sequencing consortium was started in 2003 and was initially confined to sequencing the diploid A genome of B. rapa using the BAC approach and the Sanger technique. With the advent of NGS techniques, many genome sequencing projects emerged. The International B. rapa Genome Sequencing Project Consortium released the B. rapa (Chinese cabbage) genome annotated with 41,174 protein-coding genes [25]. This study provided new insight into the expansion of the Brassica lineage. A second sequencing effort targeted the diploid C genome of B. oleracea using a combination of Roche/454 and Illumina. Later on, reference genomes for all three Brassica progenitors and for allopolyploid species such as B. napus and B. juncea were released [25–28, 30].
The first genome sequence of radish was reported in 2014. The “Aokubi” inbred line, which has an S-h haplotype, was sequenced using Illumina NGS [31]. This study also reported that radish and Chinese cabbage share common ancestral genes. A second genome sequencing effort used a combination of Illumina and Roche 454 methods [32]. Later, a chromosome-scale draft genome of radish was produced with Illumina, Roche, and PacBio NGS technologies [33]. In addition to cultivated radish, a wild relative, R. raphanistrum, was also sequenced using the Illumina approach [34]. Further efforts are still needed to understand the relationships with wild relatives, to mine QTLs, and to develop useful markers for crop improvement.
The cucumber (Chinese Long inbred line 9930) was sequenced using both Sanger and next-generation Illumina sequencing [35]. A semi-wild cultivar “GY14” (University of Wisconsin) and a wild accession “PI 183967” were also sequenced by Qi et al. [36]. A Spanish public–private initiative called MELONOMICS was started in 2009 with the aim of developing a draft genome of melon using the NGS whole-genome shotgun approach. The chloroplast and mitochondrial genomes of melon were assembled, and, surprisingly, the melon mitochondrial genome is one of the largest ever reported in plants [37]. The difference in genome size between cucumber (367 Mb) and melon (454 Mb) is possibly due to the amplification of transposable elements, in the form of long terminal repeats, in melon [38] and the lack of whole-genome duplication (WGD) in the melon lineage. This variation is reflected in phenotypic and quality differences between melon and cucumber, for example in genes related to stress and flavor [39]. Based on the syntenic relationships among the cucurbit genomes, the ancestral cucurbit karyotype was inferred to comprise 15 protochromosomes [40]. The watermelon cultivar “97103” was sequenced using Illumina, and a high-quality draft genome of 353.5 Mb was assembled de novo [41]. The watermelon genome has also undergone 27 fissions and 28 fusions, indicating more shuffling than in bottle gourd [42]. This study validates the 11-chromosome structure of watermelon.
Carrot belongs to the Umbelliferae or Apiaceae family, and the genome of the orange carrot doubled haploid line “DC 27” was sequenced using the NGS whole-genome shotgun method [43]. The de novo assembly was performed using Illumina and Roche 454 data. In this study, chalcone-flavone isomerase (CHI), flavonoid 3’-monooxygenase (F3’H), and UDP-galactose gene sequences were identified for the first time in carrot. In 2016, a chromosome-scale whole-genome sequence of DH 1 was prepared using Illumina at the Beijing Genomics Institute (BGI), 2000. Sequencing of an intragenic region of the mitochondrial genome provided evidence for the transfer of DNA fragments between the plastid and the mitochondria [44].
A draft genome of Moringa oleifera was sequenced, representing an assembly of >90% of the genome size [45]. Comparative analysis with woody species revealed evolutionary relationships and helped in the identification of species-specific genes. Genomics studies carried out in different vegetable crops, together with the estimated genome sizes and the sequencing technologies applied, are listed in Table 1.1.
Table 1.1 Genomics studies on different vegetable crops using sequencing technology.
| Family | Accession/Species | Estimated genome size | Sequencing technology | Reference |
| --- | --- | --- | --- | --- |
| Solanaceae | Heinz 1706 (S. lycopersicum) | 900 Mb | Sanger + Roche 454 | Tomato Genome Consortium, 2012 |
| | LA 1589 (S. pimpinellifolium) | 923 Mb | Illumina | |
| | LA0480 (S. pimpinellifolium) | 900 Mb | Illumina | [46] |
| | LA2093 (S. pimpinellifolium) | 923 Mb | Illumina + PacBio | [11] |
| | LA0716 (S. pennellii) | 1.2 Gb | Illumina | [47] |
| | LYC1722 (S. pennellii) | 1.2 Gb | Nanopore | [48] |
| | LA3111 (S. chilense) | 1.2 Gb | Illumina | [49] |
| | CM334 (C. annuum) | 3.48 Gb | | [17] |
| | Zunla-1 (C. annuum) | 3.26 Gb | | [18] |
| | Chiltepin (C. a. var. glabriusculum) | 3.07 Gb | | [18] |
| | PI159236 (C. chinense) | 3.14 Gb; 3.2 Gb | | [17, 50] |
| | PBC81 (C. baccatum) | 3.9 Gb | | [50] |
| | C. annuum | 3.26 Gb | 10x Genomics | [19] |
| | cv. Nakate-Shinkuro (S. melongena) | 1.13 Gb | Illumina + Roche 454 | [21] |
| | cv. 67/3 (S. melongena) | 1.04-1.21 Gb | Illumina | [22] |
| | Guiqie 1 (S. melongena) | 1.21 Gb | Illumina + Hi-C | [51] |
| | HQ-1315 (S. melongena) | 1.21 Gb | Illumina + Nanopore + Hi-C + 10x Genomics | [23] |
| Brassicaceae | B. rapa ssp. pekinensis | 485 Mb | Illumina + PacBio + Hi-C | [25, 52, 53] |
| | B. napus | 1.13 Gb | Sanger + Illumina + 454 | [26] |
| | B. juncea var. tumida | 922 Mb | PacBio + Illumina | [29] |
| | B. oleracea var. capitata | 630 Mb | Sanger + 454 + Illumina | [27] |
| | Raphanus sativus | 528.6 Mb | Sanger + Illumina | [31] |
| Cucurbitaceae | cv. 9930 (C. s. var. sativus) | 367 Mb | Sanger + Illumina + PacBio + 10x Genomics + Hi-C | [35, 51, 54] |
| | cv. Gy 14 | 367 Mb | 454 | [55] |
| | cv. B10 | 367 Mb | Sanger + 454 | [56] |
| | PI 183967 (C. sativus var. hardwickii) | 367 Mb | Sanger + Illumina | [36] |
| | Cucumis melo | 450 Mb | Sanger + 454 | [39, 57] |
| | Citrullus lanatus | 425 Mb | Illumina + PacBio + Hi-C | [41, 42, 58] |
| | Cucurbita maxima | 386.8 Mb | Illumina | [30] |
| | Cucurbita moschata | 372 Mb | | [30] |
| | Cucurbita pepo ssp. pepo | 283 Mb | | [59] |
| | Cucurbita argyrosperma ssp. argyrosperma | 238 Mb | Illumina + PacBio | [60] |
| | Lagenaria siceraria | 334 Mb | Illumina | [61] |
| | Momordica charantia | 339 Mb | Illumina, PacBio | [62, 63] |
| | Benincasa hispida | 1.02 Gb | Illumina + PacBio | [64] |
| Fabaceae | Vigna unguiculata | 560 Mb | PacBio | [65] |
| | Phaseolus vulgaris | 587 Mb | Sanger + 454 + Illumina | [66] |
| Amaranthaceae | Spinacia oleracea | 1 Gb | Illumina | [67] |
| | Beta vulgaris ssp. vulgaris | 714-758 Mb | Sanger + 454 + Illumina | [68] |
| Asteraceae | cv. Salinas (Lactuca sativa) | 2.7 Gb | Illumina | [69] |
| Apiaceae | Daucus carota ssp. sativus | 473 Mb | Sanger + Illumina | [44] |
| | Coriandrum sativum | 2.13 Gb | PacBio + Illumina + 10x Genomics + Hi-C | [70] |
| Asparagaceae | Asparagus officinalis | 1.3 Gb | PacBio + Illumina | [71] |
| Moringaceae | Moringa oleifera | 315 Mb | Illumina | [45] |
Genotyping by sequencing (GBS) was used to generate SNPs in tomato cultivars of four types, i.e., large-fruited, cherry-fruited, grape-fruited, and rootstocks [72]. A total of 10,615 SNPs were generated and divided into five subsets. One subset, comprising 224 markers, was polymorphic in the 91 cultivars tested and could also distinguish 139 F1 cultivars from the reference genome. The results suggest that these markers are useful in DNA barcoding for the identification of varieties (Table 1.2). GBS was also used in C. annuum, where 109,610 SNPs were generated. These were used in QTL mapping and GWAS for the capsaicinoid content of peppers [73]. Five candidate genes involved in capsaicinoid biosynthesis were successfully identified from 69 QTL regions [74]. The sequencing of sweet potato varieties led to the generation of various SNPs and InDels related to starch biosynthesis [75].
Table 1.2 List of markers used for studying different cultivars of vegetable crops.
| Family | Species | Marker type | Number of markers | Method | References |
| --- | --- | --- | --- | --- | --- |
| Solanaceae | S. lycopersicum | SNP | 10,615 | GBS | [72] |
| | S. lycopersicum | SNP | 3614 | WGS | [76] |
| | S. lycopersicum × S. pennellii; S. lycopersicum × S. pimpinellifolium | SNP | 141,083 | GBS | [77] |
| | S. lycopersicum | SNP | 4,812,432 | Modified SAM | [17] |
| | S. pimpinellifolium | SNP | 4,680,647 | Modified SAM | [17] |
| | S. lycopersicum | SNP | 8,784 | GBS | [78] |
| | S. melongena; S. incanum | SSR | 11,262; 11,829 | De novo transcriptome | [79] |
| | S. melongena | SNP | 10,000 | RAD-seq | [80] |
| | C. annuum | SNP | 1.76 million | Resequencing | [81] |
| | C. annuum | InDel | 14,498 | Whole genome resequencing | [82] |
| | C. annuum | SNP | 109,610 | GBS | [74] |
| | S. tuberosum | SNP | 575,340 | De novo transcriptome | [83] |
| | S. tuberosum | SNP; InDel | 27 million; 3 million | Resequencing | [84] |
| Convolvulaceae | Ipomoea batatas | SNP; InDel | 62285 | GBS | [75] |
| Brassicaceae | B. napus | SSR | 21,523 | GWAS | [85] |
| | B. oleracea | SNP; InDel | 496,463; 37,493 | WGS | [86] |
| | B. napus | SNP | 37,721 | GBS | [87] |
| | R. sativus | SNP | 52,559 | RAD-seq | [88] |
| Cucurbitaceae | Cucurbita spp. | SNP | 37,869 | GBS | [89] |
| | C. melo | SNP | 375 | GBS | [90] |
| | C. sativus | SSR | 2171 | GWAS | [91] |
| | C. lanatus | SNP | 203,894-279,412 | WGRS | [92] |
| | M. charantia | InDel | 389,487 | WGS | [93] |
| | L. siceraria | SSR | 45,066 | RAD-seq | [94] |
| Fabaceae | P. vulgaris | SNP; InDel | 43,698; 1267 | WGS | [95] |
| | V. unguiculata | SNP | 1031 | GBS | [96] |
| Amaranthaceae | Amaranthus spp. | SNP | 27,658 | WGS | [97] |
| | S. oleracea | SSR | 3852 | WGS | [98] |
| Apiaceae | D. carota | SNP | 3636 | WGS | [99] |
In red tomato skin, the yellow flavonoid naringenin chalcone (NGC) determines the exterior color of the fruit [100]; it accumulates naturally in the cuticle of red fruit skin as the fruit ripens and gives the peel its yellow color [101]. Pink tomatoes, on the other hand, have a transparent epidermis devoid of the yellow pigment NGC. Since NGC biosynthesis is controlled by the SlMYB12 gene at the Y locus on chromosome 1, DNA markers derived from this gene would be helpful for marker-assisted selection (MAS) of tomato fruit color. To create a gene-based marker, the 4.9 kb SlMYB12 gene was sequenced from the line “FCR” (red-fruited, YY) and the line “FCP” (pink-fruited, yy). Sequence alignment showed no differences between the “FCR” and “FCP” alleles. SlMYB12 was physically located between CAPS-456 and CAPS-38123, indicating that fruit peel color in cultivated tomato is controlled by SlMYB12 [102]. Most recently, an S. lycopersicum × S. pimpinellifolium RIL population was used to create an “ultra-high density” tomato genetic map comprising 141,083 SNP markers divided into 2,869 genomic bins. This map was also employed for the fine mapping of genes and QTLs related to tomato fruit weight and lycopene concentration [77]. A new high-density genetic bin map for tomato was created using genotyping-by-sequencing (GBS) and a distinct S. lycopersicum × S. pimpinellifolium population. The map includes 1,195 genetic bins and 8,470 SNPs and was used to locate the late blight resistance gene Ph-5 accurately through fine mapping [24].
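The idea behind the bin maps mentioned above is that neighboring SNPs showing the same genotype in every line of the population carry no recombination information relative to each other and can be collapsed into one bin. The following is a minimal sketch under simplifying assumptions: no missing data, no genotyping errors, and markers already sorted by position. The function name and the genotype calls are hypothetical and do not reproduce the procedure of the cited studies.

```python
from itertools import groupby


def build_bins(markers):
    """Collapse consecutive markers with identical genotype patterns into bins.

    `markers` is a list of (position, pattern) tuples sorted by position, where
    `pattern` holds one genotype call per RIL ('A' or 'B' for the two parents).
    Returns a list of (start, end, pattern, n_markers) tuples.
    """
    bins = []
    for pattern, group in groupby(markers, key=lambda m: m[1]):
        positions = [pos for pos, _ in group]
        bins.append((positions[0], positions[-1], pattern, len(positions)))
    return bins


# Hypothetical calls for five RILs at six SNPs on one chromosome.
snps = [
    (1_000, "AABBA"),
    (1_500, "AABBA"),
    (2_200, "AABBA"),
    (9_800, "ABBBA"),   # one RIL recombined between 2,200 and 9,800
    (10_400, "ABBBA"),
    (15_000, "ABBBB"),  # another RIL recombined before 15,000
]

for start, end, pattern, n in build_bins(snps):
    print(f"{start}-{end}: {n} SNPs, pattern {pattern}")
```

Because each bin behaves as a single marker, a map with many thousands of SNPs can be analyzed as a few thousand bins, which is what makes fine mapping on such dense maps tractable.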
To identify single-nucleotide polymorphism (SNP) markers for powdery mildew (PM) resistance, high-throughput genotyping by sequencing (GBS) was applied to a population of 188 F5 lines of Capsicum annuum derived from the parents AR1 (PM resistant) and TF68 (PM susceptible) [103]. These SNP markers were then used to construct a genetic linkage map and perform QTL analysis. The 1,308 SNP markers, distributed across the chromosomes, formed 12 linkage groups with a total map length of 2506.8 cM. Moreover, two QTLs for powdery mildew resistance, Pm-2.1 and Pm-5.1, were discovered on chromosome 2 and chromosome 5, respectively. The development of novel pepper cultivars with improved resistance to bacterial wilt will benefit from the identification of SNP markers linked to resistance to this disease. Using whole-genome resequencing, Ahn et al. [104] discovered SNPs across the entire genome. Two pepper cultivars with contrasting bacterial wilt resistance, Saengryeg 211 (susceptible) and 82PR66 (resistant), were sequenced and compared with the reference sequence of C. annuum cv. CM334. The density of SNPs varied among the chromosomes, with Saengryeg 211 and 82PR66 showing the highest SNP density on chromosomes 10 and 11, respectively. Intra- and inter-specific linkage maps for QTL mapping of anthocyanin pigmentation in brinjal (S. melongena) were studied by Barchi et al. [22]. SNPs were produced using high-throughput sequencing (Illumina) and restriction site-associated DNA (RAD). The NGS technologies and RNA-seq methods used to study major breeding traits in cucumber, such as fruit development [105, 106], parthenocarpy [107], the formation of trichomes that serve as a plant’s defense against biotic and abiotic stresses [108], early regulatory processes in response to N deficit and transcriptome responses in C. sativus leaves [109], and the mechanism of melatonin-induced lateral root formation, are listed in Table 1.3.
In watermelon breeding, dwarfism is a valuable trait because it contributes to higher yields and reduces the labor needed for cultivation and harvesting. However, the genetic regulation of this trait is not well understood. To investigate it, researchers used NGS to analyze watermelon samples and identified a candidate dwarfism gene, Cla010726. They carried out whole-genome resequencing of DNA bulks from dwarf and vine pools of an F2 population and detected a genomic region containing the candidate gene through a genome-wide analysis of SNPs [112]. Rao et al. [113] developed a high-density linkage map for bitter gourd using genotyping-by-sequencing (GBS) technology and performed QTL analysis for six major yield-contributing traits. The study used a mapping population generated from the cross DBGy-201 × Pusa Do Mausami and identified 19 QTLs for the six quantitative traits. The QTLs derived from each parent had either a positive or a negative additive effect on trait scores. The phenotypic variation explained by the QTLs ranged from 0.09% to 32.65%, and a total of six major QTLs were detected.
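The comparison of two DNA bulks described above can be sketched numerically. The text does not state which statistic the cited study used; the example below assumes the commonly used SNP-index (the fraction of reads carrying the non-reference allele) and its difference between bulks. All positions, read counts, and the cut-off are hypothetical.

```python
def snp_index(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads in a bulk that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("site has no coverage")
    return alt_reads / total_reads


def delta_snp_index(dwarf, vine):
    """Difference in SNP-index between two bulks at one site.

    Each argument is an (alt_reads, total_reads) tuple.
    """
    return snp_index(*dwarf) - snp_index(*vine)


# Hypothetical read counts: position -> (dwarf bulk, vine bulk).
sites = {
    101_000: ((12, 40), (14, 40)),
    205_000: ((38, 40), (11, 40)),  # strongly skewed towards the dwarf bulk
    310_000: ((20, 40), (19, 40)),
}

THRESHOLD = 0.5  # arbitrary cut-off chosen for this illustration
candidates = [
    pos for pos, (dwarf, vine) in sites.items()
    if abs(delta_snp_index(dwarf, vine)) >= THRESHOLD
]
print(candidates)
```

Sites unlinked to the trait are expected to show similar allele frequencies in both bulks, so a region where the difference departs consistently from zero is a candidate location for the causal gene.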
Table 1.3 Different NGS techniques used for targeted breeding of vegetable crops.
| Sr. no. | Targeted breeding trait | NGS technique | References |
| --- | --- | --- | --- |
| 1. | Parthenocarpy, a crucial characteristic of fruit crops affecting yield and quality | RNA-seq (Illumina HiSeq 2000) | [107] |
| 2. | Trichomes, which resemble epidermal hairs and serve as a plant’s defence mechanism against biotic and abiotic stresses | RNA-seq (Illumina HiSeq 2000) | [108] |
| 3. | Early regulatory mechanism of plants in response to N starvation; transcriptome response of leaves under N deficiency | RNA-seq (Illumina HiSeq 2000) | [109] |
| 4. | Molecular mechanisms of plant sex determination | RNA-seq (Illumina) | [110] |
| 5. | Factors underlying melatonin-induced lateral root development in salt-stressed plants | RNA-seq (Illumina) | [111] |
Metagenomics seeks to characterize the microbial and viral populations in targeted samples through nucleic acid sequencing [114]. Very few metagenomic studies have been conducted in vegetable crops. Tomato production is threatened worldwide by begomoviruses, and studies were therefore undertaken to understand the population dynamics of begomoviruses affecting tomato. Next-generation sequencing was employed to assess the diversity of single-stranded DNA viruses in tomatoes with or without the Ty-1 gene, which provides tolerance to begomoviruses. Leaf samples with begomovirus-like symptoms were enriched for circular DNA and sequenced using Illumina technology. Fifteen distinct viruses/subviral agents were detected, mainly in mixed infections, a finding with implications for the use of virus-specific resistance in tomato breeding. Additional viruses, including two distinct Begomovirus species, a new alpha-satellite, and a new gemycircularvirus, were found in tomatoes lacking the Ty-1 gene. The only begomovirus that caused serious symptoms in Ty-1 plants was a unique species identified exclusively in the Ty-1 pool. This study sheds light on the potential adaptation of begomoviruses to Ty-1 and its effects on a subset of single-stranded DNA viral/subviral agents [115]. To investigate the microbial communities present on Chinese cabbage, NGS of 16S rRNA gene amplicons was employed. The results indicated that the bacterial communities on Chinese cabbage were primarily composed of Proteobacteria and Bacteroidetes, with Chryseobacterium, Aurantimonadaceae, Sphingomonas, and Pseudomonas being the most abundant taxa. The study also detected diverse potential pathogens, such as Pantoea, Erwinia, Klebsiella, Yersinia, Bacillus, Staphylococcus, Salmonella, and Clostridium. 
Although further studies are required to determine the association between these potential pathogens and foodborne illness, the study suggests that metagenomic approaches can be used to detect pathogenic bacteria on fresh vegetables [116]. Gawande et al. [117] used the Illumina MiSeq platform to analyze the gut microbial community structure of adult onion thrips (Thrips tabaci) collected from 10 agro-climatically diverse locations in India. The analysis revealed 1662 OTUs belonging to 21 bacterial phyla, with Proteobacteria being the predominant phylum. The Colorado potato beetle (CPB) is a serious insect pest that can develop resistance to insecticides. Biological insecticides based on viruses may be a promising approach to control this pest, but there is limited information on viruses that infect leaf-feeding beetles. A metagenomic analysis of 297 CPB genomic and transcriptomic samples was therefore performed. The study discovered 3137 virus-positive contigs linked to various viruses from six types, 17 orders, and 32 families, corresponding to over 97 virus species. The sequences were homologous to insect viruses, plant viruses, and endogenous retroviral elements. The findings imply that further investigation may help to locate novel viruses and so widen the range of biopesticides available to combat CPB [118].
The transcriptome is the functional derivative of the genome and may be defined as the complete set of transcripts present in a cell at a given developmental stage and under a particular physiological condition. It comprises all the different RNA molecules present in a cell at a particular time. The study of the transcriptome therefore has the potential to reveal the functional role of the genome. In addition, transcriptome sequencing can yield information about the genes of an organism at a significantly lower cost than genome sequencing, because only the transcribed regions have to be sequenced and studied. Today, many research projects focus on the transcriptome rather than the genome, as only around 1–2% of a genome is coding. Some recent transcriptomic studies that highlight the multidimensional applications of NGS in the transcriptomics of vegetable crops are described here in detail.
a) Discovery of Novel Genes
Transcriptome analysis has been especially helpful in identifying key genes that play an important role in living systems. Several such genes have been identified in vegetable crops. For example, the genes CsACS, CsAsr1, and CsIAA2, identified by transcriptome analysis, were found to be responsible for sex determination in cucumber [110]. Another set of cucumber genes, namely CsCUC3 and CsSTM, was found to regulate fruit spine development [108]. This approach has also been extended to tissue-specific genes. The identification of several genes involved in anthocyanin biosynthesis that are specifically active in the roots and leaves of carrot will help accelerate breeding efforts to improve the nutritive value of carrots [119].
b) Understanding the Molecular Mechanisms Underlying Biological Functions
Understanding ripening can help us breed fruits with a better shelf life that retain their nutritional quality for long periods. A mutant form of LeMADS-RIN, a MADS-box gene, has been found to inhibit ripening in tomato. A complete understanding of this mutation has the potential to explain the genetic regulatory pathways that lead to fruit ripening. The fruit transcriptomes of a mutant and a non-mutant genotype were compared to gain better insight into the role of ethylene in ripening [120]. Mutant cauliflowers with green curds caused by ectopic chloroplasts were subjected to genome-wide profiling of gene expression by RNA-seq. As a result, several genes associated with the development and differentiation of chloroplasts were discovered. The results pointed to the significance of regulatory genes in light signaling pathways [121]. In less-studied vegetable crops such as sweet potato, characterization of the transcriptome helps to unravel the molecular aspects of cellular processes such as the development and differentiation of roots and leaves, differential gene expression in different tissues, and potential responses of the plant to biotic and abiotic stress [122]. All the genomic and transcriptomic data available for Daucus carota ssp. sativus were compiled into CarrotDB (Carrot Data Base). This database contains tools for genome mapping and the Basic Local Alignment Search Tool (BLAST). The resource has many potential uses: it may be used to discover the sequences of novel or putative genes or scaffolds, to develop SSR markers, to predict the sequences of different proteins found in carrot, and so on [43].
c) Development of Markers
Trick et al. [123] developed a method of SNP discovery in polyploid B. napus cultivars using transcriptome analysis and the Illumina platform. Sequence reads were aligned to Brassica unigene sequences with the help of the MAQ software. Consequently, 23,330–42,593 putative SNPs were detected, depending on the read depth. Of these, ∼90% were hemi-SNPs, which means that they appeared homozygous in one line but heterozygous in the other [124]. These hemi-SNPs can serve as powerful tools for genetic mapping. Blanca et al. [125] carried out their studies in the previously unexplored C. pepo. They sequenced many genes, found several allelic variations, and developed SSR and SNP markers as a result of their study. Their findings will accelerate the breeding prospects of squash. Such marker discovery was also carried out by Nicolai et al. [126] in capsicum.
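The distinction between hemi-SNPs and ordinary SNPs described above can be expressed as a simple rule on the bases observed in each line. The sketch below is an illustration only: it assumes that base calls per line are already available as sets, the function name and the label “simple” are choices made for this example, and the sites are hypothetical. It does not reproduce the MAQ-based pipeline of the cited work.

```python
def classify_snp(bases_line1: set, bases_line2: set) -> str:
    """Classify a site from the bases observed in two lines.

    'simple': each line shows a single base and the two bases differ.
    'hemi':   one line shows a single base, the other shows two bases.
    """
    n1, n2 = len(bases_line1), len(bases_line2)
    if n1 == 1 and n2 == 1:
        return "simple" if bases_line1 != bases_line2 else "monomorphic"
    if {n1, n2} == {1, 2}:
        return "hemi"
    return "other"


# Hypothetical base calls for two cultivars at four sites.
sites = {
    "site1": ({"A"}, {"G"}),
    "site2": ({"A"}, {"A", "G"}),
    "site3": ({"C", "T"}, {"C"}),
    "site4": ({"T"}, {"T"}),
}

for name, (line1, line2) in sites.items():
    print(name, classify_snp(line1, line2))
```

In an allopolyploid such as B. napus, a site can show two bases within one line because reads from both homoeologous genome copies align to the same reference sequence, which is one reason hemi-SNPs are so frequent.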
d) Understanding Abiotic Stress
Abiotic stresses such as high temperature, soil salinity, excess humidity, drought, and metal deficiency or toxicity have an adverse effect on plants, and plants counter these stresses by altering their genetic responses. As a result, the gene expression pattern in plants changes drastically on exposure to abiotic stress. Transcriptomic analysis of these changes has great research potential. Lee and Choi [127] carried out a comparative transcriptome analysis of non-stressed and cold-stressed pepper and identified many genes, hormones, and processes involved in cold stress tolerance. Drought tolerance has been observed in potato plants transformed with the yeast trehalose-6-phosphate synthase (ScTPS1) gene. The transcriptome profiles of the transgenic and wild-type plants can be compared for differential gene expression under drought stress conditions [128]. The molecular response of radish to the toxic heavy metal lead was studied by Wang et al. [129] using RNA-seq technology. The differentially expressed genes (DEGs) upregulated under lead stress were mainly involved in glutathione metabolism-related processes and cell wall defense. The downregulated DEGs, on the other hand, were predominantly involved in carbohydrate metabolism-related pathways.
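The notion of up- and downregulated DEGs used above can be illustrated with a log2 fold change computed from expression values under control and stress conditions. This is a deliberately simplified sketch: the gene names, expression values, pseudocount, and cut-off are hypothetical, and real DEG analyses rely on replicated samples and statistical testing rather than a fold-change threshold alone.

```python
import math


def log2_fold_change(control: float, stressed: float,
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of stressed to control expression.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return math.log2((stressed + pseudocount) / (control + pseudocount))


def classify(control: float, stressed: float, cutoff: float = 1.0) -> str:
    """Label a gene 'up', 'down', or 'unchanged' by its log2 fold change."""
    lfc = log2_fold_change(control, stressed)
    if lfc >= cutoff:
        return "up"
    if lfc <= -cutoff:
        return "down"
    return "unchanged"


# Hypothetical normalized expression values: gene -> (control, stressed).
expression = {
    "geneA": (15, 127),
    "geneB": (255, 31),
    "geneC": (99, 110),
}

for gene, (control, stressed) in expression.items():
    lfc = log2_fold_change(control, stressed)
    print(f"{gene}: log2FC = {lfc:.2f}, {classify(control, stressed)}")
```

The lists of up- and downregulated genes obtained in this way are what is subsequently tested for enrichment in pathways such as glutathione or carbohydrate metabolism.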
e) Identifying Viruses Infecting Vegetables
When a plant is attacked by viruses or viroids, specific RNA molecules are synthesized. These RNAs are 21 to 24 nucleotides long and are called short interfering RNAs (siRNAs). They survey the cytoplasm for viral sequences homologous to the inducer and, upon recognition, immediately pair with them to bring about their destruction. This phenomenon is called RNAi or RNA silencing. NGS of siRNAs has the potential to identify the viruses or viroids that infect plants. It can be a very sensitive identification tool, as it works even at very low concentrations, including in infections that do not show any symptoms. Most importantly, it will help to discover previously unknown viruses and viroids.
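A typical first step when working with small-RNA sequencing data for this purpose is to keep only reads in the siRNA size range given above. The sketch below assumes that adapter-trimmed reads are already available as plain strings; the function name and the reads are hypothetical.

```python
from collections import Counter


def select_sirna_reads(reads, min_len: int = 21, max_len: int = 24):
    """Keep reads whose length falls within the siRNA size range."""
    return [read for read in reads if min_len <= len(read) <= max_len]


# Hypothetical adapter-trimmed reads of 18, 21, 22, 24, and 30 nucleotides.
reads = [
    "ACGT" * 4 + "AC",
    "ACGT" * 5 + "A",
    "ACGT" * 5 + "AC",
    "ACGT" * 6,
    "ACGT" * 7 + "AC",
]

kept = select_sirna_reads(reads)
print(Counter(len(read) for read in kept))
```

The retained reads would then be compared with, or assembled and matched against, known viral and viroid sequences; that step depends on external tools and is not shown here.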
Polyploidization has played a major role during evolution in shaping complex genomes, especially in vegetable crops. As a result, vegetables such as Brassicas generally have large genomes with a large number of repeats. Even after the advent of NGS, whole-genome sequencing of a large number of individuals with such genomes remains difficult, inconvenient, and costly. One way to solve this problem without compromising on the amount of information gathered is “sequence capture” or “targeted sequencing”. In this technique, only the 1–2% of the genome that consists of high-value sequences needs to be sequenced. These may be sequences rich in functional parts of the genome or with few repetitive regions, that is, either specific genes of interest or targets within these genes. When the captured sequence represents the entire protein-coding region of the genome, it is called the exome.
Since a major portion of the genome is composed of repetitive and non-coding regions, this technique substantially reduces the bulk of the DNA to be sequenced while providing almost the same amount of information. This allows us to deeply focus on the important regions of the genome that are ultimately responsible for the production of genetic variation.
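The saving can be made concrete with a short calculation. The sketch below uses the B. napus genome size from Table 1.1 and a 2% target fraction from the range given above; the sequencing depth is an assumed value, and off-target reads and capture efficiency are ignored, so the figures are only indicative.

```python
def bases_required(genome_size_bp: float, depth: float,
                   target_fraction: float = 1.0) -> float:
    """Bases to sequence per individual for a given depth of coverage."""
    return genome_size_bp * target_fraction * depth


GENOME_SIZE = 1.13e9    # B. napus, 1.13 Gb (Table 1.1)
DEPTH = 30              # assumed depth of coverage
TARGET_FRACTION = 0.02  # upper end of the 1-2% range

whole_genome = bases_required(GENOME_SIZE, DEPTH)
captured = bases_required(GENOME_SIZE, DEPTH, TARGET_FRACTION)

print(f"Whole genome: {whole_genome / 1e9:.1f} Gb per individual")
print(f"Captured:     {captured / 1e9:.2f} Gb per individual")
print(f"Reduction:    {whole_genome / captured:.0f}-fold")
```

Under these assumptions the reduction equals the inverse of the target fraction, which is why the same sequencing budget can cover many more individuals when only the captured fraction is sequenced.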
Captured sequencing may be carried out by one of the following methods [130]:
Hybridization-based sequence capture
PCR-based amplification
Selective circularization
Of these, the hybridization-based approach is the most commonly used because it is simple, fast, and inexpensive, requires a minimal input of DNA, and allows a large amount of DNA to be analyzed [131]. Hybridization-based exome capture may be done on arrays or in solution. The arrays carry probes from the sequence library, which act as baits to enrich the target sequences; the enriched targets can then be sequenced by NGS techniques.
Exome and captured sequencing can contribute to crop research in the following ways:
Identification of mutations in exomes and quick generation of novel and targeted polymorphisms
Exploration of biodiversity and gene mining
Development of SNP markers
Construction of genetic maps
Study of population structure, evolutionary history, and phylogenetic relationships
QTL mapping and gene identification
Genomic selection
At present, very limited studies using exome and captured sequencing have been conducted in vegetable crops.
In Manihot esculenta (cassava), the transcriptomes of 16 accessions were sequenced using the Illumina HiSeq platform. As a result, 675,559 EST-derived SNP markers were identified [132]. A subset of these markers was genotyped by capture-based targeted enrichment sequencing in 100 F1 progeny derived from a cross between parents contrasting in starch viscosity. This study led to the discovery of a major quantitative trait locus (QTL) regulating starch pasting properties. Moreover, a novel QTL associated with starch pasting time was identified. This information can be used further in research and breeding activities.
The majority of disease resistance genes encode nucleotide-binding site leucine-rich repeat (NBS-LRR) proteins. NGS allows the identification of pathogen resistance gene families of this class. RenSeq technology was used in Solanum spp. (potato and tomato), and many NBS-LRRs were subsequently identified [133]. In this way, new disease resistance genes can be identified.
Uitdewilligen et al.[134] used an in-solution hybridization method called “SureSelect” for DNA sequencing. This reduced the complexity of the tetraploid cultivars of potatoes and allowed them to focus on 807 target