Applied Biotechnology and Bioinformatics - E-Book

Description

This comprehensive reference book discusses the convergent and next-generation technologies for product-derived applications relevant to agriculture, pharmaceuticals, nutraceuticals, and the environment.

The field of modern biotechnology is a multidisciplinary and groundbreaking area of biology that includes several cutting-edge methods arising from developments in forensics and molecular modeling. Bioinformatics is a full-fledged multidisciplinary field that combines advances in computer and information technology. Numerous applications of bioinformatics, primarily in gene and protein identification, structural and functional prediction, drug development and design, the folding of genes and proteins and their complexity, vaccine design, and organism identification, have contributed to the advancement of biotechnology. Biotechnology is also essential to crop improvement in agriculture because it allows genes to be transferred between plants to improve traits such as disease resistance and yield. It also plays a broad role in healthcare, including genetic testing, gene therapy, pharmacogenomics, and drug development. Its environmental applications include bioremediation and biodegradation, which use microbial technologies to clean up environmental contamination, as well as waste management technologies and the conversion of organic waste to biofuels. Bioinformatics plays a critical role in analyzing the different types of data created by high-throughput research methods, such as genomic, transcriptomic, and proteomic datasets, that are useful in addressing various problems related to disease management, clean environment, alternative energy sources, agricultural productivity, and more.

Audience

The book will interest biotechnology researchers and bioinformatics professionals working in the areas of applied biotechnology, bioengineering, biomedical sciences, microbiology, agriculture and environmental sciences.

Page count: 733

Year of publication: 2024


Table of Contents

Cover

Table of Contents

Series Page

Title Page

Copyright Page

Preface

Part I: AGRICULTURE

1 Next-Generation Sequencing in Vegetable Crops

1.1 Introduction

1.2 Next-Generation Sequencing Approach in Genomics

1.3 NGS Approach in Single-Nucleotide Polymorphic Markers Development

1.4 Next-Generation Sequencing Approach in Trait-Specific Breeding

1.5 Next-Generation Sequencing Approach in Metagenomics

1.6 Next-Generation Sequencing Approach in Transcriptomics

1.7 Next-Generation Sequencing Approach in Exome and Captured Sequencing

1.8 Applications of Exome and Captured Sequencing in Crop Research

1.9 Conclusion and Future Prospects

References

2 Application of Bioinformatics Tools in Rice Genomics Research

2.1 Introduction

2.2 Role of Genomics in Rice Research

2.3 Model Plant for Genomic Research: Rice

2.4 High-Throughput Sequencing

2.5 Genome-Wide Association Study (GWAS)

2.6 Bioinformatics Approach to Study Stress Conditions in Rice

2.7 Application of Bioinformatics Tools in Advanced Rice Genomics Research

2.8 Current Challenges of Bioinformatics Tools for Rice Genomics Research

2.9 Conclusion

Conflict of Interest

References

3 Computer-Aided Vaccine Design: Applications in Agriculture

3.1 Introduction

3.2 Agriculturally Important Animals

3.3 Diseases Affecting Animal Health in Agriculture

3.4 Vaccination in Agriculture

3.5 Vaccine

3.6 Intervention of Computer in Vaccine Designing

3.7 In Silico Vaccine Designing: Agricultural Applications

3.8 Conclusion and Future Prospects

References

4 Genomics to Phenomics: A Paradigm Shift in Crop Science Research

4.1 Introduction

4.2 Genomics in Crop Improvement

4.3 Advances in Genomics-Assisted Breeding

4.4 Phenotyping

4.5 Phenomics

4.6 Phenomics Approaches in Crop Improvement

4.7 Conclusion

References

Part II: PHARMACEUTICAL RESEARCH

5 Molecular Modeling and Drug Development

5.1 Introduction

5.2 Structure-Based Drug Design

5.3 Docking

5.4 Ligand-Based Drug Design

5.5 Pharmacophore

5.6 QSAR

5.7 Virtual Screening

5.8 Pharmacophore-Based VS

5.9 Similarity-Based VS

5.10 Homology Modeling and Protein Folding

5.11 In Silico Pharmacokinetics

5.12 Conclusion

References

6 Comparative Study on Tannase Sequence and Structure of Lactiplantibacillus: An In Silico Protein Variability Analysis and Its Impact on Microbial Speciation

6.1 Introduction

6.2 Materials and Methods

6.3 Results and Discussion

6.4 Conclusion

References

7 Probiotics: A Novel Natural Therapy for Oral Health

7.1 Introduction

7.2 Background

7.3 Mechanism in Oral Diseases Prevention by Probiotics

7.4 Probiotic Formulation

7.5 Prevention and Oral Health Management

7.6 Concluding Remarks

7.7 Future Aspects

References

8 The Preventative and Curative Functions of Probiotics: A Paradigm of Food as Drug Revolution

8.1 Introduction

8.2 Criteria for Choosing Probiotics and the Bare Minimum Needed

8.3 Action Mechanism of Probiotics

8.4 Probiotics in the Clinical Practice: A Growing Trend

8.5 Potential Preventative Roles of Probiotics

8.6 Therapeutic Use of Probiotics

8.7 Recent Advancement in Probiotics

8.8 Conclusion and Recommendation

Acknowledgments

References

9 Probiotics in the Prevention and Treatment of Psoriasis

9.1 Introduction

9.2 Interruption of the Microbiome: A Pathogenic Effect in Psoriasis

9.3 Therapeutic Effect of Probiotics for Psoriasis

9.4 Conclusion

References

10 A Gateway to Multi-Omics-Based Clinical Research

10.1 Introduction

10.2 Importance of Multi-Omics

10.3 Genomics and Relevant Clinical Studies Along with Its Tools and Methods

10.4 Proteomics and Relevant Clinical Studies Along with Its Tools and Methods

10.5 Sample Type and Acquisition

10.6 Various Data Acquisition Methods for Proteomics Data Include the Following

10.7 Techniques Used in Clinical Proteomics

10.8 Analysis Tools in Clinical Proteomics

10.9 Metabolomics and Relevant Clinical Studies Along with Its Tools and Methods

10.10 Different Types of Metabolomics

10.11 Techniques and Tools Used in Metabolomics

10.12 Metabolite Databases

10.13 Data Analysis Tools and Software

10.14 Application of Metabolomics in Clinical Studies

10.15 Conclusion

References

11 Inherent Observation of Mucosal Non-Specific Immune Parameters in Indian Major Carps

11.1 Introduction

11.2 Materials and Methods

11.3 Results and Discussion

11.4 Conclusion

References

Part III: ENVIRONMENT

12 Eco-Friendly Approaches for Converting Organic Waste to Bioenergy for Sustainable Development

12.1 Introduction

12.2 Organic Waste in the Bioenergy Generation

12.3 Categories and Characteristics of Organic Waste

12.4 Organic Waste Based on Origin

12.5 Organic Waste Based on the State of Matter

12.6 Organic Waste Based on the Level of Production

12.7 Characteristics of Organic Waste

12.8 Greenhouse Gases (GHGs)

12.9 Benefits of Organic Waste

12.10 Current and Prospective Use of Organic Waste

12.11 Sustainable Bioenergy and Biofuels from Organic Waste

12.12 Conversion of Organic Waste into Bioenergy and High-Valued Products

12.13 Biofuels from Organic Waste: Biochemical and Thermochemical Processes

12.14 Fermentation

12.15 Anaerobic Digestion

12.16 Combustion

12.17 Pyrolysis

12.18 Gasification

12.19 Biorefinery Concept Based on Organic Waste for Clean Energy Management

12.20 Success and Challenges of Organic Waste for Bioenergy

12.21 Conclusion and Recommendations

References

13 Utilization of Food Waste for Bioenergy Production

13.1 Introduction

13.2 Potential of Food Waste for Bioenergy Production

13.3 Bioenergy from Food Waste

13.4 Conclusion

References

14 Photosynthetic Microalgal Microbial Fuel Cell (PMMFC): A Novel Strategy for Wastewater Treatment and Bioenergy Generation

14.1 Introduction

14.2 Microbial Fuel Cell

14.3 Types of PMFC

14.4 Role of Algae in PMFC

14.5 Conclusion

References

15 Self-Cleaning Aquarium: The Microbial Biofilm Approach for Ammonia Bioremediation

15.1 Current Scenario of Fresh Water Scarcity and Impact of Aquaculture

15.2 Existing Technologies for Aquaculture Effluent Treatment for Environmental Sustenance

15.3 The Novel Rapid Biofilm Reactor-Based Ammonia Removing System

15.4 The Case Study of the Self-Cleaning Aquarium

15.5 Conclusion and Future Application

Acknowledgments

References

16 Metagenomics Unveiled: Deciphering Microbial Responses to Climate Change

16.1 Introduction

16.2 Climate Change and Its Impact on the Environment and Microbiome

16.3 Metagenomics as a Tool for Climate Change Research

16.4 Microbial Adaptation to Climate Change

16.5 Feedback Loops and Climate Change

16.6 Metagenomics in Climate Change Mitigation

16.7 Case Studies and Research Findings

16.8 Metagenomic Climate Model Frame

16.9 Challenges and Future Directions

16.10 Conclusion

Acknowledgments

Author Contributions

Conflict of Interest

References

17 Biosensor: A Tool for Assessment of Soil Pollutants

17.1 Introduction

17.2 Working Principles

17.3 Types of Biosensors

17.4 Application of Biosensors

17.5 Advantages, Disadvantages, and Adoption of Biosensors

17.6 Ethical Considerations and Future Challenges

17.7 Conclusion

References

18 Transcriptome-Guided Characterization of Molecular Resources in Mussels

18.1 Introduction

18.2 Species of Mussels Sequenced at the Transcriptome Level

18.3 Transcriptome Pipeline for Mussel Molecular Resources

18.4 Mussel Transcriptome Assembly and Annotation

18.5 Conclusions and Future Perspectives

Acknowledgments

References

Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1 Genomics studies on different vegetable crops using sequencing techn...

Table 1.2 List of markers used for studying different cultivars of vegetable c...

Table 1.3 Different NGS techniques used for targeted breeding of vegetable cro...

Chapter 2

Table 2.1 Widely used bioinformatics databases and resources for rice genomics...

Table 2.2 A list of application of advanced CRISPR/Cas tools in rice.

Table 2.3 Popular CRISPR guide RNA designing tools in rice.

Table 2.4 The servers used to identify the Acr proteins and their specificatio...

Table 2.5 List of in silico tools available to find most probable offtarget si...

Chapter 3

Table 3.1 Major tools for sequence-based epitope prediction.

Table 3.2 Major tools for structure-based epitope prediction.

Chapter 4

Table 4.1 Applications of high-throughput phenotyping integrated with GWAS in ...

Chapter 6

Table 6.1 Name of target organisms with their UniProt ID and similarity of tan...

Table 6.2 Amino acid distribution of tannase from Lactiplantibacillus sp. in t...

Table 6.3 Intra-protein interaction in tannase from species of Lactiplantibaci...

Chapter 7

Table 7.1 Various oral probiotic supplements have medicinal benefits. Sourced ...

Chapter 8

Table 8.1 Potential anti-cancer effects of probiotics.

Chapter 9

Table 9.1 Evidence of beneficial probiotic interventions in treating psoriasis...

Table 9.2 Clinical trial of probiotics for the treatment of psoriasis (https:/...

Chapter 11

Table 11.1 Lysozyme activities of epidermal mucus extracts of L. rohita, C. ca...

Table 11.2 Alkaline phosphatase activities of epidermal mucus extracts of L. r...

Table 11.3 Protease activities of epidermal mucus extracts of L. rohita, C. ca...

Chapter 12

Table 12.1 Different sources of organic waste (food waste) and their potential...

Table 12.2 Organic waste as a potential source of bioenergy.

Table 12.3 Organic wastes as a potential source of other high-valued products.

Chapter 13

Table 13.1 Source of food waste for generating energy.

Table 13.2 Production of bioenergy from food wastes using different methodolog...

Chapter 15

Table 15.1 Average concentration of ammonia in the bioreactors in mg/L.

Table 15.2 Statistical validation of the data (p value) showing significant va...

Table 15.3 Protocol for COD measurement.

Table 15.4 Average concentration of ammonia in the bioreactors with 2-fold dil...

Table 15.5 Statistical validation of the data showing significant variation wi...

Table 15.6 Chemical oxygen demand (COD) in mg/L at different time point in the...

Table 15.7 Percentage reduction of COD(%) at different time point in the three...

Table 15.8 Water quality before and after 3 months of treatment in the commerc...

Chapter 16

Table 16.1 Glimpses of some recently studied metagenomics sequencing approache...

Chapter 17

Table 17.1 Commonly used biosensors for heavy metal detection.

Table 17.2 Advantages and disadvantages of biosensors while detecting soil pol...

Chapter 18

Table 18.1 Representation of mussel genomes in the public domain.

Table 18.2 A methodological summary of Next-Generation Sequencing (NGS) studie...

Table 18.3 Comparison of mussel transcriptome assembly and annotation.

Table 18.4 Summary of mussel transcriptomics in the last 5 years (2019–2023)* ...

List of Illustrations

Chapter 1

Figure 1.1 Timeline of next-generation sequencing technology.

Chapter 3

Figure 3.1 The general steps involved in vaccine designing.

Figure 3.2 Basic steps to develop a vaccine through computer-aided vaccine des...

Chapter 6

Figure 6.1 3D structure of template tannase protein, i.e., Lactiplantibacillus ...

Figure 6.2 Amino acid abundance in tannase of Lactiplantibacillus plantarum (r...

Figure 6.3 Ramachandran plot showing amino acid distribution of tannase belong...

Figure 6.4 Ramachandran plot showing amino acid distribution of tannase belong...

Figure 6.5 (a) RMSD and (b) RMSF plots of tannase of Lactiplantibacillus plant...

Figure 6.6 (a) Rg and (b) SASA of tannase of Lactiplantibacillus plantarum (re...

Figure 6.7 Hydrogen bonds in tannase of Lactiplantibacillus plantarum, L. pent...

Chapter 7

Figure 7.1 Mechanistic actions of probiotics against dental caries.

Chapter 8

Figure 8.1 Schematic depiction of the numerous functions of probiotics.

Figure 8.2 Changes in the biology of cancer cells were triggered by Lactobacil...

Figure 8.3 Intestinal inflammation can be triggered by bacteria in the gut and...

Figure 8.4 A schematic diagram exhibiting role of probiotics for hypertension ...

Figure 8.5 The potential mode of action of the probiotic for the treatment of ...

Figure 8.6 A schematic depicting the therapeutic potential of genetically engi...

Figure 8.7 Probiotics’ novel mechanisms of action in antioxidation. Figure adop...

Chapter 9

Figure 9.1 Gut–skin axis microbiota.

Chapter 10

Figure 10.1 The relationship between numerous clinical samples and various dom...

Figure 10.2 This figure depicts top-down and bottom-up approaches in proteomic...

Figure 10.3 Describing about processes required to perform metabolic analysis ...

Chapter 11

Figure 11.1 SDS-PAGE analysis of mucus extracts of C. catla, L. rohita, and C. ...

Chapter 12

Figure 12.1 Organic waste in the production of different biofuels for a health...

Figure 12.2 Sources of organic waste based on origin.

Chapter 13

Figure 13.1 Schematic diagram of biochemical process for the production of bio...

Figure 13.2 Biofuel production from industrial, agriculture and home food wast...

Chapter 14

Figure 14.1 Schematic diagram of MFC.

Figure 14.2 Schematic diagram of photosynthetic microbial fuel cell.

Figure 14.3 (a) Dual-chambered PMFC, (b) tubular PMFC, (c) single-chambered ai...

Figure 14.4 Mechanism of carbon sequestration.

Chapter 15

Figure 15.1 Pictures of aquarium with raschig rings. Pictures in the two rows ...

Chapter 16

Figure 16.1 Pictorial representations of causes for climate change and its con...

Figure 16.2 Schematic diagram of climate change mitigation strategies through ...

Figure 16.3 A conceptual model frame to understand microbiome functioning and ...

Chapter 17

Figure 17.1 Flowchart showing the mechanism of the working principle of biosen...

Figure 17.2 Type of biosensors commonly used to detect contaminants in soil en...

Chapter 18

Figure 18.1 An overview of the transcriptome initiatives involving mussels. Il...

Scrivener Publishing
100 Cummings Center, Suite 541J
Beverly, MA 01915-6106

Publishers at Scrivener
Martin Scrivener
Phillip Carmical

Applied Biotechnology and Bioinformatics

Agriculture, Pharmaceutical Research and Environment

Edited by

Hrudayanath Thatoi

Center for Industrial Biotechnology Research, Siksha ‘O’ Anusandhan University, Bhubaneswar, Odisha, India

Sonali Mohapatra

Dept. of Biological Systems Engineering, University of Wisconsin, Madison, USA

Swagat Kumar Das

Dept. of Biotechnology, Odisha University of Technology and Research, Bhubaneswar, Odisha, India

and

Sukanta Kumar Pradhan

Dept. of Bioinformatics, Odisha University of Agriculture and Technology, Bhubaneswar, Odisha, India

This edition first published 2025 by John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA and Scrivener Publishing LLC, 100 Cummings Center, Suite 541J, Beverly, MA 01915, USA
© 2025 Scrivener Publishing LLC
For more information about Scrivener publications please visit www.scrivenerpublishing.com.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

Wiley Global Headquarters
111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials, or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read.

Library of Congress Cataloging-in-Publication Data

ISBN 978-1-119-89640-1

Front cover images courtesy of Wikimedia Commons
Cover design by Russell Richardson

Preface

The field of modern biotechnology is a multidisciplinary, groundbreaking, and advantageous area of biology that includes several cutting-edge methods. The discipline of biotechnology is expanding at a rapid, previously unseen pace due to developments in forensics, molecular modeling, clinical healthcare, pharmaceuticals, agriculture, environmental bioremediation, renewable energy, and many other areas. In biological science, bioinformatics is an advanced field of research that has grown to an unprecedented height.

Presently, bioinformatics, a full-fledged multidisciplinary field that combines advances in computer and information technology, has made great progress in its applications to biotechnology and the biological sciences. Numerous applications of bioinformatics, primarily in gene and protein identification, structural and functional prediction, drug development and design, the folding of genes and proteins and their complexity, vaccine design, and organism identification, have contributed to the advancement of biotechnology. Bioinformatics plays a critical role in analyzing the different types of data created by high-throughput research methods, such as genomic, transcriptomic, and proteomic datasets, that will be useful in addressing various problems related to disease management, clean environment, alternative energy sources, agricultural productivity, and more.

Biotechnology is essential to crop improvement in agriculture because it allows genes to be transferred between plants to improve traits such as disease resistance and yield. Biotechnology also plays a broad role in healthcare, including genetic testing, gene therapy, pharmacogenomics, and drug development. Bioremediation and biodegradation, which use microbial technologies to clean up environmental contamination, are critical in the current scenario. In this context, bioinformatics employs tools to analyze and mechanize different degradation pathways that help in microbial applications.

This book comprises eighteen unique chapters, each written by renowned researchers in the fields of microbiology, bioinformatics, agriculture, food, pharmaceuticals, and bioremediation. It provides an in-depth discussion of emerging topics while concentrating on the most recent research on the applications of biotechnology and bioinformatics in agriculture, nutraceuticals, pharmaceuticals, and the environment. General readers, scholars, biotechnology researchers, and bioinformatics professionals working in the areas of applied biotechnology, bioengineering, biomedical sciences, microbiology, and environmental sciences will come away from this book with a wider understanding of recent innovations, tools, techniques, and applications.

The editors express their gratitude to the esteemed writers for providing chapters in this book that showcase their superb work and extensive expertise in their respective fields of research. Finally, the editors thank Martin Scrivener and Scrivener Publishing for their assistance and publication of this book.

Hrudayanath Thatoi

Sonali Mohapatra

Swagat Kumar Das

Sukanta Kumar Pradhan

Part I: AGRICULTURE

1 Next-Generation Sequencing in Vegetable Crops

Meenu Kumari¹*, Tanya Barpanda², Meghana Devireddy³, Ankit Kumar Sinha³, R. S. Pan¹ and A. K. Singh¹

¹ICAR-Research Complex for Eastern Region, RS, Ranchi, India

²Orissa University of Agriculture & Technology, Odisha, India

³ICAR-Indian Agricultural Research Institute, New Delhi, India

Abstract

The last few decades have witnessed revolutionary advances in DNA sequencing technologies across all biological disciplines, at a fraction of the cost of traditional sequencing. There are two approaches to crop breeding: the conventional approach, through hybridization followed by selection, and marker-assisted breeding (MAB), also called marker-assisted selection (MAS). Limitations of the conventional approach, such as the long periods of selection needed to fix a trait in the breeding population, environmental effects, and low efficiency for complex and less heritable traits, lead breeders to choose MAB. MAB requires different molecular markers, chosen according to the available information on the linkage of traits with markers. However, MAB is efficient only for traits governed by a small number of quantitative trait loci (QTLs); for complex traits such as yield, quality, biotic stress, and abiotic stress, which are governed by a large number of minor QTLs, it is not effective. The next-generation sequencing (NGS) approach opened an era of data science in which millions of bases are sequenced in a single run, greatly reducing the time and cost of sequencing. In this chapter, we describe the status, recent developments, and applications of NGS in vegetable crops and their use in practical crop improvement.

Keywords: MAB, QTL, GBS, transcriptomics, NGS

1.1 Introduction

The next-generation sequencing (NGS) approach has driven exemplary development in the life sciences and, with the power of high-throughput technology, has shifted sequencing studies from “model organism” to “every organism”. NGS techniques became commercially available in 2005, and since then the field has developed at an astonishing rate, much like an evolutionary process, to address important and unexplored questions of plant systems. These approaches can be broadly divided into three main categories: sequencing by synthesis, sequencing by ligation, and single-molecule sequencing. Emerging technologies have also made sequencing possible without library preparation, for example by using quantum dot (qdot)-derived fluorescence resonance energy transfer (FRET) to detect the incorporation of fluorescently labeled nucleotides. Another approach is nanopore sequencing, in which the chemical or electronic properties of bases (DNA or RNA) are analyzed directly as they pass through nanopores. The evolution of NGS with the advancement of these technologies is depicted in Figure 1.1.

Figure 1.1 Timeline of next-generation sequencing technology.

With the advancement of high-throughput sequencing platforms, data have been generated for a large number of plant species, enriching the understanding of plant systems at the nucleotide level in horticultural crops such as fruits, vegetables, spices, and plantation crops.

1.2 Next-Generation Sequencing Approach in Genomics

1.2.1 Solanaceous

Among vegetables, the tomato was a pioneer in identifying the genetic basis of quantitative traits and in the map-based cloning of genes and quantitative trait loci (QTLs) [1]. Through the creation of extensive molecular marker libraries [2], genetic and physical maps [3], and mapping populations [4], tomato has served as a model plant for improvement and inheritance studies since the early 1990s. The Tomato Genome Consortium started the “SOL-100” project, which involves NGS-based sequencing of 100 different species of the Solanaceae family and relating their sequences to the reference genome (solgenomics.net).

Completed genome: Solanum lycopersicum, Solanum tuberosum, Capsicum annuum, Solanum melongena, Solanum lycopersicoides, Solanum pennellii, Solanum pimpinellifolium, Solanum lycopersicum var. cerasiforme

Draft genome: Iochroma cyaneum, Nicotiana attenuata, Nicotiana benthamiana, Nicotiana tabacum, Petunia axillaris, Petunia inflata, Solanum chilense, Coffea humblotiana

Projects based on resequencing: Solanum lycopersicum inbreds, 150 Tomato Genome Resequencing Project, BGI Tomato 360 genomes, Varitome Project

The tomato cultivar “Heinz 1706” is estimated to have a genome size of approximately 900 Mb with a simple structure composed of two main components: pericentromeric heterochromatin with repetitive sequences, occupying 75% of the whole genome, and distal euchromatin comprising the remaining 25% (220 Mb). It was sequenced through the BAC-by-BAC approach, resulting in the precise sequencing of 117 Mb of euchromatic regions [5]. This cultivar was chosen mainly because a well-characterized HindIII BAC library was available for it at that time [6]. In 2008, to accelerate progress, 30,800 BAC clones were selected and pooled, and shotgun sequencing was carried out using the Sanger method (The Tomato Genome Consortium, 2012). The sequences were assembled into contigs that covered 540 Mb of the genome. In 2009, the emerging NGS platforms allowed the sequencing consortium to plan for whole-genome sequencing, where the effort had earlier been confined to euchromatic regions. Different NGS technologies, namely 454/Roche GS FLX, Illumina Genome Analyzer, and SOLiD sequencing, were used. Using the Sanger data, a de novo assembly of “Heinz 1706” was produced. Assemblies were generated by independent programs, Newbler and CABOG, and merged later on. Read mapping and base error correction produced more accurate data, with one base-calling error per 29.4 kb and one indel error per 6.4 kb [7, 8]. The resulting high-quality scaffolds were connected using two BAC-based physical maps and anchored using a high-density genetic map [9], together with introgression line mapping and genome-wide BAC-FISH (fluorescence in situ hybridization). The final assembly of the tomato genome comprised 760 Mb in 91 scaffolds. Most of the gaps identified after aligning the scaffolds with the 12 chromosomes were confined to pericentromeric regions [7].

The consortium also sequenced accession LA1589 of the wild species S. pimpinellifolium to explore its diversity relative to the “Heinz 1706” reference genome. Sequencing was performed with Illumina technology using the whole-genome shotgun approach. The resulting 739 Mb assembly showed a nucleotide divergence of only 0.6% from the reference genome, indicating a high level of similarity between the two species (The Tomato Genome Consortium, 2012). The first resequencing in tomato was attempted on four S. lycopersicum and four S. l. var. cerasiforme genotypes on the Illumina GAIIx platform, which yielded a total of 4 million unique SNPs, 1,686 putative copy-number variations (CNVs), and almost 128,000 InDels [10]. These variations can be utilized for QTL and gene mapping. Long-read technologies brought further improvements in the form of high-quality reference genomes, including those of S. pimpinellifolium accessions LA2093 and LA1589 [11] and LA1670 [12], S. l. var. cerasiforme acc. LA1673 [12], S. l. cv. Moneyberg [13], S. l. cv. Heinz 1706 [14], and S. lycopersicoides acc. LA2951 [15]. A total of 100 tomato accessions were sequenced using nanopore technology to detect variations, and de novo assemblies were released for 14 reference genomes [16]. A total of 238,490 structural variants (mostly insertions and deletions) were discovered. Functional analysis linked these variants to three major traits that are important domestication and improvement targets: smoky flavor (not preferred by consumers), sb1 (branching pattern), and fw3.2 (a major QTL for fruit mass).
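The error rates quoted for the “Heinz 1706” assembly (one base-calling error per 29.4 kb, one indel error per 6.4 kb) are easier to compare once converted to per-base rates. The following is a minimal sketch of that arithmetic, not part of the consortium's pipeline. The function names are hypothetical, the Phred scale is used only as a familiar way to express an error rate, and applying each rate uniformly across the 760 Mb assembly is an assumption made here for illustration.

```python
import math


def per_base_rate(kb_per_event: float) -> float:
    """Convert 'one event per N kb' into events per base."""
    return 1.0 / (kb_per_event * 1000.0)


def phred_quality(rate: float) -> float:
    """Phred-scaled quality: Q = -10 * log10(error rate)."""
    return -10.0 * math.log10(rate)


def expected_events(rate: float, length_bp: float) -> float:
    """Expected number of events over a sequence of the given length,
    assuming the rate is uniform along the sequence (illustrative only)."""
    return rate * length_bp


# Values quoted in the text.
ASSEMBLY_BP = 760e6      # final "Heinz 1706" assembly size
SUBSTITUTION_KB = 29.4   # one base-calling error per 29.4 kb
INDEL_KB = 6.4           # one indel error per 6.4 kb

if __name__ == "__main__":
    for label, kb in (("base-calling", SUBSTITUTION_KB), ("indel", INDEL_KB)):
        rate = per_base_rate(kb)
        print(
            f"{label}: {rate:.2e} per base, "
            f"Q = {phred_quality(rate):.1f}, "
            f"expected over assembly = {expected_events(rate, ASSEMBLY_BP):,.0f}"
        )
```

The same helpers could be applied to the 0.6% divergence reported for S. pimpinellifolium by passing 0.006 as the rate, again under the uniformity assumption.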

Capsicum annuum cv. CM334 was sequenced at high genome coverage and serves as the reference genome sequence for pepper [17]. The genome size was estimated at 3.48 Gb, which is very large compared with other crops of the same family. The expansion of the pepper genome is attributed to long terminal repeat (LTR) retrotransposons; overall, 76.4% of the CM334 genome consists of transposable elements. This study also provided deeper insight into the capsaicinoid biosynthesis pathway responsible for pepper pungency. For a better understanding of evolution and domestication, the cultivar Zunla-1 (C. annuum) and the wild Chiltepin (C. annuum var. glabriusculum) were sequenced with Illumina technology using the whole-genome shotgun approach [18]. The authors found 1104 target genes, including genes in the capsaicinoid biosynthesis pathway, which points to regulation by miRNAs. A comparison of resequencing data from 20 genomes indicated that domestication took place through artificial selection. A phylogenetic analysis comparing the pepper genome with those of tomato, potato, and Arabidopsis led to the discovery of pepper-specific duplications in 13 gene families. A heterozygous F1 from a wide cross was also sequenced and assembled to assess the ability to derive both haplotypes [19]. A pungency gene at the PUN1 locus was resolved together with large insertions/deletions, which facilitates marker-assisted selection for pepper improvement.

Because potato is a highly heterozygous polyploid, sequencing it is much more difficult than sequencing diploids: for any given gene in a genotype, up to four different alleles per locus may be present. Genotyping systems must therefore distinguish among alleles and quantify the copy number of each allele. To overcome this problem, doubled monoploids were used to simplify sequencing. One such doubled monoploid was used to produce a high-quality draft genome of 844 Mb using a combination of Sanger, Roche/454, and Illumina sequencing followed by de novo assembly, and it serves as the potato reference genome [20]. This work provided insight into the evolution of eudicot genomes. Additionally, 3.67 million SNPs and 275 gene-specific presence/absence variants were identified, leading to the conclusion that homozygous alleles in the doubled monoploid are the reason for its reduced vigor. A total of 15,235 genes were found to be expressed in developing tubers, giving valuable insight into the evolution of tuber development. The evolutionary innovation of tuberization is confined to the Petota section of Solanum; although tomato is a closely related species, it did not acquire this trait. These insights motivate further study of the evolutionary pathway to gain a deeper understanding of this genus.
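The allele-dosage problem described above can be illustrated with a minimal sketch. It assumes that a genotyping pipeline reports reference and alternate read counts at a biallelic SNP, and it assigns the dosage (0 to 4 copies of the alternate allele in a tetraploid) whose expected read fraction lies closest to the observed one. The function name and the nearest-fraction rule are illustrative assumptions, not the method of the cited studies; production tools use probabilistic models that account for sequencing error and allelic bias.

```python
def estimate_dosage(ref_reads: int, alt_reads: int, ploidy: int = 4) -> int:
    """Return the most plausible number of alternate-allele copies (0..ploidy)."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    observed = alt_reads / total
    # The expected alternate fraction for dosage d is d / ploidy;
    # pick the dosage whose expectation is closest to the observation.
    return min(range(ploidy + 1), key=lambda d: abs(observed - d / ploidy))


# Hypothetical read counts at three sites of a tetraploid genotype.
for ref, alt in [(30, 10), (20, 20), (0, 40)]:
    print(ref, alt, estimate_dosage(ref, alt))
```

A diploid caller only has to separate three classes (0, 1, or 2 copies), whereas the tetraploid case has five, with expected fractions only 0.25 apart, which is why higher read depth is usually needed.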

Eggplant stands out among the Solanaceae because it is phylogenetically distinct, being indigenous to the Old World, whereas other crops of this family such as tomato, chili pepper, and potato originated in South America. The first draft genome of eggplant cv. Nakate-Shinkuro was built using an Illumina HiSeq 2000 sequencer, resulting in 33,873 scaffolds that covered 74% of the genome [21]. The reference genome of the inbred line “67/3” was also sequenced with Illumina technology and assembled de novo [22]. The inbred line HQ-1315 was sequenced using a combination of Illumina, Nanopore, and 10x Genomics technologies to produce a high-quality reference genome of ∼1.17 Gb that was assembled with the aid of Hi-C technology [23]. The large genome size is associated with long terminal repeat (LTR) retrotransposons, which make up 70.09% of the genome.

1.2.2 Malvaceae

Okra is one of the major vegetable crops belonging to the Malvaceae family. The plastid and mitochondrial genomes of okra were sequenced using a combination of Illumina and Nanopore NGS [24]. This study found that the plastid genome is relatively conserved, whereas the mitochondrial genome has subgenomic configurations. The authors also observed extensive transfer of sequences between the organelles, for instance the presence of plastid genes (psaA, rps7, and psbJ) in the mitochondrial genome.

1.2.3 Brassicaceae

The Brassica genome sequencing consortium was started in 2003 and was initially confined to sequencing the diploid A genome of B. rapa using the BAC approach and Sanger sequencing. With the advent of NGS techniques, many genome sequencing projects emerged. The International B. rapa Genome Sequencing Project Consortium released the B. rapa (Chinese cabbage) genome annotated with 41,174 protein-coding genes [25]. This study provided new insight into the expansion of the Brassica lineage. A second sequencing effort targeted the diploid C genome of B. oleracea using a combination of Roche/454 and Illumina. Later, reference genomes were released for all three Brassica progenitors and for allopolyploid species such as B. napus and B. juncea [25–28, 30].

The first radish genome sequence was reported in 2014. The “Aokubi” inbred line, which has an S-h haplotype, was sequenced using Illumina NGS [31]. This study also reported that radish and Chinese cabbage share common ancestral genes. A second genome sequencing effort combined Illumina and Roche 454 methods [32]. Later, a chromosome-scale draft genome of radish was produced using Illumina, Roche, and PacBio NGS technologies [33]. In addition to cultivated radish, a wild relative, R. raphanistrum, was sequenced using the Illumina approach [34]. Further effort is still needed to understand the relationships with wild relatives, to mine QTLs, and to develop useful markers for crop improvement.

1.2.4 Cucurbitaceae

The cucumber (Chinese Long inbred line 9930) was sequenced using both Sanger and next-generation Illumina sequencing [35]. A semi-wild cultivar “GY14” (University of Wisconsin) and a wild cultivar “PI 183967” were also sequenced by Qi et al. [36]. A Spanish public–private initiative called MELONOMICS was started in 2009 with the aim of developing a draft genome of melon using the NGS whole-genome shotgun approach. The chloroplast and mitochondrial genomes of melon were assembled and, surprisingly, the melon mitochondrial genome proved to be one of the largest ever reported in plants [37]. The difference in genome size between cucumber (367 Mb) and melon (454 Mb) may be due to the amplification of transposable elements, in the form of long terminal repeats, in melon [38] and to the lack of WGD (whole-genome duplication) in the melon lineage. This variation is associated with phenotypic and quality differences between melon and cucumber, for example in genes related to stress and flavor [39]. Based on the syntenic relationships among the cucurbit genomes, the number of ancestral protochromosomes of cucurbits was inferred to be 15 [40]. The watermelon cultivar “97103” was sequenced using Illumina, and a high-quality draft genome was assembled de novo to a size of 353.5 Mb [41]. The watermelon genome has also undergone 27 fissions and 28 fusions, indicating more shuffling than in bottle gourd [42]. This study validates the 11-chromosome structure of watermelon.

1.2.5 Apiaceae

Carrot belongs to the Umbelliferae or Apiaceae family. The genome of the orange carrot doubled haploid line “DC 27” was sequenced using the NGS whole-genome shotgun method [43], and a de novo assembly was produced from Illumina and Roche 454 data. In this study, chalcone-flavone isomerase (CHI), flavonoid 3’-monooxygenase (F3’H), and UDP-galactose gene sequences were identified for the first time in carrot. In 2016, a chromosome-scale whole-genome sequence of DH 1 was produced using Illumina at the Beijing Genomics Institute (BGI), 2000. Sequencing of an intragenic region of the mitochondrial genome provided evidence for the transfer of DNA fragments between plastids and mitochondria [44].

1.2.6 Moringaceae

A draft genome of Moringa oleifera was sequenced, with the assembly representing >90% of the genome size [45]. Comparative analysis with woody species revealed evolutionary relationships and helped identify species-specific genes. A broad range of genomics studies conducted in different vegetable crops, together with the estimated genome sizes and the sequencing technologies applied, is listed in Table 1.1.

Table 1.1 Genomics studies on different vegetable crops using sequencing technology.

| Family | Accession/Species | Estimated genome size | Sequencing technology | Reference |
|---|---|---|---|---|
| Solanaceae | Heinz 1706 (S. lycopersicum) | 900 Mb | Sanger + Roche 454 | Tomato Genome Consortium, 2012 |
| | LA 1589 (S. pimpinellifolium) | 923 Mb | Illumina | |
| | LA0480 (S. pimpinellifolium) | 900 Mb | Illumina | [46] |
| | LA2093 (S. pimpinellifolium) | 923 Mb | Illumina + PacBio | [11] |
| | LA0716 (S. pennellii) | 1.2 Gb | Illumina | [47] |
| | LYC1722 (S. pennellii) | 1.2 Gb | Nanopore | [48] |
| | LA31111 (S. chilense) | 1.2 Gb | Illumina | [49] |
| | CM334 (C. annuum) | 3.48 Gb | | [17] |
| | Zunla-1 (C. annuum) | 3.26 Gb | | [18] |
| | Chiltepin (C. a. var. glabriusculum) | 3.07 Gb | | [18] |
| | PI159236 (C. chinense) | 3.14 Gb; 3.2 Gb | | [17, 50] |
| | PBC81 (C. baccatum) | 3.9 Gb | | [50] |
| | C. annuum | 3.26 Gb | 10x Genomics | [19] |
| | cv. Nakate-Shinkuro (S. melongena) | 1.13 Gb | Illumina + Roche 454 | [21] |
| | cv. 67/3 (S. melongena) | 1.04-1.21 Gb | Illumina | [22] |
| | Guiqie 1 (S. melongena) | 1.21 Gb | Illumina + Hi-C | [51] |
| | HQ-1315 (S. melongena) | 1.21 Gb | Illumina + Nanopore + Hi-C + 10x Genomics | [23] |
| Brassicaceae | B. rapa ssp. pekinensis | 485 Mb | Illumina + PacBio + Hi-C | [25, 52, 53] |
| | B. napus | 1.13 Gb | Sanger + Illumina + 454 | [26] |
| | B. juncea var. tumida | 922 Mb | PacBio + Illumina | [29] |
| | B. oleracea var. capitata | 630 Mb | Sanger + 454 + Illumina | [27] |
| | Raphanus sativus | 528.6 Mb | Sanger + Illumina | [31] |
| Cucurbitaceae | cv. 9930 (C. s. var. sativus) | 367 Mb | Sanger + Illumina + PacBio + 10x Genomics + Hi-C | [35, 51, 54] |
| | cv. Gy 14 | 367 Mb | 454 | [55] |
| | cv. B10 | 367 Mb | Sanger + 454 | [56] |
| | PI 183967 (C. sativus var. hardwickii) | 367 Mb | Sanger + Illumina | [36] |
| | Cucumis melo | 450 Mb | Sanger + 454 | [39, 57] |
| | Citrullus lanatus | 425 Mb | Illumina + PacBio + Hi-C | [41, 42, 58] |
| | Cucurbita maxima | 386.8 Mb | Illumina | [30] |
| | Cucurbita moschata | 372 Mb | | [30] |
| | Cucurbita pepo ssp. pepo | 283 Mb | | [59] |
| | Cucurbita argyrosperma ssp. argyrosperma | 238 Mb | Illumina + PacBio | [60] |
| | Lagenaria siceraria | 334 Mb | Illumina | [61] |
| | Momordica charantia | 339 Mb | Illumina, PacBio | [62, 63] |
| | Benincasa hispida | 1.02 Gb | Illumina + PacBio | [64] |
| Fabaceae | Vigna unguiculata | 560 Mb | PacBio | [65] |
| | Phaseolus vulgaris | 587 Mb | Sanger + 454 + Illumina | [66] |
| Amaranthaceae | Spinacia oleracea | 1 Gb | Illumina | [67] |
| | Beta vulgaris ssp. vulgaris | 714-758 Mb | Sanger + 454 + Illumina | [68] |
| Asteraceae | cv. Salinas (Lactuca sativa) | 2.7 Gb | Illumina | [69] |
| Apiaceae | Daucus carota ssp. sativus | 473 Mb | Sanger + Illumina | [44] |
| | Coriandrum sativum | 2.13 Gb | PacBio + Illumina + 10x Genomics + Hi-C | [70] |
| Asparagaceae | Asparagus officinalis | 1.3 Gb | PacBio + Illumina | [71] |
| Moringaceae | Moringa oleifera | 315 Mb | Illumina | [45] |

1.3 NGS Approach in Single-Nucleotide Polymorphic Markers Development

Genotyping by sequencing (GBS) was used to generate SNPs in tomato cultivars of four types: large fruited, cherry fruited, grape fruited, and rootstocks [72]. A total of 10,615 SNPs were generated and divided into five subsets, one of which, comprising 224 markers, was polymorphic across the 91 cultivars tested and could also distinguish 139 F1 cultivars from the reference genome. The results suggest that these markers are useful for DNA barcoding to identify varieties (Table 1.2). GBS was also used in C. annuum, generating 109,610 SNPs that were used in QTL mapping and GWAS for the capsaicinoid content of peppers [73]. Five candidate genes involved in capsaicinoid biosynthesis were successfully identified from 69 QTL regions [74]. Sequencing of sweet potato varieties yielded various SNPs and InDels related to starch biosynthesis [75].
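The idea of using a marker subset as a DNA barcode can be sketched as follows. The example assumes a toy genotype matrix in which each cultivar is represented by a string with one allele call per marker; the cultivar names, calls, and function names are hypothetical. It first keeps only polymorphic markers and then checks whether those markers give every cultivar a unique fingerprint.

```python
def polymorphic_markers(calls: dict[str, str]) -> list[int]:
    """Return indices of markers that show more than one allele across cultivars."""
    n_markers = len(next(iter(calls.values())))
    return [i for i in range(n_markers)
            if len({genotype[i] for genotype in calls.values()}) > 1]


def all_distinguishable(calls: dict[str, str], markers: list[int]) -> bool:
    """True if every cultivar has a unique fingerprint over the chosen markers."""
    fingerprints = {tuple(genotype[i] for i in markers)
                    for genotype in calls.values()}
    return len(fingerprints) == len(calls)


# Hypothetical calls: one character per marker, one string per cultivar.
calls = {"cv_A": "AGT", "cv_B": "AGC", "cv_C": "ATC"}
informative = polymorphic_markers(calls)
print(informative, all_distinguishable(calls, informative))
```

In this toy data set the first marker is monomorphic and is dropped, while the remaining two markers are sufficient to separate all three cultivars; a single marker on its own would not be.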

Table 1.2 List of markers used for studying different cultivars of vegetable crops.

| Family | Species | Marker type | Number of markers | Method | References |
|---|---|---|---|---|---|
| Solanaceae | S. lycopersicum | SNP | 10,615 | GBS | [72] |
| | S. lycopersicum | SNP | 3614 | WGS | [76] |
| | S. lycopersicum × S. pennellii; S. lycopersicum × S. pimpinellifolium | SNP | 141,083 | GBS | [77] |
| | S. lycopersicum | SNP | 4,812,432 | Modified SAM | [17] |
| | S. pimpinellifolium | SNP | 4,680,647 | Modified SAM | [17] |
| | S. lycopersicum | SNP | 8,784 | GBS | [78] |
| | S. melongena; S. incanum | SSR | 11,262; 11,829 | De novo transcriptome | [79] |
| | S. melongena | SNP | 10,000 | RAD-seq | [80] |
| | C. annuum | SNP | 1.76 million | Resequencing | [81] |
| | C. annuum | InDel | 14,498 | Whole genome resequencing | [82] |
| | C. annuum | SNP | 109,610 | GBS | [74] |
| | S. tuberosum | SNP | 575,340 | De novo transcriptome | [83] |
| | S. tuberosum | SNP; InDel | 27 million; 3 million | Resequencing | [84] |
| Convolvulaceae | Ipomoea batatas | SNP; InDel | 62285 | GBS | [75] |
| Brassicaceae | B. napus | SSR | 21,523 | GWAS | [85] |
| | B. oleracea | SNP; InDel | 496,463; 37,493 | WGS | [86] |
| | B. napus | SNP | 37,721 | GBS | [87] |
| | R. sativus | SNP | 52,559 | RAD-seq | [88] |
| Cucurbitaceae | Cucurbita spp. | SNP | 37,869 | GBS | [89] |
| | C. melo | SNP | 375 | GBS | [90] |
| | C. sativus | SSR | 2171 | GWAS | [91] |
| | C. lanatus | SNP | 203,894-279,412 | WGRS | [92] |
| | M. charantia | InDel | 389,487 | WGS | [93] |
| | L. siceraria | SSR | 45,066 | RAD-seq | [94] |
| Fabaceae | P. vulgaris | SNP; InDel | 43,698; 1267 | WGS | [95] |
| | V. unguiculata | SNP | 1031 | GBS | [96] |
| Amaranthaceae | Amaranthus spp. | SNP | 27,658 | WGS | [97] |
| | S. oleracea | SSR | 3852 | WGS | [98] |
| Apiaceae | D. carota | SNP | 3636 | WGS | [99] |

1.4 Next-Generation Sequencing Approach in Trait-Specific Breeding

In red tomatoes, the yellow flavonoid naringenin chalcone (NGC) determines the exterior color of the fruit [100]; it accumulates naturally in the cuticle of the fruit skin during ripening and gives the peel its yellow color [101]. Pink tomatoes, by contrast, have a transparent epidermis devoid of NGC. Because NGC biosynthesis is controlled by the SlMYB12 gene at the Y locus on chromosome 1, DNA markers derived from this gene would be helpful for marker-assisted selection (MAS) of tomato fruit color. To create a gene-based marker, the 4.9 kb SlMYB12 gene was sequenced from the line “FCR” (red-fruited, YY) and the line “FCP” (pink-fruited, yy). Alignment of these SlMYB12 alleles showed no sequence differences between “FCR” and “FCP”. SlMYB12 was physically located between the markers CAPS-456 and CAPS-38123, indicating that fruit peel color in cultivated tomato is controlled by SlMYB12 [102]. More recently, an S. lycopersicum × S. pimpinellifolium RIL population was used to create an “ultra-high density” tomato genetic map containing 141,083 SNP markers grouped into 2,869 genomic bins. This map was also employed for fine mapping of genes and QTLs related to tomato fruit weight and lycopene concentration [77]. A further high-density genetic bin map for tomato was created using genotyping by sequencing (GBS) in a different S. lycopersicum × S. pimpinellifolium population. This map includes 1,195 genetic bins and 8,470 SNPs and was used to fine-map the late blight resistance gene Ph-5 [24].
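The grouping of many SNPs into a much smaller number of genomic bins (141,083 SNPs in 2,869 bins is roughly 49 SNPs per bin on average) can be illustrated with a simplified sketch. It assumes that the SNPs of one chromosome are sorted by position and that each SNP carries a string of parental calls ('A' or 'B'), one per RIL; consecutive SNPs with an identical pattern, meaning no recombination between them in any line, are collapsed into one bin. The data and the function name are hypothetical, and real bin-mapping software additionally handles missing calls and genotyping errors.

```python
from itertools import groupby


def build_bins(snps: list[tuple[int, str]]) -> list[dict]:
    """Collapse consecutive SNPs with identical RIL genotype patterns into bins.

    snps: (position, pattern) tuples sorted by position on one chromosome.
    """
    bins = []
    for pattern, group in groupby(snps, key=lambda snp: snp[1]):
        positions = [pos for pos, _ in group]
        bins.append({
            "start": positions[0],
            "end": positions[-1],
            "n_snps": len(positions),
            "pattern": pattern,
        })
    return bins


# Hypothetical SNPs scored in three RILs.
snps = [(100, "AAB"), (250, "AAB"), (900, "ABB"),
        (1200, "ABB"), (1500, "ABB"), (2000, "AAB")]
for b in build_bins(snps):
    print(b)
```

Each bin then behaves as a single marker in linkage analysis, which keeps map construction tractable while retaining every recombination breakpoint observed in the population.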

To identify single-nucleotide polymorphism (SNP) markers for powdery mildew (PM) resistance in Capsicum annuum, high-throughput genotyping by sequencing (GBS) was applied to a population of 188 F5 lines derived from the parents AR1 (PM resistant) and TF68 (PM susceptible) [103]. The SNP markers were then used to construct a genetic linkage map and perform QTL analysis. In total, 1,308 SNP markers distributed across the chromosomes were used to build 12 linkage groups with a total map length of 2506.8 cM. Two QTLs for powdery mildew resistance, Pm-2.1 and Pm-5.1, were discovered on chromosome 2 and chromosome 5, respectively. The development of novel pepper cultivars with improved resistance to bacterial wilt will likewise benefit from the identification of SNP markers linked to resistance. Using whole-genome resequencing, Ahn et al. [104] discovered SNPs across the entire genome. Two pepper cultivars with contrasting bacterial wilt resistance, Saengryeg 211 (susceptible) and 82PR66 (resistant), were sequenced and compared with the reference sequence of C. annuum cv. CM334. SNP density varied among the chromosomes: it was highest on chromosome 10 for Saengryeg 211 and on chromosome 11 for 82PR66. Intra- and inter-specific linkage maps for QTL mapping of anthocyanin pigmentation in brinjal (S. melongena) were studied by Barchi et al. [22], with SNPs produced using high-throughput Illumina sequencing of restriction site-associated DNA (RAD). The NGS technologies and specific RNA-seq methods used to study major breeding traits in cucumber are listed in Table 1.3; these traits include fruit development [105, 106], parthenocarpy [107], the formation of trichomes that serve as a plant’s defense against biotic and abiotic stresses [108], primary regulatory processes in response to N deficit and transcriptome responses in C. sativus leaves [109], and the mechanism of melatonin-induced lateral root formation.

In watermelon breeding, dwarfism is a valuable trait because it contributes to higher yields and reduces labor in cultivation and harvesting. However, the genetic regulation of this trait is not well understood. To investigate it, researchers used NGS to analyze watermelon samples and identified a candidate dwarfism gene, Cla010726. They performed whole-genome resequencing of DNA bulks from dwarf and vine pools of an F2 population and detected a genomic region containing the candidate gene through a genome-wide analysis of SNPs [112]. Rao et al. [113] developed a high-density linkage map for bitter gourd using genotyping-by-sequencing (GBS) technology and performed QTL analysis for six major yield-contributing traits. The study used a mapping population generated from the cross DBGy-201 × Pusa Do Mausami and identified 19 QTLs for the six quantitative traits. The QTLs derived from each parent had either a positive or a negative additive effect on trait scores. The phenotypic variation explained by the QTLs ranged from 0.09% to 32.65%, and six major QTLs were detected.
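The comparison of dwarf and vine DNA bulks described above for watermelon can be sketched with a simple statistic. The text does not state which statistic the study used; the SNP-index (the fraction of reads carrying the alternate allele in a bulk) and its difference between bulks are one widely used choice and are an assumption here, as are the read counts, the 0.6 threshold, and the function names.

```python
def snp_index(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads in a bulk that carry the alternate allele."""
    return alt_reads / total_reads


def delta_snp_index(dwarf: tuple[int, int], vine: tuple[int, int]) -> float:
    """SNP-index of the dwarf bulk minus that of the vine bulk."""
    return snp_index(*dwarf) - snp_index(*vine)


def candidate_sites(sites, threshold: float = 0.6) -> list[int]:
    """Positions where the two bulks differ strongly in allele frequency."""
    return [pos for pos, dwarf, vine in sites
            if abs(delta_snp_index(dwarf, vine)) >= threshold]


# Hypothetical sites: (position, (alt, total) in dwarf bulk, (alt, total) in vine bulk)
sites = [
    (1000, (48, 50), (10, 50)),   # strongly differentiated
    (2000, (25, 50), (26, 50)),   # unlinked: both bulks near 0.5
    (3000, (2, 40), (36, 40)),    # strongly differentiated
]
print(candidate_sites(sites))
```

Sites unlinked to the trait are expected to show similar allele frequencies in both bulks, so a cluster of neighboring sites with large absolute differences points to the genomic region carrying the causal gene.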

Table 1.3 Different NGS techniques used for targeted breeding of vegetable crops.

| Sr. no. | Targeted breeding trait | NGS technique | References |
|---|---|---|---|
| 1. | Fruit crops’ parthenocarpy, a crucial characteristic affecting yield and quality | RNA-seq (Illumina HiSeq 2000) | [107] |
| 2. | Trichomes, which resemble epidermal hairs, serve as a plant’s defence mechanism for biotic and abiotic conditions | RNA-seq (Illumina HiSeq 2000) | [108] |
| 3. | Early regulatory mechanism of plants in response to N starvation, transcriptome response of leaves under N deficiency | RNA-seq (Illumina HiSeq 2000) | [109] |
| 4. | Molecular mechanisms of plant sex determination | RNA-seq (Illumina) | [110] |
| 5. | Examine the factors that underlie melatonin-induced lateral root development in salt-stressed plants | RNA-seq (Illumina) | [111] |

1.5 Next-Generation Sequencing Approach in Metagenomics

Metagenomics is the study of microbial and viral populations in targeted samples through nucleic acid sequencing [114]. Very few metagenomic studies have been conducted in vegetable crops. Tomato production is threatened worldwide by begomoviruses, and studies have therefore been carried out to understand the population dynamics of the begomoviruses affecting tomato. Next-generation sequencing was employed to assess the diversity of single-stranded DNA viruses in tomatoes with or without the Ty-1 gene, which provides tolerance to begomoviruses. Leaf samples with begomovirus-like symptoms were enriched for circular DNA and sequenced using Illumina technology. Fifteen distinct viruses/subviral agents were detected, mainly in mixed infections, which has implications for the use of virus-specific resistance in tomato breeding. Additional viruses, including two distinct begomovirus species, a new alpha-satellite, and a new gemycircularvirus, were found in tomatoes lacking the Ty-1 gene. The only begomovirus that caused serious symptoms in Ty-1 crops was a unique begomovirus that was identified only in the Ty-1 pool. This study sheds light on the potential adaptation of begomoviruses to Ty-1 and its effects on a subset of single-stranded DNA viral/subviral agents [115].

To investigate the microbial communities present on Chinese cabbage, NGS-based 16S rRNA gene amplicon sequencing was employed. The results indicated that the bacterial communities on Chinese cabbage were composed primarily of Proteobacteria and Bacteroidetes, with Chryseobacterium, Aurantimonadaceae, Sphingomonas, and Pseudomonas being the most abundant taxa. The study also detected diverse potential pathogens, such as Pantoea, Erwinia, Klebsiella, Yersinia, Bacillus, Staphylococcus, Salmonella, and Clostridium. Although further studies are required to determine the association between these potential pathogens and foodborne illness, the study suggests that metagenomic approaches can be used to detect pathogenic bacteria on fresh vegetables [116].

Gawande et al. [117] used the Illumina MiSeq platform to analyze the gut microbial community structure of adult onion thrips (Thrips tabaci) collected from 10 agro-climatically diverse locations in India. The analysis revealed 1662 OTUs belonging to 21 bacterial phyla, with Proteobacteria being the predominant phylum. The Colorado potato beetle (CPB) is a serious insect pest that can develop resistance to insecticides. Biological insecticides based on viruses may be a promising approach to controlling this pest, but there is limited information on viruses that infect leaf-feeding beetles. A metagenomic analysis of 297 CPB genomic and transcriptomic samples was therefore performed. The study discovered 3137 virus-positive contigs linked to various viruses from six types, 17 orders, and 32 families, corresponding to over 97 virus species. The sequences were homologous to insect viruses, plant viruses, and endogenous retroviral elements. The findings imply that additional investigation may help to locate novel viruses and so widen the range of biopesticides available to combat CPB [118].
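Statements such as "Proteobacteria being the predominant phylum" rest on summarizing OTU read counts by taxonomic rank. A minimal sketch of that summary is given below; the OTU identifiers, counts, and phylum assignments are invented for illustration, and the function name is hypothetical. Real amplicon workflows also include quality filtering, chimera removal, and normalization steps that are omitted here.

```python
from collections import defaultdict


def phylum_relative_abundance(otu_counts: dict[str, int],
                              otu_phylum: dict[str, str]) -> dict[str, float]:
    """Sum OTU read counts per phylum and convert them to proportions."""
    totals: dict[str, int] = defaultdict(int)
    for otu, count in otu_counts.items():
        totals[otu_phylum[otu]] += count
    grand_total = sum(totals.values())
    return {phylum: count / grand_total for phylum, count in totals.items()}


# Hypothetical OTU table for one sample.
otu_counts = {"OTU1": 60, "OTU2": 20, "OTU3": 20}
otu_phylum = {"OTU1": "Proteobacteria",
              "OTU2": "Bacteroidetes",
              "OTU3": "Proteobacteria"}

abundance = phylum_relative_abundance(otu_counts, otu_phylum)
dominant = max(abundance, key=abundance.get)
print(abundance, dominant)
```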

1.6 Next-Generation Sequencing Approach in Transcriptomics

The transcriptome is the functional derivative of the genome and may be defined as the complete set of transcripts present in a cell at a given developmental stage and under a particular physiological condition. It comprises all the different RNA molecules present in a cell at a particular time. The study of the transcriptome therefore has the potential to reveal the functional role of the genome. In addition, transcriptome sequencing can yield information about the genes of an organism at a significantly lower cost than genome sequencing, because only the transcribed regions have to be sequenced and studied. Today, many research projects focus on the transcriptome rather than the genome, as only around 1–2% of a genome is coding. Some recent transcriptomic studies that highlight the multidimensional applications of NGS in the transcriptomics of vegetable crops are described here in detail.

a) Discovery of Novel Genes

Transcriptome analysis has contributed most significantly to the identification of key genes that play important roles in living systems, and several such genes have been identified in vegetable crops. For example, the genes CsACS, CsAsr1, and CsIAA2, identified by transcriptome analysis, were found to be responsible for sex determination in cucumber [110]. Another set of cucumber genes, CsCUC3 and CsSTM, was found to regulate fruit spine development [108]. The approach has also been extended to tissue-specific genes: the discovery of several genes involved in anthocyanin biosynthesis that are specifically active in the roots and leaves of carrot will help accelerate breeding efforts to improve the nutritive value of carrots [119].

b) Understanding the Molecular Mechanisms Underlying Biological Functions

Understanding ripening can help us breed fruits with a better shelf life that retain their nutritional quality for long periods. A mutant of LeMADS-RIN, a MADS-box gene, has been found to inhibit ripening in tomato, and a complete understanding of this mutation has the potential to explain the genetic regulatory pathways that lead to fruit ripening. The study in [120] compared the fruit transcriptomes of mutant and non-mutant genotypes to gain better insight into the role of ethylene in ripening. Mutant cauliflowers with green curds caused by ectopic chloroplasts were subjected to genome-wide profiling of gene expression by RNA-seq. As a result, several genes associated with the development and differentiation of chloroplasts were discovered, and the results hinted at the significance of regulatory genes in light signaling pathways [121]. In less-studied vegetable crops such as sweet potato, characterization of the transcriptome helps to unravel the molecular aspects of cellular processes such as the development and differentiation of roots and leaves, differential gene expression in different tissues, and potential responses of the plant to biotic and abiotic stress [122]. All the genomic and transcriptomic data available for Daucus carota ssp. sativus were compiled into CarrotDB (Carrot Data Base). This database contains tools for genome mapping and the Basic Local Alignment Search Tool (BLAST). It is a powerful resource with many possible uses: it may be used to discover the sequences of novel or putative genes or scaffolds, to develop SSR markers, to predict the sequences of different proteins found in carrot, and so on [43].

c) Development of Markers

Trick et al. [123] developed a method of SNP discovery in cultivars of the polyploid B. napus using transcriptome analysis on the Illumina platform. Sequence reads were aligned to single-gene sequences of Brassica with the MAQ software. Consequently, 23,330–42,593 putative SNPs were detected, depending on read depth. Of these, ∼90% were hemi-SNPs, which means that they were homozygous in one line but heterozygous in the other [124]. These hemi-SNPs can serve as powerful tools for genetic mapping. Blanca et al. [125] carried out their studies in the previously unexplored C. pepo. They sequenced many genes, found several allelic variations, and developed SSR and SNP markers; their findings will accelerate the breeding prospects of squash. Similar marker discovery was carried out in Capsicum by Nicolai et al. [126].
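The hemi-SNP definition given above (homozygous in one line, heterozygous in the other) translates directly into a small classifier. The sketch below assumes that each line's call at a site is written as a two-character genotype string such as "AA" or "AG"; this encoding and the function name are illustrative assumptions rather than the representation used in the cited work.

```python
def classify_site(line_a: str, line_b: str) -> str:
    """Classify a site from the two-character genotype calls of two lines."""
    het_a = line_a[0] != line_a[1]
    het_b = line_b[0] != line_b[1]
    if het_a != het_b:
        # Exactly one line appears heterozygous: hemi-SNP as defined in the text.
        return "hemi-SNP"
    if not het_a and line_a != line_b:
        # Both lines homozygous but for different alleles.
        return "simple SNP"
    return "other"


# Hypothetical calls at four sites.
for a, b in [("AA", "AG"), ("AA", "GG"), ("AG", "AG"), ("CC", "CC")]:
    print(a, b, classify_site(a, b))
```

In a polyploid such as B. napus, an apparently heterozygous call frequently reflects a difference between homoeologous gene copies rather than between alleles, which is one reason this class is so common.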

d) Understanding Abiotic Stress

Abiotic stresses such as high temperature, soil salinity, excess humidity, drought, and metal deficiency or toxicity have adverse effects on plants, and plants counter these stresses by altering their genetic responses. As a result, the gene expression pattern of a plant changes drastically on exposure to abiotic stress, and transcriptomic analysis of these changes has great research potential. Lee and Choi [127] carried out a comparative transcriptome analysis of non-stressed and cold-stressed pepper and identified many genes, hormones, and processes involved in cold stress tolerance. Drought tolerance has been observed in potato plants transformed with the yeast trehalose-6-phosphate synthase (ScTPS1) gene. The transcriptome profiles of the transgenic and wild-type plants can be compared for differential gene expression patterns under drought stress conditions [128]. The molecular response of radish to the toxic heavy metal lead was studied by Wang et al. [129] using RNA-seq technology. The upregulated differentially expressed genes (DEGs) under lead stress were mainly involved in glutathione metabolism-related processes and cell wall defense, whereas the downregulated DEGs were predominantly involved in carbohydrate metabolism-related pathways.
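The notion of up- and downregulated DEGs can be made concrete with a minimal sketch based on the log2 fold change between a stressed and a control sample. The gene names, counts, pseudocount, and the threshold of one log2 unit are all hypothetical. A real analysis would use biological replicates, library-size normalization, and a statistical test, none of which are shown here.

```python
import math


def classify_genes(control: dict[str, float], stressed: dict[str, float],
                   min_abs_log2fc: float = 1.0,
                   pseudocount: float = 1.0) -> dict[str, str]:
    """Label each gene 'up', 'down', or 'unchanged' from its log2 fold change."""
    labels = {}
    for gene, control_value in control.items():
        # The pseudocount avoids division by zero for unexpressed genes.
        ratio = (stressed[gene] + pseudocount) / (control_value + pseudocount)
        log2fc = math.log2(ratio)
        if log2fc >= min_abs_log2fc:
            labels[gene] = "up"
        elif log2fc <= -min_abs_log2fc:
            labels[gene] = "down"
        else:
            labels[gene] = "unchanged"
    return labels


# Hypothetical normalized expression values for three genes.
control = {"gene_1": 10, "gene_2": 100, "gene_3": 50}
stressed = {"gene_1": 80, "gene_2": 20, "gene_3": 55}
print(classify_genes(control, stressed))
```

The resulting up and down lists are what would subsequently be tested for enrichment in pathways such as glutathione metabolism or carbohydrate metabolism.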

e) Identifying Viruses Infecting Vegetables

When a plant is attacked by viruses or viroids, specific RNA molecules are synthesized. These RNAs are 21 to 24 nucleotides long and are called short interfering RNAs (siRNAs). They survey the cytoplasm for viral sequences homologous to the inducer and, upon recognition, pair with them, leading to their destruction. This phenomenon is called RNAi or RNA silencing. NGS of siRNAs therefore has the potential to identify viruses or viroids that infect plants. It can be a very sensitive identification tool because it works even at very low concentrations and in infections that do not show any symptoms. Most importantly, it can help to discover previously unknown viruses and viroids.
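A first step in siRNA-based virus detection is to keep only reads in the 21 to 24 nucleotide range mentioned above. The sketch below shows this size selection together with a length profile of a read set; the reads are synthetic placeholders and the function names are hypothetical. Subsequent steps such as assembling the selected reads and comparing them with viral databases are not shown.

```python
from collections import Counter


def sirna_candidates(reads: list[str], min_len: int = 21,
                     max_len: int = 24) -> list[str]:
    """Keep reads whose length falls in the siRNA size range."""
    return [read for read in reads if min_len <= len(read) <= max_len]


def length_profile(reads: list[str]) -> Counter:
    """Count reads per length, which shows whether the siRNA classes dominate."""
    return Counter(len(read) for read in reads)


# Synthetic reads of lengths 18, 21, 22, 24, and 30 nt.
reads = ["A" * 18, "C" * 21, "G" * 22, "T" * 24, "A" * 30]
selected = sirna_candidates(reads)
print(len(selected), sorted(length_profile(selected).items()))
```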

1.7 Next-Generation Sequencing Approach in Exome and Captured Sequencing

Polyploidization has played a major role during evolution in shaping complex genomes, especially in vegetable crops. As a result, vegetables such as Brassicas generally have large genomes with a large number of repeats. Even after the advent of NGS, such large genomes remain difficult, inconvenient, and costly to sequence and analyze by whole-genome sequencing of a large number of individuals. One way to solve this problem without compromising on the amount of information gathered is “sequence capture” or “targeted sequencing”. In this technique, only the 1–2% of the genome that consists of high-value sequences needs to be sequenced. These may be sequences rich in the functional part of the genome or regions with few repeats, and they include either specific genes of interest or targets within these genes. When the captured sequence represents the entire protein-coding region of the genome, it is called the exome.

Since a major portion of the genome is composed of repetitive and non-coding regions, this technique substantially reduces the bulk of the DNA to be sequenced while providing almost the same amount of information. This allows us to deeply focus on the important regions of the genome that are ultimately responsible for the production of genetic variation.
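The reduction in sequencing effort can be estimated with simple arithmetic. The sketch below compares the number of bases needed for whole-genome sequencing with the number needed when only 2% of the genome is targeted. The genome size (1.13 Gb, the B. napus estimate in Table 1.1), the 30x depth, and the 100 individuals are illustrative assumptions, and the calculation ignores off-target reads, which lower the efficiency of capture in practice.

```python
def bases_required(genome_size_bp: int, target_fraction: float,
                   depth: float, n_individuals: int) -> float:
    """Total bases to sequence for a given target fraction, depth, and sample count."""
    return genome_size_bp * target_fraction * depth * n_individuals


GENOME_BP = 1_130_000_000   # assumed example genome size (1.13 Gb)
DEPTH = 30                  # assumed sequencing depth
N = 100                     # assumed number of individuals

whole_genome = bases_required(GENOME_BP, 1.0, DEPTH, N)
captured = bases_required(GENOME_BP, 0.02, DEPTH, N)

print(f"whole genome: {whole_genome:.3g} bases")
print(f"2% capture:   {captured:.3g} bases")
print(f"reduction:    {whole_genome / captured:.0f}-fold")
```

With a 1% target the reduction would double, which is why capture makes population-scale studies feasible for large, repeat-rich genomes.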

Captured sequencing may be carried out by one of the following methods [130]:

Hybridization-based sequence capture

PCR-based amplification

Selective circularization

Of these, the hybridization-based approach is the most commonly used because it is simple, fast, and inexpensive, requires a minimal input of DNA, and allows a large amount of DNA to be analyzed [131]. Hybridization-based exome capture may be done on arrays or in solution. The arrays carry probes from the sequence library, which act as baits to enrich the target sequences on the array; the enriched sequences can then be sequenced by NGS techniques.
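How baits are laid out over a target can be illustrated with a simple tiling sketch. It assumes a bait length of 120 nt and a step of 60 nt (that is, 2x tiling); both values and the function name are assumptions chosen for illustration, not parameters taken from the cited studies. Real bait design also filters for repeats, GC content, and cross-hybridization.

```python
def tile_baits(target: str, bait_len: int = 120, step: int = 60) -> list[str]:
    """Return overlapping baits that together cover the whole target sequence."""
    if len(target) <= bait_len:
        return [target]
    last_start = len(target) - bait_len
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # add a final bait so the target end is covered
    return [target[start:start + bait_len] for start in starts]


# Synthetic 250 nt target sequence.
target = "ACGTA" * 50
baits = tile_baits(target)
print(len(baits), {len(b) for b in baits})
```

Overlapping baits give each target base more than one chance of being captured, which evens out coverage across the enriched region.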

1.8 Applications of Exome and Captured Sequencing in Crop Research

Exome and captured sequencing can contribute to crop research in the following ways:

Identification of mutations in exomes and quick generation of novel and targeted polymorphisms

Exploration of biodiversity and gene mining

Development of SNP markers

Construction of genetic maps

Study of population structure, evolutionary history, and phylogenetic relationships

QTL mapping and gene identification

Genomic selection

At present, very limited studies using exome and captured sequencing have been conducted in vegetable crops.

In Manihot esculenta (cassava), the transcriptomes of 16 accessions were sequenced using the Illumina HiSeq platform, and 675,559 EST-derived SNP markers were identified [132]. A subset of these markers was genotyped by capture-based targeted enrichment sequencing in 100 F1 progeny developed from a cross between parents contrasting in starch viscosity. This study uncovered a major quantitative trait locus (QTL) regulating starch pasting properties, and a novel QTL associated with starch pasting time was also identified. This information can be used further in research and breeding activities.

The majority of disease resistance genes encode nucleotide-binding site leucine-rich repeat (NBS-LRR) proteins. NGS allows the identification of pathogen resistance gene families of this type. RenSeq technology was used in Solanum spp. (potato and tomato), and many NBS-LRRs were subsequently identified [133]. In this way, new disease resistance genes can be identified.

Uitdewilligen et al.[134] used an in-solution hybridization method called “SureSelect” for DNA sequencing. This reduced the complexity of the tetraploid cultivars of potatoes and allowed them to focus on 807 target