Genotyping by Sequencing for Crop Improvement - E-Book

Description

GENOTYPING BY SEQUENCING FOR CROP IMPROVEMENT

A thoroughly up-to-date exploration of genotyping-by-sequencing technologies and related methods in plant science

In Genotyping by Sequencing for Crop Improvement, a team of distinguished researchers delivers an in-depth and current exploration of the latest advances in genotyping-by-sequencing (GBS) methods, the statistical approaches used to analyze GBS data, and its applications, including quantitative trait loci (QTL) mapping, genome-wide association studies (GWAS), and genomic selection (GS) in crop improvement. This edited volume includes insightful contributions on a variety of relevant topics, like advanced molecular markers, high-throughput genotyping platforms, whole genome resequencing, QTL mapping with advanced mapping populations, analytical pipelines for GBS analysis, and more. The distinguished contributors explore traditional and advanced markers used in plant genotyping in extensive detail, and advanced genotyping platforms that cater to unique research purposes are discussed, as is the whole-genome resequencing (WGR) methodology. The included chapters also examine the applications of these technologies in several different crop categories, including cereals, pulses, oilseeds, and commercial crops.

Genotyping by Sequencing for Crop Improvement also offers:

* A thorough introduction to molecular marker techniques and recent advancements in the technology
* Comprehensive explorations of the genotyping of seeds while preserving their viability, as well as advances in genomic selection
* Practical discussions of opportunities and challenges relating to high throughput genotyping in polyploid crops
* In-depth examinations of recent advances and applications of GBS, GWAS, and GS in cereals, pulses, oilseeds, millets, and commercial crops

Perfect for practicing plant scientists with an interest in genotyping-by-sequencing technology, Genotyping by Sequencing for Crop Improvement will also earn a place in the libraries of researchers and students seeking a one-stop reference on the foundational aspects of, and recent advances in, genotyping-by-sequencing, genome-wide association studies, and genomic selection.

Page count: 900

Year of publication: 2022

Table of Contents

Cover

Title Page

Copyright Page

Dedication

List of Contributors

Preface

1 Molecular Marker Techniques and Recent Advancements

1.1 Introduction

1.2 What is a Molecular Marker?

1.3 Classes of Molecular Markers

1.4 Sequencing‐based Markers

1.5 Recent Advances in Molecular Marker Technologies

1.6 SNP Databases

1.7 Application of Molecular Markers

1.8 Summary

References

2 High‐throughput Genotyping Platforms

2.1 Introduction

2.2 SNP Genotyping Platforms

References

3 Opportunity and Challenges for Whole‐Genome Resequencing‐based Genotyping in Plants

3.1 Introduction

3.2 Basic Steps Involved in Whole‐Genome Sequencing and Resequencing

3.3 Whole‐Genome Resequencing Mega Projects in Different Crops

3.4 Whole‐Genome Pooled Sequencing

3.5 Pinpointing Gene Through Whole‐Genome Resequencing‐based QTL Mapping

3.6 Online Resources for Whole‐Genome Resequencing Data

3.7 Applications and Successful Examples of Whole‐Genome Resequencing

3.8 Challenges for Whole‐Genome Resequencing Studies

3.9 Summary

References

4 QTL Mapping Using Advanced Mapping Populations and High‐throughput Genotyping

4.1 Introduction

4.2 The Basic Objectives of QTL Mapping

4.3 QTL Mapping Procedure

4.4 The General Steps for QTL Mapping

4.5 Factors Influencing QTL Analysis

4.6 QTL Mapping Approaches

4.7 Statistical Methods for QTL Mapping

4.8 Software for QTL Mapping

4.9 Bi‐parental Mapping Populations

4.10 QTL Mapping Using Bi‐parental Populations

4.11 Multiparental Mapping Populations

4.12 QTL Mapping Using Multiparental Populations

4.13 Use of High‐throughput Genotyping for QTL Mapping

4.14 Next‐Generation Sequencing‐based Genotyping

4.15 Challenges with QTL Mapping Using Multiparental Populations and High‐throughput Genotyping

References

5 Genome‐Wide Association Study: Approaches, Applicability, and Challenges

5.1 Introduction

5.2 Methodology to Conduct GWAS in Crops

5.3 Statistical Modeling in GWAS

5.4 Efficiency of GWAS with Different Marker Types

5.5 Computational Tools for GWAS

5.6 GWAS Challenges for Complex Traits

5.7 Factors Challenging the GWAS for Complex Traits

5.8 GWAS Applications in Major Crops

5.9 Candidate Gene Identification at GWAS Loci

5.10 Meta‐GWAS

5.11 GWAS vs. QTL Mapping

References

6 Genotyping of Seeds While Preserving Their Viability

6.1 Introduction

6.2 Genotyping‐by‐Sequencing with Minimum DNA

6.3 DNA Extraction from Half Grain

6.4 GBS with Half Seed

6.5 Applications of GBS as Diagnostic Tool

6.6 Summary

References

7 Genomic Selection: Advances, Applicability, and Challenges

7.1 Introduction

7.2 Natural Selection

7.3 Breeding Selection

7.4 Marker‐assisted Selection

7.5 Genomic Selection

7.6 Genotyping for Genomic Selection

7.7 Integration of Genomic Selection in MAS Program

7.8 The Efficiency of Genomic Selection for Complex Traits

7.9 Integration of Genomic Selection in the Varietal Trial Program

7.10 Cost Comparison of GS vs MAS

References

8 Analytical Pipelines for the GBS Analysis

8.1 Introduction

8.2 Applications of NGS

8.3 NGS Sequencing Platforms

8.4 Tools for NGS Data Analysis

8.5 Generalized Procedure for NGS Data Analysis

8.6 Variant Annotation

8.7 Role of NGS Informatics in Identifying Variants

8.8 Genotyping by Sequencing

8.9 Analytical Pipelines for GBS

8.10 Comparison of GBS Pipelines

References

9 Recent Advances and Applicability of GBS, GWAS, and GS in Maize

9.1 Introduction

9.2 Maize Genetics

9.3 Importance of Genomics and Genotyping‐based Applications in Maize Breeding Programs

9.4 GBS‐based QTL Mapping in Maize

9.5 GBS Protocols and Analytical Pipelines for Maize

9.6 Maize Genome Sequencing and Resequencing

9.7 Genotyping‐by‐Sequencing‐based GWAS and GS Efforts in Maize

9.8 Summary

References

10 Recent Advances and Applicability of GBS, GWAS, and GS in Soybean

10.1 Introduction

10.2 GBS Efforts in Soybean

10.3 High‐Density Linkage Maps in Soybean

10.4 GBS Protocols and Analytical Pipelines for Soybean

10.5 GBS‐based QTL Mapping Efforts in Soybean

10.6 Soybean Genome Sequencing and Resequencing

10.7 GBS‐based GWAS Efforts in Soybean

10.8 GBS‐based Genomic Selection Efforts in Soybean

References

11 Advances and Applicability of Genotyping Technologies in Cotton Improvement

11.1 Introduction

11.2 Challenges due to Polyploidy in Cotton

11.3 Applications of Genomics and Genotyping for Cotton Breeding Programs

11.4 Genotyping Efforts in Cotton

11.5 High‐Density Linkage Maps in Cotton

11.6 Whole‐Genome Sequencing of Cotton Germplasm

11.7 Application of GBS Technology in Cotton Research

11.8 GBS‐based Bi‐Parental QTL Mapping and Association Mapping in Cotton

11.9 Summary and Outlook

References

12 Recent Advances and Applicability of GBS, GWAS, and GS in Millet Crops

12.1 Introduction

12.2 GBS Efforts in Millet Crops

12.3 High‐density Linkage Maps in Millet Crops

12.4 GBS‐based QTL Mapping Efforts in Millet Crops

12.5 Genome Sequencing and Resequencing of Millet Crops

12.6 GBS‐based GWAS Efforts in Millet Crops

12.7 GBS‐based Genomic Selection (GS) Efforts in Millet Crops

12.8 Summary

References

13 Recent Advances and Applicability of GBS, GWAS, and GS in Pigeon Pea

13.1 Introduction

13.2 Pigeon Pea Sequencing and Resequencing

13.3 Development of Pigeon Pea High‐density Genotyping Platforms

13.4 Development of High‐density Linkage Maps in Pigeon Pea

13.5 QTL Analysis Using High‐density Genotyping Platforms and GBS

13.6 GWAS Efforts in Pigeon Pea

13.7 Genomic Selection (GS) Efforts in Pigeon Pea

13.8 Summary

References

14 Opportunity and Challenges for High‐throughput Genotyping in Sugarcane

14.1 Introduction

14.2 Sugarcane Genome and Genetics

14.3 Genetic Studies and Marker Systems

14.4 Genotyping‐by‐Sequencing (GBS)

14.5 SNP Calling Using GBS Pipelines

14.6 Sugarcane Genome Sequencing

14.7 Linkage and QTL Mapping in Sugarcane

14.8 GWAS in Sugarcane

14.9 Genomic Selection in Sugarcane

14.10 Summary

References

15 Recent Advances and Applicability of GBS, GWAS, and GS in Polyploid Crops

15.1 Introduction

15.2 Challenges for Genotyping in Polyploidy Crops

15.3 Genotyping Platforms for Barley

15.4 Long‐Read Sequencing‐based Genotyping in Polyploid Canola

15.5 Peanut Genotyping with Targeted Amplicon Sequencing

15.6 SNP Genotyping Methods and Platforms Available for Sugarcane

15.7 Recent Advances and Applicability of GBS, GWAS, and GS in Polyploidy Crop Species

15.8 Haplotype‐based Genotyping

15.9 GBS Analytical Pipelines for Polyploids

15.10 GBS‐based QTL Mapping Efforts in Polyploids

15.11 GWAS and GS Using High‐throughput Genotyping in Polyploidy Crops

References

16 Recent Advances and Applicability of GBS, GWAS, and GS in Oilseed Crops

16.1 Introduction

16.2 GBS Efforts in Oilseed Crops

16.3 High‐density Linkage Maps for Oilseed Crops

16.4 GBS Protocols and Analytical Pipelines

16.5 GBS‐based QTL Mapping Efforts in Oilseed Crops

16.6 GBS‐based GWAS Efforts in Oilseed Crops

References

Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1 Details of the other important molecular markers.

Table 1.2 Comparison between different marker techniques commonly used in p...

Table 1.3 List of important online SNP databases.

Chapter 2

Table 2.1 Customized SNP array details in plant species.

Chapter 4

Table 4.1 The different software used for quantitative trait loci (QTL) mapping....

Table 4.2 Studies that utilized high‐throughput genotyping for QTL mapping....

Chapter 5

Table 5.1 Popular bioinformatics software and tools available for GWAS anal...

Table 5.2 Genome‐wide association studies (GWAS) conducted for dissection o...

Table 5.3 Candidate genes identified through genome‐wide association studie...

Chapter 7

Table 7.1 Table showing different crop plants where GBS was used (Adapted f...

Table 7.2 Table showing various GS studies carried out in different crops (...

Chapter 8

Table 8.1 Different variant identification tools.

Table 8.2 Tools for variant annotation.

Chapter 9

Table 9.1 Linkage map developed in maize using genotyping by sequencing (GB...

Table 9.2 List of GBS‐based QTL mapping studies in maize.

Table 9.3 GBS‐based GWAS efforts in maize.

Chapter 10

Table 10.1 High‐throughput genomics platforms used for soybean genotyping....

Table 10.2 List of GBS‐based QTL mapping studies in soybean.

Table 10.3 Details of efforts performed for whole‐genome resequencing and r...

Table 10.4 List of GBS‐based GWAS studies in soybean.

Chapter 11

Table 11.1 List of genomic resources available in cotton.

Table 11.2 Development of various interspecific and intraspecific linkage (...

Table 11.3 List of genome wide association studies (GWAS) in cotton.

Table 11.4 List of GBS‐based QTL mapping studies in cotton.

Chapter 12

Table 12.1 High‐throughput genomic platforms used for genotyping of millet ...

Table 12.2 List of GBS‐based QTL mapping studies in millet crops.

Table 12.3 Details of efforts performed for whole‐genome resequencing and r...

Table 12.4 Summarized millet genome assembly statistics.

Table 12.5 The list of GBS‐based GWAS studies in millets.

Chapter 13

Table 13.1 Whole‐genome resequencing studies in pigeon pea.

Table 13.2 List of high‐density genotyping platforms developed for pigeon p...

Table 13.3 High‐density linkage maps generated in pigeon pea by using the r...

Table 13.4 List of significant QTLs identified using GBS and high‐density S...

Chapter 14

Table 14.1 Studies on linkage and QTL mapping in sugarcane through GBS‐base...

Table 14.2 Studies on GWAS for various traits in sugarcane using GBS marker...

Chapter 16

Table 16.1 List of GBS‐based QTL studies conducted in oilseed crops.

Table 16.2 List of GBS‐based GWAS studies in oilseed crops.

List of Illustrations

Chapter 1

Figure 1.1 An example of GBS and GBS data analysis workflow for identificati...

Figure 1.2 Steps in KASP reaction: (a) annealing: allele‐specific primer bin...

Chapter 2

Figure 2.1 A pipeline for SNP discovery (S1–S10 are different diverse access...

Figure 2.2 A schematic representation of different SNP genotyping technologi...

Figure 2.3 Illustration of various steps involved in the generation of RAD‐b...

Figure 2.4 Schematic illustration of work‐flow in a MALDI‐TOF MS

Chapter 3

Figure 3.1 Diagrammatic representation of various high‐throughput‐sequencing...

Figure 3.2 Genome‐wide association studies (GWAS) in rice seedling for salt‐...

Chapter 4

Figure 4.1 Steps involved in bulked segregant analysis (BSA) used for QTL ma...

Figure 4.2 Steps involved in MutMap approach used to map the QTLs of target ...

Figure 4.3 Steps involved in the QTL‐seq approach used to map the QTLs of th...

Figure 4.4 Steps involved in bulked segregant RNA sequencing (BSR‐Seq) used ...

Figure 4.5 Steps involved in Indel sequencing used to map the QTLs of the ta...

Figure 4.6 Schematic representation of different types of biparental mapping...

Figure 4.7 Genotyping of segregating population using KASPar assay.

Figure 4.8 Genotyping of segregating population using Sequenom MassARRAY sys...

Chapter 5

Figure 5.1 Methodology to conduct GWAS in crops. It can be divided into thre...

Figure 5.2 (a) The figure illustrating the traits for which genome‐wide asso...

Chapter 6

Figure 6.1 DNA extraction from seed endosperm.

Figure 6.2 Seed DNA‐based genotyping‐by‐sequencing using laser microdissecti...

Chapter 7

Figure 7.1 Overview of estimate marker effects in order to get a genomic est...

Figure 7.2 The methodology involved in marker‐assisted selection (MAS) and gen...

Chapter 8

Figure 8.1 Evolution of next‐generation sequencing.

Figure 8.2 Sequencing and assembly of DNA.

Figure 8.3 The workflow showing the steps involved in NGS data analysis.

Chapter 10

Figure 10.1 World soybean production and productivity in 2019–2020. (a) Prod...

Figure 10.2 World soybean oil production and soymeal export in the year 2019...

Figure 10.3 Various pipelines and steps are involved in analyzing GBS data. ...

Figure 10.4 Schematic representations of steps involved in association mappi...

Figure 10.5 General steps involved in the GBS protocol for plant breeding....

Figure 10.6 Diagrammatic representation of application of genotyping by sequ...

Chapter 11

Figure 11.1 Integrated genomics and breeding approaches for cotton improveme...

Figure 11.2 Name of the nine intra‐specific and four inter‐specific datasets...

Chapter 12

Figure 12.1 Schematic representation of the important characteristics of mil...

Figure 12.2 Applications of whole‐genome sequence (WGS) of millets.

Figure 12.3 General steps involved in genomic selection.

Chapter 13

Figure 13.1 Depicting the genomics and phenomics for the exploitation of pig...

Chapter 14

Figure 14.1 GBS adapters, PCR and sequencing primers. (a) Sequences of doubl...

Figure 14.2 Steps in GBS library construction. Note: Up to 96 DNA samples ca...

Figure 14.3 Schematic representation of genomic selection processes from tra...

Chapter 15

Figure 15.1 GBS data of seven barley chromosomes 1H–7H showing genetic diver...

Figure 15.2 Illustration to depict that the simplex markers show similar mod...

Figure 15.3 Diagrammatic overview of UGbs‐Flex Pipeline developing GBS refer...

Chapter 16

Figure 16.1 Genomic distribution of single‐nucleotide polymorphism (SNPs) ma...

Genotyping by Sequencing for Crop Improvement

Edited by

Humira Sonah

National Agri‐Food Biotechnology Institute

Punjab, India

Vinod Goyal

CCS Haryana Agriculture University

Hisar, India

S.M. Shivaraj

Laval University

Quebec City, QC, Canada

Rupesh K. Deshmukh

National Agri‐Food Biotechnology Institute

Punjab, India

This edition first published 2022
© 2022 John Wiley & Sons Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Humira Sonah, Vinod Goyal, S.M. Shivaraj, and Rupesh K. Deshmukh to be identified as the authors of the editorial material in this work has been asserted in accordance with law.

Registered Offices
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial Office
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print‐on‐demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty
The contents of this work are intended to further general scientific research, understanding, and discussion only and are not intended and should not be relied upon as recommending or promoting scientific method, diagnosis, or treatment by physicians for any particular patient. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of medicines, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each medicine, equipment, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging‐in‐Publication Data

Names: Sonah, Humira, editor. | Goyal, Vinod, editor. | Shivaraj, S. M., editor. | Deshmukh, Rupesh editor.
Title: Genotyping by sequencing for crop improvement / edited by Humira Sonah, Vinod Goyal, S. M. Shivaraj, Rupesh K. Deshmukh.
Description: First edition. | Hoboken, NJ, USA : John Wiley & Sons, Inc., [2022] | Includes bibliographical references and index.
Identifiers: LCCN 2021046855 (print) | LCCN 2021046856 (ebook) | ISBN 9781119745655 (hardback) | ISBN 9781119745662 (adobe pdf) | ISBN 9781119745679 (epub)
Subjects: LCSH: Genetics–Technique. | Gene mapping. | Genomics. | Plant genomes.
Classification: LCC QK981.45 .G46 2021 (print) | LCC QK981.45 (ebook) | DDC 572.8/62–dc23/eng/20211001
LC record available at https://lccn.loc.gov/2021046855
LC ebook record available at https://lccn.loc.gov/2021046856

Cover Design: Wiley
Cover Image: © Billion Photos/Shutterstock

Dedicated to the two most eminent agricultural scientists of Canada whose work in plant genomics and breeding helped in food security and inspired many young scientists worldwide.

Prof. Richard Bélanger Département de phytologie Université Laval, Canada

Prof. François Belzile Département de phytologie Université Laval, Canada

Dr. Humira Sonah
Dr. Vinod Goyal
Dr. S.M. Shivaraj
Dr. Rupesh K. Deshmukh

List of Contributors

Alish Alisha, Department of Gene Expression, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland

Gagandeep Singh Bajwa, Department of Plant Breeding and Genetics, Punjab Agricultural University, Ludhiana, Punjab, India

Vitthal T. Barvkar, Department of Botany, Savitribai Phule Pune University, Pune, Maharashtra, India

Shubham Bhardwaj, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

National Institute of Plant Genome Research (NIPGR), New Delhi, India

Dharminder Bhatia, Department of Plant Breeding and Genetics, Punjab Agricultural University, Ludhiana, Punjab, India

Bharat Char, Mahyco Research Centre, Mahyco Private Limited, Jalna, Maharashtra, India

Viswanathan Chinnusamy, Division of Plant Physiology, ICAR‐IARI, New Delhi, India

Shalu Choudhary, Mahyco Research Centre, Mahyco Private Limited, Jalna, Maharashtra, India

Rupesh K. Deshmukh, Agricultural Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Vikas Devkar, Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX, USA

Pallavi Dhiman, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Kishor Gaikwad, ICAR – National Institute for Biotechnology, New Delhi, India

Naina Garewal, Department of Biotechnology, Panjab University, Chandigarh, India

Dhananjay Narayanrao Gotarkar, International Rice Research Institute, Los Baños, Philippines

Md Aminul Islam, Department of Botany, Majuli College, Majuli, Assam, India

Priyanka Jain, ICAR – National Institute for Biotechnology, New Delhi, India

Riya Joon, Department of Biotechnology, Panjab University, Chandigarh, India

Swapnil B. Kadam, Department of Botany, Savitribai Phule Pune University, Pune, Maharashtra, India

Ravindra Ramrao Kale, ICAR‐Indian Institute of Rice Research, Hyderabad, Telangana, India

Ravneet Kaur, Department of Biotechnology, Panjab University, Chandigarh, India

Suneetha Kota, ICAR‐IIRR, Hyderabad, Telangana, India

Amit Kumar, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Kuldeep Kumar, ICAR – Indian Institute of Pulses Research, Kanpur, Uttar Pradesh, India

Manish Kumar, Department of Seed Science and Technology, Dr. Yashwant Singh Parmar University of Horticulture and Forestry, Solan, Himachal Pradesh, India

Sandeep Kumar, Xcelris Lab Pvt Ltd., Ahmedabad, Gujarat, India

Virender Kumar, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Surbhi Kumawat, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Brij Kishore Kushwaha, Department of Molecular Biology and Genetic Engineering, Bihar Agricultural University, Sabour Bhagalpur, Bihar, India

Omkar Maharudra Limbalkar, ICAR‐Division of Genetics, Indian Agriculture Research Institute, New Delhi, India

Rushil Mandlik, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Venugopal Mikkilineni, Mahyco Research Centre, Mahyco Private Limited, Jalna, Maharashtra, India

Pankaj S. Mundada, Department of Botany, Savitribai Phule Pune University, Pune, Maharashtra, India

Department of Biotechnology, Yashavantrao Chavan Institute of Science, Satara Maharashtra, India

Narender Negi, Department of Fruit Science, ICAR‐NBPGR Regional Station, Shimla, Himachal Pradesh, India

Anupama A. Pable, Department of Microbiology, Savitribai Phule Pune University, Maharashtra, India

Gunashri Padalkar, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Jayendra Padiya, Mahyco Research Centre, Mahyco Private Limited, Jalna, Maharashtra, India

Arushi Padiyal, Department of Seed Science and Technology, Dr. Yashwant Singh Parmar University of Horticulture and Forestry, Solan, Himachal Pradesh, India

Brajendra Parmar, ICAR‐IIRR, Hyderabad, Telangana, India

Gunvant B. Patil, Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX, USA

Vinaykumar Rachappanavar, Department of Seed Science and Technology, Dr. Yashwant Singh Parmar University of Horticulture and Forestry, Solan, Himachal Pradesh, India

Department of Agriculture, MS Swaminathan School of Agriculture, Shoolini University, Solan, Himachal Pradesh, India

Nitika Rajora, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Santosh Rathod, ICAR‐IIRR, Hyderabad, Telangana, India

Gaurav Raturi, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Rita, ICAR – National Institute for Biotechnology, New Delhi, India

Akshay S. Sakhare, Division of Plant Physiology, ICAR‐IARI, New Delhi, India

Swati Saxena, ICAR – National Institute for Biotechnology, New Delhi, India

Senthilkumar Shanmugavel, Crop Improvement Division, ICAR – Sugarcane Breeding Institute, Coimbatore, Tamil Nadu, India

Jitender Kumar Sharma, Department of Agriculture, School of Agriculture, Baddi University of Emerging Sciences & Technology, Baddi, Himachal Pradesh, India

Sandhya Sharma, ICAR – National Institute for Biotechnology, New Delhi, India

Shivani Sharma, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Yogesh Sharma, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Prashant Raghunath Shingote, Vasantrao Naik College of Agricultural Biotechnology, Dr. Panjabrao Deshmukh Krishi Vidyapeeth, Akola, Maharashtra, India

Harsha Srivastava, ICAR – National Institute for Biotechnology, New Delhi, India

Anuradha Singh, Department of Genomics, ICAR – National Institute on Plant Biotechnology, New Delhi, India

Kashmir Singh, Department of Biotechnology, Panjab University, Chandigarh, India

Manipal Singh, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Nisha Singh, Department of Genomics, ICAR – National Institute on Plant Biotechnology, New Delhi, India

Avinash Singode, ICAR – Indian Institute of Millets Research, Hyderabad, Telangana, India

Sweta Sinha, Department of Molecular Biology and Genetic Engineering, Bihar Agricultural University, Sabour Bhagalpur, Bihar, India

Sreeja Sudhakaran, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Lakshmipathy Thalambedu, Crop Improvement Division, ICAR – Sugarcane Breeding Institute, Coimbatore, Tamil Nadu, India

Vandana Thakral, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Prathima P. Thirugnanasambandam, Crop Improvement Division, ICAR – Sugarcane Breeding Institute, Coimbatore, Tamil Nadu, India

Anshuman Tiwari, Mahyco Research Centre, Mahyco Private Limited, Jalna, Maharashtra, India

Kishor Tribhuvan, ICAR – Indian Institute of Agricultural Biotechnology, Ranchi, Jharkhand, India

Abhijit Ubale, Mahyco Research Centre, Mahyco Private Limited, Jalna, Maharashtra, India

Sanskriti Vats, Agricultural Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Regional Centre for Biotechnology, Faridabad, Haryana (NCR Delhi), India

Joshita Vijayan, ICAR – National Institute for Biotechnology, New Delhi, India

Dhiraj Lalji Wasule, Vasantrao Naik College of Agricultural Biotechnology, Dr. Panjabrao Deshmukh Krishi Vidyapeeth, Akola, Maharashtra, India

Himanshu Yadav, Department of Agriculture Biotechnology, National Agri‐Food Biotechnology Institute (NABI), Mohali, Punjab, India

Preface

Recent advances in sequencing technology and computational resources have accelerated genomics and translational research in crop science. These technological advances have provided many opportunities in genomics‐assisted plant breeding to address issues related to food security. Among the several applications, genotyping‐by‐sequencing (GBS) technology has evolved as one of the frontier areas facilitating high‐throughput plant genotyping. GBS approaches have proved effective in genotyping‐based applications such as quantitative trait loci (QTL) mapping, genome‐wide association studies (GWAS), genomic selection (GS), and marker‐assisted breeding (MAB). Considering the current state of plant breeding, we decided to compile the advances in GBS methods, the statistical approaches used to analyze GBS data, and the applications of GBS in crop improvement, including QTL mapping, GWAS, and GS.

Presently, the food produced around the world is adequate for the existing population. However, the constantly increasing population is mounting pressure on the food production system. Hence, efficient utilization of technological advances and existing knowledge is essential to enhance food production to match the growing food demand. In this direction, most countries around the globe have adopted advanced genomic methodologies to breed superior plant genotypes. Among such technological advances, high‐throughput genotyping using GBS has shown promising results in different crop plants. GBS has predominantly been used for germplasm evaluation, evolutionary studies, development of dense linkage maps, QTL mapping, GWAS, GS, and MAB. Its cost‐effectiveness and whole‐genome coverage make GBS more reliable than other next‐generation sequencing (NGS) techniques.

This book describes advanced molecular markers, high‐throughput genotyping platforms, whole‐genome resequencing (WGR), QTL mapping using advanced mapping populations, analytical pipelines for GBS analysis, advances in GWAS and GS, and the application of GBS, GWAS, and GS in different crop plants. The different marker types, including traditional and advanced markers used in plant genotyping, have been presented in great detail. For genetic mapping and marker‐assisted selection, rapid isolation of quality DNA is mandatory to accelerate the whole process. DNA extraction directly from seeds without germination can save time and effort. Several modified and crop‐specific nondestructive seed DNA extraction protocols have been compiled and presented. Many advanced genotyping platforms are now available, which cater to specific research purposes because they differ in reaction chemistry, cost, method of signal detection, and flexibility of the protocols. Such advanced platforms along with their principles have been discussed. The WGR methodology and available resources have been covered in detail. WGR has emerged as a powerful method to identify genetic variation among individuals. Recent advancements in WGR include pool‐Seq, which provides an alternative to individual sequencing and a cost‐effective method for GWAS. Compared to biparental populations, multi‐parental populations provide an opportunity to interrogate multiple alleles and offer an increased level of recombination and mapping resolution of QTLs. The use of such improved populations in the era of high‐throughput genotyping has been presented in one of the chapters. A dedicated section on the basic principle of GWAS, the efficiency of different markers, candidate gene identification, meta‐GWAS, and the statistical methods involved in GWAS analysis has been included. A focused section on GS gives an account of the basic concept, advances, applicability, and challenges of GS. Similarly, a separate chapter discusses the analytical pipelines used for GBS data. The application of technologies such as GBS, GWAS, and GS in different crop categories like cereals, pulses, oilseeds, and commercial crops has been discussed in different chapters.

Here, we have tried to compile basic aspects and recent advances in GBS, GWAS, and GS in plant breeding. We believe that the book will help researchers and scientists to understand and plan future experiments. This book will enable plant scientists to explore GBS applications more efficiently for basic research as well as applied aspects in various crop improvement projects.

Editors: Dr. Humira Sonah, Dr. Vinod Goyal, Dr. S. M. Shivaraj, Dr. R. K. Deshmukh

1 Molecular Marker Techniques and Recent Advancements

Dharminder Bhatia and Gagandeep Singh Bajwa

Department of Plant Breeding and Genetics, Punjab Agricultural University, Ludhiana, Punjab, India

1.1 Introduction

Plant selection and systematic breeding efforts led to the development of present‐day improved cultivars of crop plants. From a historical perspective, increased crop yield is the result of genetic improvement (Fehr 1984). Markers play an important role in the selection of traits of interest. Markers can be morphological, biochemical, or molecular in nature. Morphological markers are visual phenotypic characters such as growth habit of the plant, seed shape, seed color, flower color, etc. Biochemical markers are isozyme‐based markers characterized by variation in the molecular form of an enzyme, which shows as a difference in mobility on an electrophoresis gel. Very few morphological and biochemical markers are available in plants, and they are influenced by developmental stage and environmental factors. Since a large number of economically important traits are quantitative in nature and are affected by both genetic and environmental factors, the selection of traits based on morphological and biochemical markers may not be very reliable. The subsequent discovery of abundantly available DNA‐based markers made possible the selection of almost any trait of interest. DNA‐based markers are not affected by the environment. Besides, these markers are highly reproducible across labs and show high polymorphism to distinguish between two genetically different individuals or species.

In the last four decades, DNA‐based molecular marker technology has witnessed several advances from low throughput hybridization‐based markers to high‐throughput sequencing‐based markers. These advances have been possible due to critical discoveries such as polymerase chain reaction (PCR) (Mullis et al. 1986), Sanger sequencing method (Sanger et al. 1977), automation of Sanger sequencing (Shendure et al. 2011), next‐generation sequencing (NGS) technologies (Mardis 2008), and development of bioinformatics tools. This chapter will briefly discuss different types of molecular markers while particularly focusing on recent developments in molecular marker technologies. These developments have expedited the mapping and cloning of several loci governing important traits, precise trait selection, and transfer into elite germplasm.

1.2 What is a Molecular Marker?

A DNA or molecular marker is a fragment of DNA that is associated with a particular trait in an individual. These molecular markers aid in determining the location of genes that control key traits.

Generally, molecular markers do not represent the gene of interest but act as “flags” or “signs.” Similar to genes, all the molecular markers occupy a specific position within the chromosomes. Molecular markers located close to genes (i.e. tightly linked) are referred to as “gene tags.”

DNA‐based molecular markers are the most widely used markers predominantly due to their abundance. They arise from different classes of DNA mutations such as substitution mutations (point mutations), rearrangements (insertions or deletions), or errors in replication of tandemly repeated DNA. These markers are selectively neutral because they are usually located in noncoding regions of DNA. Unlike morphological and biochemical markers, DNA markers are practically unlimited in number and are not affected by environmental factors and/or the developmental stage of the plant.

DNA markers show genetic differences that can be visualized by gel electrophoresis followed by staining with ethidium bromide, or by hybridization with radioactive or colorimetric probes. Markers that can identify the difference between two individuals are referred to as polymorphic markers, whereas those that do not distinguish the individuals are called monomorphic markers. Based on how polymorphic markers discriminate between individuals, they are described as codominant or dominant. Codominant markers indicate differences in size, whereas dominant markers reveal differences based on their presence or absence. The different forms of a DNA marker, seen as bands of different size on gels, are known as marker “alleles.” A dominant marker has only two alleles, whereas a codominant marker may have many alleles.
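
The distinction between codominant and dominant scoring can be illustrated with a small sketch. This is only an illustration under stated assumptions: the band sizes are hypothetical, each gel lane is represented as a set of band sizes in bp, and the function names are invented for this example.

```python
def score_codominant(bands, allele_a, allele_b):
    """Score a codominant marker from the set of band sizes (bp) in one lane."""
    has_a, has_b = allele_a in bands, allele_b in bands
    if has_a and has_b:
        return "AB"  # heterozygote: both allele bands are visible
    if has_a:
        return "AA"
    if has_b:
        return "BB"
    return None      # no band at all: treated as missing data


def score_dominant(bands, band_size):
    """Score a dominant marker: only presence (1) or absence (0) is seen."""
    return 1 if band_size in bands else 0


def is_polymorphic(scores):
    """A marker is polymorphic if the scored individuals do not all agree."""
    observed = {s for s in scores if s is not None}
    return len(observed) > 1


# Hypothetical lanes for three individuals at one codominant locus.
lanes = [{150}, {170}, {150, 170}]
scores = [score_codominant(lane, 150, 170) for lane in lanes]
print(scores, is_polymorphic(scores))
```

Note that the dominant scorer cannot separate a heterozygote from a homozygote carrying the band, which is why dominant markers are less informative in segregating populations.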

1.3 Classes of Molecular Markers

Based on the method of their detection, DNA markers are broadly classified into three groups: (i) hybridization‐based, (ii) PCR‐based, and (iii) DNA sequence‐based molecular markers. Molecular markers have been discussed earlier in several reviews (Collard et al. 2005; Semagn et al. 2006; Gupta and Rustgi 2004) and book chapters (Mir et al. 2013; Singh and Singh 2015), which readers can also consult for more details. However, a brief description of each of these markers has been presented below.

1.3.1 Hybridization‐based Markers

1.3.1.1 Restriction Fragment Length Polymorphism (RFLP)

These were the first molecular markers, used by Grodzicker et al. (1975) in adenovirus and by Botstein et al. (1980) in human genome mapping. They were first used in plants by Helentjaris et al. (1986). In this type of marker, polymorphism is detected by cutting DNA into fragments with restriction enzymes, followed by hybridization with radioactively labeled DNA probes, which are single or low copy DNA fragments, and visualization by autoradiography. DNA probes could be genomic clones, cDNA clones, or even cloned genes. The RFLP markers show co‐dominance and are highly reliable in linkage analysis and breeding (Semagn et al. 2006). However, this technique requires a large quantity of DNA and is labor‐intensive, relatively expensive, and hazardous. RFLP shows polymorphism between two different species if they differ due to point mutations, insertions/deletions, inversions, translocations, or duplications.
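
How a point mutation produces a fragment length difference can be sketched with an in silico digest. The example below is a minimal illustration, not a laboratory protocol: the two allele sequences are hypothetical, the `digest` helper is invented for this example, and the EcoRI recognition site (G^AATTC) is used simply because it is a familiar six-base cutter.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting `seq` at every occurrence of `site`.

    cut_offset is the position of the cut within the site (1 for G^AATTC).
    Only one strand of a linear molecule is modelled.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Two hypothetical alleles; allele_2 carries a point mutation (A -> G)
# inside the second recognition site, which abolishes that cut.
allele_1 = "A" * 20 + "GAATTC" + "C" * 30 + "GAATTC" + "T" * 10
allele_2 = "A" * 20 + "GAATTC" + "C" * 30 + "GAGTTC" + "T" * 10

print(digest(allele_1))  # three fragments
print(digest(allele_2))  # two fragments: the lost site merges two of them
```

A probe hybridizing to the region around the second site would therefore detect bands of different length in the two alleles.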

1.3.1.2 Diversity Array Technology (DArT™)

This is a high‐throughput DNA polymorphism analysis method which combines microarray and restriction‐based PCR methods. It is similar to AFLP, except that hybridization is used for the detection of polymorphism. It can provide comprehensive genome coverage even in organisms for which no genome sequence information is available (Jaccoud et al. 2001). Diversity array technology (DArT) is a solid‐state open platform method for analyzing DNA polymorphism. The DArT procedure includes (i) generating a diversity panel and (ii) genotyping using the diversity panel. The diversity panel is generated using a set of lines representing the breadth of variability in the germplasm (~10 lines). An equal quantity of DNA from each representative line is pooled, followed by restriction with two to three restriction endonucleases (REs) and ligation of RE‐specific adaptors. The DNA fragments are then amplified using adaptor‐complementary primers. The representation fragments are ligated into a vector and transformed into Escherichia coli cells. The transformed cells carrying recombinant DNA are selected, and the inserts are amplified using M13 forward and reverse primers. The amplified DNA is isolated and purified. The purified DNA is spotted onto polylysine‐coated glass slides to generate a diversity array.

For genotyping, the representation fragments of the target genotypes are prepared in the same way as for the diversity panel. The DNA fragments are column purified and fluorescently labeled with one of two dyes (Cy3 or Cy5). The labeled DNA fragments are hybridized onto the diversity array. Two representations, one labeled with Cy3 and another with Cy5, can be hybridized simultaneously, and hybridization signal intensities are measured for each spot. DArT thus detects DNA polymorphism at several hundred genomic loci in a single array without relying on sequence information.

1.3.2 Polymerase Chain Reaction (PCR)‐based Markers

1.3.2.1 Simple‐Sequence Repeats (SSRs)

Simple‐sequence repeats (SSRs) (Litt and Luty 1989) are also known as microsatellites, short tandem repeats (STRs), or simple sequence length polymorphisms (SSLPs). These are widely used markers and are also referred to as the mother of all markers. They are tandem repeats of a motif that is generally one to eight nucleotides long. They are found dispersed throughout the genome and are hypervariable. The repeat regions are flanked by unique sequences that are highly conserved. The flanking unique sequences are used to design complementary primers, which can be assayed with PCR. SSRs are highly polymorphic and codominant markers. They show polymorphism as a result of the variable number of repeat units. Before the era of genome sequencing, it was difficult to develop SSRs due to the extensive cost and labor involved in the identification of repeat regions and flanking unique sequences. However, with the availability of genome sequences of several organisms, the development of SSRs has become very easy; it involves in silico identification of STRs, design of primers from the flanking unique sequences, and validation through experimentation. SSR markers have found immense application in population genetic analysis, gene mapping, and cloning due to their abundance in the genome, high polymorphism, and very high reproducibility across labs. SSR‐based linkage maps have been developed in several important crop plants such as rice (Temnykh et al. 2000; McCouch et al. 2002; Orjuela et al. 2010), wheat (Roder et al. 1998), maize (Sharopova et al. 2002), potato (Milbourne et al. 1998), etc.
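
The in silico identification of repeat regions mentioned above can be sketched with a regular expression. This is a minimal illustration, not any published SSR‐mining tool: the function name, the thresholds (motifs of 2 to 6 bp, at least four copies), and the test sequence are all assumptions made for the example.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, motif, copies) for perfect tandem repeats in `seq`.

    The lazy quantifier makes the regex prefer the shortest motif, and the
    backreference \\1 requires the same motif to follow at least
    (min_repeats - 1) more times.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Hypothetical sequence containing a (CA)6 and an (AGT)4 repeat.
sequence = "GGATCC" + "CA" * 6 + "TTGACG" + "AGT" * 4 + "CCTA"
print(find_ssrs(sequence))
```

The reported start positions (0‐based) mark where the flanking unique sequence ends, which is the region from which primers would then be designed.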

1.3.2.2 Sequence‐Tagged Sites (STSs)

Sequence‐tagged sites (STSs) were first developed for physical mapping of the human genome by Olsen et al. (1989). An STS is a short unique sequence developed from a polymorphic RFLP probe or AFLP fragment that is linked to a desirable trait. The RFLP probes or AFLP fragments showing polymorphism are end‐sequenced, and primers are designed to specifically amplify these fragments. STS markers are co‐dominant and highly reproducible. For example, STS markers have been developed for RFLP markers linked with the bacterial blight resistance genes xa5, xa13, and Xa21 (Huang et al. 1997). One major limitation of this type of marker is its lower polymorphism compared with the corresponding RFLP probe.

1.3.2.3 Randomly Amplified Polymorphic DNAs (RAPDs)

Williams et al. (1990) first developed these markers to amplify DNA without prior sequence information. In this type of marker, arbitrary decamer sequences are used as primers at low annealing temperatures for DNA amplification. These markers are referred to as dominant markers because the polymorphism is determined based on the presence or absence of a particular amplified fragment. Polymorphism may also appear as varying brightness of bands at a particular locus due to copy number differences. These markers have been used for constructing linkage maps in several species (Hunt 1997; Laucou et al. 1998) and also for tagging genes of economic importance. However, due to their dominant nature, they may not be appropriate for genetic mapping and marker‐assisted selection (MAS). One major limitation of these markers is the lack of repeatability in certain cases. Variations of RAPD include AP‐PCR (arbitrarily primed PCR) and DAF (DNA amplification fingerprinting) (Table 1.1).

Table 1.1 Details of the other important molecular markers.

| Marker | Description |
| --- | --- |
| Variable number tandem repeat (VNTR) or minisatellites | A short DNA sequence (10–100 bp) is present as tandem repeats and is a highly variable copy number |
| DNA amplification fingerprinting (DAF) | A variation of RAPD, where 4–5 bp single and arbitrary primer is used to detect polymorphism |
| Arbitrary‐primed PCR (AP‐PCR) | A variation of RAPD, where 18–32 bp long single and arbitrary primer is used to detect polymorphism |
| Inter‐simple sequence repeat (ISSR) | Primers are designed based on the repeat region of microsatellites. These primers are used to amplify the region between two microsatellites. The stretches of unique DNA in between or flanking the SSRs are amplified. A single SSR‐based primer is used to prime PCR |
| Selective amplification of microsatellite polymorphic loci (SAMPL) | A modification of ISSR, where SSR‐based primer is used along with AFLP primer. The template is identical to the AFLP template and the rare cutter primer is replaced by SSR‐based primer |
| Cleaved amplified polymorphic sequences (CAPS) | These markers are also called PCR‐RFLP, where amplified PCR product is digested with endonucleases to reveal polymorphism. These are used when PCR product does not show polymorphism and restriction enzyme site present in amplified PCR product may detect polymorphism |
| Derived cleaved amplified polymorphic sequences (dCAPS) | A variation of CAPS, where a primer containing one or more mismatches to template DNA is used to create a restriction enzyme recognition site in one allele but not in another due to the presence of SNP. Thus, obtained PCR product is subjected to restriction enzyme digestion to find the presence or absence of the SNP |
| Single‐strand conformational polymorphism (SSCP) | DNA fragments of size ranging from 200 to 800 bp were amplified by PCR using specific primers (20–25 bp), followed by gel‐electrophoresis of single‐strand DNA to detect nucleotide sequence variation. The method is based on a principle that the secondary structure of single‐strand DNA molecule changes significantly if it harbors mutation. This method detects nucleotide variation without sequencing a DNA sample |
| Denaturing/temperature gradient gel electrophoresis (DGGE, TGGE) | These methods reveal polymorphism due to differential movement of the same genomic double‐stranded region with different base‐pair composition. As an example, the AT‐rich region would have a lower melting temperature than the GC‐rich region |
| Target region amplification polymorphism (TRAP) | This method employs primers designed from the EST database for detecting polymorphism around a selected candidate gene. This includes two primers of 18 bp, one of which is designed from targeted EST and the other is an arbitrary primer |

1.3.2.4 Sequence Characterized Amplified Regions (SCARs)

These markers overcome the limitation of RAPDs. In this case, the RAPD fragments that are linked to a gene of interest are cloned and sequenced. Based on the terminal sequences, longer primers (20 mer) are designed. These SCAR primers more specifically amplify a particular locus. These are similar to STS markers in design and application. The presence or absence of the band indicates variation in sequences. The SCAR markers thus are dominant in nature. These, however, can be converted to codominant markers in certain cases by digesting the amplified fragment with tetranucleotide recognizing restriction enzymes. There are several examples where the RAPD markers linked to the gene of importance have been converted to SCAR markers (Joshi et al. 1999; Liu et al. 1999; Kasai et al. 2000; Akkurt et al. 2007; Chao et al. 2018).

1.3.2.5 Amplified Fragment Length Polymorphism (AFLP)

This marker technique was developed by Vos et al. (1995) and is patented by Keygene (www.keygene.com). In this technique, DNA is cut into fragments by a combination of restriction enzymes, a frequent (four‐base) cutter and a rare (six‐base) cutter, which generate restriction overhangs on both sides of the fragments. This is followed by ligation of short double‐stranded oligonucleotide adapters to the respective restriction overhangs. The oligonucleotide adapters are designed in such a way that the original restriction sites are not reinstated, and they also provide the PCR amplification sites. The fragments are PCR amplified and visualized on agarose gel. This method produces many restriction fragments, enabling the detection of polymorphism. The number of amplified DNA fragments can be controlled by selecting a different number or composition of bases in the adapters. The stringent reaction conditions used for primer annealing make this technique more reliable. This method is a combination of both RFLP and PCR techniques and is extremely useful in the detection of polymorphism between closely related genotypes. Like RAPD, AFLP is a dominant marker and is not preferred for genetic mapping studies and MAS. AFLP maps have been constructed in several species and integrated into already existing RFLP maps, e.g. tomato (Haanstra et al. 1999), rice (Cho et al. 1997), and wheat (Lotti et al. 2000).
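
The way the number of selective bases controls the number of amplified fragments can be approximated with simple arithmetic. The sketch below rests on a simplifying assumption that is not stated in the text: the four bases are equally frequent, so each additional selective base retains roughly one fragment in four. The function name and the fragment count are hypothetical.

```python
def expected_fragments(total_fragments, selective_bases):
    """Expected number of fragments amplified after selective amplification.

    Assumes equal base frequencies, so each selective base keeps 1/4 of the
    fragments; `selective_bases` is the total over both ends of a fragment.
    """
    return total_fragments * 0.25 ** selective_bases


# Hypothetical digest yielding 100,000 adapter-ligated fragments.
for n in (0, 2, 4, 6):
    print(n, expected_fragments(100_000, n))
```

Under these assumptions, six selective bases in total reduce 100,000 fragments to about 24, a number that can be resolved on a gel.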

1.3.2.6 Expressed Sequence Tags (ESTs)

These markers are developed by end sequencing (generally 200–300 bp) of random cDNA clones. The sequences thus obtained are referred to as expressed sequence tags (ESTs). A large number of ESTs have been generated in several crop plants and are available in the EST database at NCBI (https://www.ncbi.nlm.nih.gov/dbEST/). These markers were originally developed to identify gene transcripts and have played an important role in the identification of several genes and the development of markers such as RFLP, SSR, SNPs, CAPS, etc. (Semagn et al. 2006). However, EST‐based SSRs show less polymorphism than genomic DNA‐based SSRs. Since EST markers are from expressed sequence regions, they are highly conserved among species and can be used for synteny mapping. Most of these could also be functional genes. A large number of EST markers have been used in rice for developing a high‐density linkage map (Harushima et al. 1998) and for chromosome bin mapping in wheat using deletion stocks (Qi et al. 2003). In addition to these, several other molecular marker variants have been developed. The description of those markers is presented in Table 1.1.

1.4 Sequencing‐based Markers

1.4.1 Single‐Nucleotide Polymorphisms (SNPs)

Single‐nucleotide polymorphisms (SNPs) are the most abundant markers and result from single‐base pair variations. They are distributed throughout the whole genome and can tag almost any gene or locus (Brookes 1999). However, the distribution of SNPs varies among species, with 1 SNP per 60–120 bp in maize (Ching et al. 2002) and 1 SNP per 1000 bp in humans (Sachidanandam et al. 2001). SNPs are more prevalent in the noncoding region. In the coding region, SNPs can be synonymous or nonsynonymous. In synonymous SNPs, there is no change in the amino acid and generally no phenotypic difference, although phenotypic differences can still arise from modified mRNA splicing (Richard and Beckman 1995). In nonsynonymous SNPs, the change in amino acid can result in phenotypic differences. SNPs are mostly bi‐allelic and cause polymorphism due to nucleotide base substitution. Two types of nucleotide base substitution result in SNPs. A transition substitution occurs between purines (A, G) or between pyrimidines (C, T). This type of substitution constitutes two‐thirds of all SNPs. A transversion substitution occurs between a purine and a pyrimidine. SNPs can be detected by aligning the same genomic region from two different individuals or species. SNPs have only two alleles compared to the typically multiallelic SSLPs; however, this disadvantage can be compensated for by the high density of SNPs.
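
The transition/transversion distinction described above is easy to express in code. The following is a minimal sketch; the function names and the example substitutions are assumptions made for illustration.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions in a list of (ref, alt) pairs."""
    kinds = [classify_substitution(r, a) for r, a in substitutions]
    n_tv = kinds.count("transversion")
    return kinds.count("transition") / n_tv if n_tv else float("inf")


# Hypothetical SNPs: two transitions and one transversion.
snps = [("A", "G"), ("C", "T"), ("A", "C")]
print([classify_substitution(r, a) for r, a in snps], ts_tv_ratio(snps))
```

A ratio of 2.0, as in this toy example, corresponds to transitions making up two‐thirds of the substitutions.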

1.4.2 Identification of SNP in a Pregenomic Era

Initially, identification of SNP markers was laborious and expensive and involved allele‐specific sequencing (Ganal et al. 2009). This included sequencing of unigene‐derived amplicons from two or more lines using Sanger’s method. In one experiment, about 350 bp of the RFLP clone A‐519 was end sequenced in soybean, and flanking amplification primers were designed (Coryell et al. 1999). The primers were used to screen for allele diversity by PCR in ten genotypes, and the amplicons were sequenced, followed by sequence comparison to identify SNPs. SNPs were also identified by mining the large number of EST sequences in EST databases, which were generated through improved sequencing technologies (Soleimani et al. 2003). These SNPs were further validated using PCR (Batley et al. 2003). These approaches allowed the identification of mainly gene‐based SNPs, but their frequency is generally low. Additionally, SNPs located in low‐copy noncoding regions and intergenic spaces could not be identified.

Several assays have been developed for genotyping based on identified SNPs, which include allele‐specific hybridization, primer extension, oligonucleotide ligation, and invasive cleavage (Sobrino et al. 2005). Besides, DNA chips, allele‐specific PCR, and primer extension were also attractive options since these are suitable for automation and can be used for the development of dense genetic maps. Allele‐specific hybridization was used for the identification of polymorphism in 570 genotypes of soybean (Coryell et al. 1999).
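
The sequence comparison step, in which amplicons from several genotypes are compared to find SNPs, can be sketched as follows. This is an illustration only: it assumes the sequences are already aligned and of equal length, and the line names and sequences are hypothetical.

```python
def find_snps(aligned):
    """Return [(position, alleles)] for columns where lines differ.

    `aligned` maps a line name to its aligned sequence. Positions are
    1-based; gaps ('-') and ambiguous bases ('N') are ignored.
    """
    sequences = [s.upper() for s in aligned.values()]
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for index, column in enumerate(zip(*sequences)):
        alleles = sorted(set(column) - {"-", "N"})
        if len(alleles) > 1:
            snps.append((index + 1, alleles))
    return snps


# Hypothetical amplicon sequences from three genotypes.
amplicons = {
    "line1": "ACGTACGTAC",
    "line2": "ACGTACGTAC",
    "line3": "ACGAACGTGC",
}
print(find_snps(amplicons))
```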

1.5 Recent Advances in Molecular Marker Technologies

The improvement of Sanger sequencing technology in the 1990s combined with the beginning of EST and genome sequencing projects in model plants led to the spurt in the identification of variation at the single‐base resolution (Wang et al. 1998). From 2005 onward, the emergence of NGS platforms such as Roche 454, Illumina HiSeq2500, ABI 5500xl SOLiD, Ion Torrent, PacBio RS, Oxford Nanopore, and advances in bioinformatics tools simplified the process of identification of genome‐wide SNPs and changed the face of molecular marker technology. NGS‐based genotyping platforms such as genotyping‐by‐sequencing (GBS), whole‐genome resequencing (WGR), and high‐density SNP arrays helped to type thousands of SNPs in a single reaction in hundreds of individuals.

1.5.1 Genotyping‐by‐Sequencing (GBS)

GBS is an NGS‐based reduced representation sequencing technique for the identification of genome‐wide SNPs and genotyping of large populations (Bhatia et al. 2013). GBS is a one‐step approach for the identification and utilization of markers in a single reaction. It is a complexity reduction procedure in which a combination of restriction enzymes is used to separate low copy sequences from high copy repetitive regions. In general, GBS involves the sequencing, on an NGS platform, of fragments generated through restriction digestion of the genome. In this process, the DNA of each individual in the population is digested with an RE, followed by ligation of RE‐specific adaptors containing genotype‐specific barcode sequences and binding sites for PCR and sequencing primers (Figure 1.1). The fragments thus generated are PCR amplified, and an equal volume of PCR product from different individuals is pooled in a tube. The fragments in the pool can be selected based on their size and sequenced on the NGS platform. The choice of restriction enzymes depends upon the complexity and size of the genome. Presently, different versions of GBS are available, which include RAD‐seq (restriction associated DNA sequencing), ddRAD‐seq (double‐digest restriction associated sequencing), SLAF‐seq (specific‐locus amplified fragment sequencing), Rest‐seq (restriction DNA sequencing), and Skim GBS (skim‐based GBS) (Bhatia 2020). These versions differ with respect to fragment size selection, the extent of complexity reduction, and genome coverage. Since GBS is a population‐dependent genotyping method, low‐depth sequencing is adopted to make it cost‐effective, which causes a high rate of missing data. The low‐depth sequencing also makes it an ineffective genotyping approach in heterozygous populations. GBS has low genome coverage due to reduced representation sequencing.

Figure 1.1 An example of GBS and GBS data analysis workflow for identification of SNP markers.
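
Because individuals are pooled before sequencing, the first analysis step is to assign each read back to its sample using the genotype‐specific barcode. The sketch below illustrates the idea only and is not the implementation of any GBS pipeline: the barcodes, sample names, and reads are hypothetical, and the restriction site remnant `TGCAG` (as expected after a PstI digest) is an assumption for the example.

```python
def demultiplex(reads, barcode_to_sample, site_remnant="TGCAG"):
    """Assign reads to samples by their leading barcode.

    A read is accepted only if the barcode is immediately followed by the
    expected restriction site remnant; the barcode is then trimmed off.
    """
    by_sample = {sample: [] for sample in barcode_to_sample.values()}
    unassigned = []
    # Try longer barcodes first so that a short barcode which is a prefix
    # of a longer one cannot capture its reads.
    barcodes = sorted(barcode_to_sample, key=len, reverse=True)
    for read in reads:
        for barcode in barcodes:
            if read.startswith(barcode + site_remnant):
                by_sample[barcode_to_sample[barcode]].append(read[len(barcode):])
                break
        else:
            unassigned.append(read)
    return by_sample, unassigned


# Hypothetical barcodes (variable length) and reads.
barcodes = {"ACGT": "S1", "TTGCA": "S2"}
reads = ["ACGTTGCAGAAAC", "TTGCATGCAGCCC", "GGGGTGCAGTTT"]
assigned, leftover = demultiplex(reads, barcodes)
print(assigned, leftover)
```

Real pipelines additionally tolerate sequencing errors in the barcode and apply quality filters, which this sketch omits.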

GBS is being widely used to capture SNPs and other marker variations by NGS. GBS has overtaken conventional genotyping procedures involving traditional markers such as RAPD, AFLP, SSR, and many others in terms of the time, labor, and cost involved. As an example, GBS can generate data for thousands of markers in a large population in a week, which can be analyzed in a month (Bhatia et al. 2018). The approach has been utilized in the mapping of several economically important traits in a number of crop plants (Poland and Rife 2012). Most of the developing countries have in‐house computational facilities that are being used for GBS analysis. A few online servers are also available where GBS analysis can be done using in‐built pipelines, such as cyverse (www.cyverse.org); however, these are unable to analyze large datasets. Further, the speed of analysis depends upon the internet speed. Alignment of NGS‐based reads and calling of SNPs and Indels are the two major steps in GBS analysis, for which several pipelines are publicly available, such as Stacks, IGST, GB‐eaSY, TASSEL‐GBS, FAST‐GBS, UNEAK, etc. (Wickland et al. 2017).

Another important pipeline widely used for NGS data analysis is the dDocent pipeline (www.dDocent.com), which is a simple bash wrapper to perform quality analysis, assembly, mapping, and SNP calling from almost any kind of RAD sequencing data (Puritz et al. 2014). However, most of these pipelines are hard to use for a student with little bioinformatics background. Most of these pipelines vary with respect to the complexity of the genome they can handle and the computational space required. Besides, there are several bioinformatics tools such as BWA, Bowtie2, SAMtools, GATK, and BCFtools, as well as a set of Perl utility scripts (Kagale et al. 2016), that can be used for GBS data analysis. However, knowledge of the installation and usage of these tools is needed to utilize them properly in data analysis. With the advancements in NGS approaches, GBS has become a widely used approach in plant breeding and genetics, particularly for understanding complex quantitative traits.

DArT‐seq GBS (https://www.diversityarrays.com/technology‐and‐resources/dartseq/) to some extent overcomes the limitation of missing data points. The technique is an extension of traditional DArT technology in which DArT representations are sequenced on the NGS platform. Fragment sequencing enables a dramatic increase in the number of genomic fragments analyzed and in the number of reported markers, thus making it a more cost‐effective technology than the initial DArT method.
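
After SNP calling, GBS datasets are usually filtered because low‐depth sequencing leaves many missing calls. The sketch below shows one simple way such a filter could look; it does not reproduce the behavior of any of the pipelines named above. The genotype coding (0, 1, 2 copies of the alternate allele, `None` for missing), the thresholds, and the marker names are assumptions for this example.

```python
def call_rate(calls):
    """Fraction of individuals with a non-missing genotype call."""
    return sum(c is not None for c in calls) / len(calls)


def minor_allele_freq(calls):
    """Minor allele frequency from calls coded 0/1/2 (None = missing)."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1 - alt_freq)


def filter_markers(genotypes, min_call_rate=0.8, min_maf=0.05):
    """Keep markers that pass both the call-rate and the MAF threshold."""
    return [
        marker
        for marker, calls in genotypes.items()
        if call_rate(calls) >= min_call_rate
        and minor_allele_freq(calls) >= min_maf
    ]


# Hypothetical calls for three SNPs in five individuals.
genotypes = {
    "snp1": [0, 2, None, None, None],  # too many missing calls
    "snp2": [0, 0, 0, 0, 0],           # monomorphic
    "snp3": [0, 1, 2, 2, None],        # passes both filters
}
print(filter_markers(genotypes))
```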

1.5.2 Whole‐Genome Resequencing (WGR)

WGR with high coverage and depth overcomes the limitations of GBS due to missing data points and heterozygous calls. In general, WGR involves the sequencing of enough DNA fragments (>5×–20×) to cover the whole genome of an organism. Due to sequencing cost, the technique is suitable for crop plants with smaller genome sizes, such as rice. In such cases, GBS can be replaced by resequencing of a larger population at 5–6× depth. However, WGR of a few samples can be done at a much higher read depth of 10–20×, as in the case of the BSA‐seq approach (Nguyen et al. 2019). One of the important BSA‐seq‐based approaches is quantitative trait loci (QTL)‐seq, developed by Takagi et al. (2015) in rice. This technique has since been widely used in several crop plants. Takagi et al. (2015) developed a pipeline for analysis of the whole genome sequence of bulks and identification of causative variants. WGR has been used in several studies for identification of genome‐wide SNPs, genotyping of mapping populations for construction of high‐density linkage maps and QTL mapping, linkage and genome‐wide association studies (GWASs), reference genome improvement, and genomic selection (Poland and Rife 2012; Bhatia et al. 2013; Chung et al. 2017; Nguyen et al. 2019).
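
The effect of sequencing depth can be made concrete with two small calculations. Both are simplified sketches under stated assumptions: mean depth is taken as reads × read length / genome size, and at a heterozygous site each read is assumed to sample either allele with probability 0.5, with no sequencing error. The read length and genome size used below are illustrative values, not figures from the text.

```python
import math


def mean_depth(n_reads, read_length, genome_size):
    """Mean sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def reads_needed(depth, read_length, genome_size):
    """Number of reads required to reach a target mean depth."""
    return math.ceil(depth * genome_size / read_length)


def prob_het_miscalled(depth):
    """Probability that all reads at a heterozygous site show one allele.

    With `depth` reads, each allele is missed with probability 0.5**depth,
    and either of the two alleles can be the one that is missed.
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return 2 * 0.5 ** depth


# Illustrative numbers: 150 bp reads and a genome of about 400 Mb.
print(reads_needed(5, 150, 400_000_000))
for d in (1, 2, 5, 10):
    print(d, prob_het_miscalled(d))
```

Under these assumptions a heterozygote covered by only two reads is miscalled as a homozygote half of the time, whereas at 10× the probability falls below 0.2%, which is why higher depth helps in heterozygous material.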

1.5.3 SNP Arrays

Along with GBS, high‐density DNA array‐based SNP chips or SNP arrays have become a widely used SNP detection platform for highly multiplexed genotyping. SNP arrays work by hybridization of DNA fragments with allele‐specific oligonucleotide probes (SNP probes) and fluorescence‐based detection of signals. In general, SNP arrays can be roughly categorized into two types based on the SNP detection method: (i) nonenzymatic differential hybridization, including allele‐specific hybridization, and (ii) enzymatic reactions, including primer extension and mini‐sequencing (Ding and Jin 2009). For making SNP arrays, the first step is the identification of genome‐wide SNPs by sequencing (preferably WGR) of a large diverse panel. SNP arrays may include SNPs from coding (genic) regions only and/or genome‐wide SNPs from noncoding regions. The SNPs are validated in silico with several custom tools, and a final filtered set of SNPs is identified. The oligonucleotide probes containing the SNP alleles are designed and bound onto a solid glass plate surface. SNP chips can be custom designed commercially on two widely used platforms: Affymetrix (www.affymetrics.com), as Axiom SNP chips (Affymetrix/Thermo Fisher Axiom®), or Illumina (https://www.illumina.com/science/technology/microarray.html), as the Illumina Infinium assay (Illumina Infinium®). The Affymetrix SNP array relies on differential hybridization due to the different melting temperatures of matched and mismatched SNP probes binding to the target DNA sequence. On the other hand, the Illumina Infinium assay uses Illumina BeadArray technology, which relies on primer extension to distinguish the two SNP alleles. The Affymetrix SNP array uses 25‐mers for SNP calling, while the Illumina BeadArray uses 50‐mers for target capture. In rice, a high‐resolution 44K Affymetrix array, a 50K Infinium array, and a 700K high‐density array are available for SNP genotyping (McCouch et al. 2010; Tung et al. 2010; Chen et al. 2013; McCouch et al. 2015). Additionally, high‐density SNP arrays have been developed for other crop plants such as maize (Ganal et al. 2011) and sunflower (Bachlava et al. 2012) as well as domestic animal species, including cattle (Gibbs et al. 2009; Matukumalli et al. 2009) and pig (Ramos et al. 2009). One major advantage of SNP arrays is the reproducibility of data points, where GBS does have some shortcomings. However, the disadvantages are lower polymorphism compared to GBS and WGR and the detection of only those alleles present on the array (Table 1.2).
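
The fluorescence‐based readout of a bi‐allelic SNP probe pair can be turned into a genotype with a very simple rule. The sketch below is a deliberately naive illustration and not the calling algorithm of any commercial platform, which typically clusters signals across many samples. The thresholds, the signal scale, and the function name are assumptions for this example.

```python
def call_genotype(signal_a, signal_b, min_total=0.2, low=0.25, high=0.75):
    """Call AA, AB, or BB from the two allele-specific signal intensities.

    Returns None (no call) when the combined signal is too weak. Otherwise
    the fraction of signal coming from allele B decides the genotype.
    """
    total = signal_a + signal_b
    if total < min_total:
        return None
    fraction_b = signal_b / total
    if fraction_b <= low:
        return "AA"
    if fraction_b >= high:
        return "BB"
    return "AB"


# Hypothetical normalized intensities for four samples at one SNP.
samples = [(1.0, 0.05), (0.5, 0.5), (0.02, 0.9), (0.05, 0.05)]
print([call_genotype(a, b) for a, b in samples])
```

Because only the two probed alleles produce a signal, any additional allele at the locus goes undetected, which reflects the limitation noted above.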

Table 1.2 Comparison between different marker techniques commonly used in plant research.

| Parameter | SSR | GBS | WGR | SNP array | KASP™ |
| --- | --- | --- | --- | --- | --- |
| DNA quality | Moderate | High | High | High | High |
| PCR‐based | Yes | Yes | No | No | No |
| Allele detection | High | High | High | Low | Low |
| Polymorphism | High | High | High | Low | Low |
| Ease to use | Easy | Not easy | Not easy | Easy | Easy |
| Reproducibility | High | Low | High | High | High |
| Cost | Moderate | Low to moderate | High | High | Moderate |
| Cost for analysis | High | High | High | Low | Low |
| Suitability for different approaches | | | | | |
| Genetic diversity analysis | High | Moderate | High (cost concerns) | High | High |
| Bi‐parental QTL mapping | High | High | High | High | High |
| Genome wide association analysis | Moderate | High | High | High | Low |
| Genomic selection | Low | Moderate | High (cost concerns) | High | Low |

1.5.4 Kompetitive Allele‐Specific PCR (KASP™)

KASP™ is a trademark technology of KBiosciences (http://www.kbioscience.co.uk/) or LGC genomics (http://www.lgcgenomics.com