Bioinformatics in Aquaculture (E-Book)

Description

Bioinformatics derives knowledge from computer analysis of biological data. In particular, genomic and transcriptomic datasets are processed, analysed and, whenever possible, associated with experimental results from various sources, to draw structural, organizational, and functional information relevant to biology. Research in bioinformatics includes method development for storage, retrieval, and analysis of the data.

Bioinformatics in Aquaculture provides up-to-date reviews of next-generation sequencing technologies, their applications in aquaculture, and the principles and methodologies for analysing large genomic and transcriptomic datasets using bioinformatic methods, algorithms, and databases. The book is unique in providing guidance on the best software packages for various analyses, with detailed examples of using bioinformatic software and command lines in the context of real-world experiments.

This book is a vital tool for all those working in genomics, molecular biology, biochemistry and genetics related to aquaculture, and computational and biological sciences.

Page count: 1188

Year of publication: 2017

Table of Contents

Cover

Title Page

Copyright

About the Editor

List of Contributors

Preface

Part I: Bioinformatics Analysis of Genomic Sequences

Chapter 1: Introduction to Linux and Command Line Tools for Bioinformatics

Introduction

Overview of Linux

Directories, Files, and Processes

Environment Variables

Basic Linux Commands

Installing Software Packages

Accessing a Remote Linux Supercomputer System

Demonstration of Command Lines

Further Reading

Chapter 2: Determining Sequence Identities: BLAST, Phylogenetic Analysis, and Syntenic Analyses

Introduction

Determining Sequence Identities through BLAST Searches

Web-based BLAST

UNIX-based BLAST

Determining Sequence Identities through Phylogenetic Analysis

Procedures of Phylogenetic Analysis

Determining Sequence Identities through Syntenic Analysis

References

Chapter 3: Next-Generation Sequencing Technologies and the Assembly of Short Reads into Reference Genome Sequences

Introduction

Understanding of DNA Sequencing Technologies

Preprocessing of Sequences

Sequence Assembly

Scaffolding

Gap Filling (Gap Closing)

Evaluation of Assembly Quality

References

Chapter 4: Genome Annotation: Determination of the Coding Potential of the Genome

Introduction

Methods Used in Gene Prediction

Case Study: Genome Annotation Examples: Gene Annotation of Chromosome 1 of Zebrafish using FGENESH and AUGUSTUS

Discussion

References

Chapter 5: Analysis of Repetitive Elements in the Genome

Introduction

Methods Used in Repeat Analysis

Software for Repeat Identification

Using the Command-line Version of RepeatModeler to Identify Repetitive Elements in Genomic Sequences

References

Chapter 6: Analysis of Duplicated Genes and Multi-Gene Families

Introduction

Pipeline Installations

Identification of Duplicated Genes and Multi-Member Gene Family

Downstream Analysis

Perspectives

References

Chapter 7: Dealing with Complex Polyploidy Genomes: Considerations for Assembly and Annotation of the Common Carp Genome

Introduction

Properties of the Common Carp Genome

Genome Assembly: Strategies for Reducing Problems Caused by Allotetraploidy

Annotation of Tetraploidy Genome

Conclusions

References

Part II: Bioinformatics Analysis of Transcriptomic Sequences

Chapter 8: Assembly of RNA-Seq Short Reads into Transcriptome Sequences

Introduction

RNA-Seq Procedures

Reference-Guided Transcriptome Assembly

De novo Transcriptome Assembly

Assessment of RNA-Seq Assembly

Conclusions

Acknowledgments

References

Chapter 9: Analysis of Differentially Expressed Genes and Co-expressed Genes Using RNA-Seq Datasets

Introduction

Analysis of Differentially Expressed Genes Using CLC Genomics Workbench

Analysis of Differentially Expressed Genes Using Trinity

Analysis of Co-Expressed Genes

Computational Challenges

Acknowledgments

References

Chapter 10: Gene Ontology, Enrichment Analysis, and Pathway Analysis

Introduction

GO and the GO Project

Enrichment Analysis

Gene Pathway Analysis

References

Chapter 11: Genetic Analysis Using RNA-Seq: Bulk Segregant RNA-Seq

Introduction

BSR-Seq: Basic Considerations

BSR-Seq Procedures

Acknowledgments

References

Chapter 12: Analysis of Long Non-coding RNAs

Introduction

Data Required for the Analysis of lncRNAs

Assembly of RNA-Seq Sequences

Identification of lncRNAs

Analysis of lncRNA Expression

Analysis and Prediction of lncRNA Functions

Future Perspectives

References

Chapter 13: Analysis of MicroRNAs and Their Target Genes

Introduction

miRNA Biogenesis and Function

Tools for miRNA Data Analysis

miRNA Analysis: Computational Identification from Genome Sequences

miRNA Analysis: Empirical Identification by Small RNA-Seq

Prediction of miRNA Targets

Conclusions

References

Chapter 14: Analysis of Allele-Specific Expression

Introduction

Genome-wide Approaches for ASE Analysis

Applications of ASE Analysis

Considerations of ASE Analysis by RNA-Seq

Step-by-Step Illustration of ASE Analysis by RNA-Seq

References

Chapter 15: Bioinformatics Analysis of Epigenetics

Introduction

Mastering Epigenetic Data

Histone Modifications

Genomic Data Manipulation

DNA Methylation and Bioinformatics

Histone Modifications and Bioinformatics

Perspectives

References

Part III: Bioinformatics Mining and Genetic Analysis of DNA Markers

Chapter 16: Bioinformatics Mining of Microsatellite Markers from Genomic and Transcriptomic Sequences

Introduction

Bioinformatics Mining of Microsatellite Markers

Primer Design for Microsatellite Markers

Conclusions

References

Chapter 17: SNP Identification from Next-Generation Sequencing Datasets

Introduction

SNP Identification and Analysis

Detailed Protocols of SNP Identification

References

Chapter 18: SNP Array Development, Genotyping, Data Analysis, and Applications

Introduction

Development of High-density SNP Array

SNP Genotyping: Biochemistry and Workflow

SNP Genotyping: Analysis of Axiom Genotyping Array Data

SNP Analysis After Genotype Calling

Applications of SNP Arrays

Conclusion

Further Readings

References

Chapter 19: Genotyping by Sequencing and Data Analysis: RAD and 2b-RAD Sequencing

Introduction

Methodology Principles

The Experimental Procedure of 2b-RAD

Bioinformatics Analysis of RAD and 2b-RAD Data

Example for Running a Linkage Mapping Analysis

The Benefits and Pitfalls of RAD and 2b-RAD Applications

References

Chapter 20: Bioinformatics Considerations and Approaches for High-Density Linkage Mapping in Aquaculture

Introduction

Basic Concepts

Requirements for Genetic Mapping

Linkage Mapping Process

Step-by-Step Illustration of Linkage Mapping

Pros and Cons of Linkage Mapping Software Packages

References

Chapter 21: Genomic Selection in Aquaculture Breeding Programs

Introduction

Genomic Selection

Steps in GS

Models for Genomic Prediction

An Example of Implementation of Genomic Prediction

Some Important Considerations for GS

GS in Aquaculture

Acknowledgment

References

Chapter 22: Quantitative Trait Locus Mapping in Aquaculture Species: Principles and Practice

Introduction

DNA Markers and Genotyping

Linkage Maps

Quantitative Trait Loci (QTL) Mapping

References

Chapter 23: Genome-wide Association Studies of Performance Traits

Introduction

Study Population

Phenotype Design

Power of Association Test and Sample Size

Quality Control Procedures

LD Analysis

Association Test

Significance Level for Multiple Testing

Step-by-Step Procedures: A Case Study in Catfish

Follow-up Work after GWAS

Pitfalls of GWAS with Aquaculture Species

Comparison of GWAS with Alternative Designs

Conclusions

References

Chapter 24: Gene Set Analysis of SNP Data from Genome-wide Association Studies

Introduction

GSA in GWAS

Statistical Methods

Demonstration Using Alzheimer's Disease Neuroimaging Initiative's AD Data

Conclusion

References

Part IV: Comparative Genome Analysis

Chapter 25: Comparative Genomics Using CoGe, Hook, Line, and Sinker: Using CoGe Tools for Catching Fish Genome Evolution

Introduction

Getting Hooked into the CoGe Platform

Casting the Line: Analyses for Comparing Genomes

Sinkers to Cast Further and Deeper: Adding Weight to Genomes with Additional Data Types

Conclusions

References

Part V: Bioinformatics Resources, Databases, and Genome Browsers

Chapter 26: NCBI Resources Useful for Informatics Issues in Aquaculture

Introduction

Popularly Used Databases in NCBI

Popularly Used Tools in NCBI

Submit Data to NCBI

References

Chapter 27: Resources and Bioinformatics Tools in Ensembl

Introduction

Ensembl Resources

Comparative Genomics

Ensembl Regulation

Ensembl Tools

Assembly Converter

ID History Converter

BioMart

References

Chapter 28: iAnimal: Cyberinfrastructure to Support Data-driven Science

Introduction

Background

iAnimal Resources

The Data Store

DE

Atmosphere: Accessible Cloud Computing for Researchers

The Agave API

Bio-Image Semantic Query User Environment: Analysis of Images

Selecting the Right Tool for the Job

Reaching Across the Aisle: Federated Third-Party Platforms

AgBase

VCmap

CoGe

How to Find Help Using iAnimal

Coming Soon to CyVerse

Conclusion

Acknowledgments

References

Index

End User License Agreement

List of Tables

Chapter 1: Introduction to Linux and Command Line Tools for Bioinformatics

Table 1.1 The options of chmod command

Table 1.2 List of octal numbers for file permissions

Table 1.3 A list of examples of environment variables

Table 1.4 A list of frequently used tar options

Chapter 2: Determining Sequence Identities: BLAST, Phylogenetic Analysis, and Syntenic Analyses

Table 2.1 Basic types of BLAST

Table 2.2 Commonly used multiple sequence alignment software

Table 2.3 The phylogenetic tree construction method

Table 2.4 Commonly used synteny analysis software and their characteristics

Chapter 4: Genome Annotation: Determination of the Coding Potential of the Genome

Table 4.1 A list of gene prediction software packages, modified from http://en.wikipedia.org/wiki/List_of_gene_prediction_software

Table 4.2 Accuracy of annotation of C. elegans using various gene-prediction programs. Accuracy values are from Coghlan et al. (2008), including sensitivity (Sn) and specificity (Sp) for the prediction of base, exon, transcript, and gene

Chapter 5: Analysis of Repetitive Elements in the Genome

Table 5.1 Search tools used for STR detection. Modified from Merkel and Gemmell (2008)

Table 5.2 Summary of programs for finding interspersed repeats. Modified from Saha et al. (2008b)

Chapter 8: Assembly of RNA-Seq Short Reads into Transcriptome Sequences

Table 8.1 Examples of fish species whose reference genome sequences are available

Chapter 9: Analysis of Differentially Expressed Genes and Co-expressed Genes Using RNA-Seq Datasets

Table 9.1 Summary of the comparisons between microarray and RNA-Seq

Chapter 12: Analysis of Long Non-coding RNAs

Table 12.1 The availability of RNA-Seq datasets from fish and aquaculture species in the SRA database as of December 2014

Chapter 13: Analysis of MicroRNAs and Their Target Genes

Table 13.1 A list of tools for in-silico miRNA prediction

Table 13.2 A list of tools for miRNA analysis of deep sequencing data

Table 13.3 List of tools for prediction of RNA secondary structures

Table 13.4 List of tools for prediction of miRNA targets

Table 13.5 A list of tools for miRNA–mRNA integrated analysis

Table 13.6 Options of MapMi (version 1.5.9-Build32)

Table 13.7 Example of MapMi output. The example shows one miRNA identification extracted from the output of a search of the channel catfish genome (Ipg1.fa) using all vertebrate mature miRNAs as queries

Table 13.8 A list of major scripts and options in the miRDeep2 package

Table 13.9 A list of miRanda options for miRNA target prediction

Chapter 14: Analysis of Allele-Specific Expression

Table 14.1 Example of cis- and trans-regulatory analysis. ID is gene ID; PA1 is the allele count in parent 1; PA2 is the allele count in parent 2; F1A1 is the count of the first allele in F1; F1A2 is the count of the second allele in F1

Table 14.2 Example of genes showing imprinting patterns. ID is gene ID; Hyb_reads1: reads count for allele 1 in F1 hybrid; Hyb_reads2: reads count for allele 2 in F1 hybrid; Rec_Reads1: reads count of allele 1 in reciprocal hybrid; Rec_Reads2: reads count of allele 2 in reciprocal hybrid; P1 is the expression percentage from the maternal allele in F1 hybrid; and P2 is the expression percentage from the paternal allele in reciprocal hybrid

Chapter 15: Bioinformatics Analysis of Epigenetics

Table 15.1 Tools for mapping high-throughput sequencing data. The “Sequencing platform” column indicates whether the mapper natively supports reads from a specific sequencing platform (I, Illumina; So, ABI Solid; 454, Roche 454; Sa, ABI Sanger; H, Helicos; Ion, Ion torrent; and P, PacBio), or not (N). The “Input” and “Output” columns indicate, respectively, the file formats accepted and produced by the mappers. Input formats: FASTA, FASTQ, CFASTA, CFASTQ, and Illumina sequence and probability files' format. Output formats: SAM, tab-separated values (TSV), BED file format, different versions of GFF, and number of reads mapped to genes/exons (Counts)

Table 15.2 Web-based tools for biological functional prediction

Table 15.3 Web-based tools for bisulfite treatment sequencing

Table 15.4 Public databases for epigenetic information

Chapter 16: Bioinformatics Mining of Microsatellite Markers from Genomic and Transcriptomic Sequences

Table 16.1 Comparisons of selected programs for microsatellite discovery. Plus symbols indicate degrees of user-friendliness

Chapter 17: SNP Identification from Next-Generation Sequencing Datasets

Table 17.1 Identified SNP number and rates in several teleosts with genome references

Chapter 18: SNP Array Development, Genotyping, Data Analysis, and Applications

Table 18.1 Examples of high-density arrays developed for humans and various animal species

Table 18.2 Affymetrix Axiom SNP array configurations

Table 18.3 Axiom Genotyping array reference files using the catfish 250K SNP array as an example

Table 18.4 Metrics that are used for SNP post-filtering

Table 18.5 Example of metrics file. The first 20 rows of “metrics.txt” opened in Microsoft Excel

Chapter 19: Genotyping by Sequencing and Data Analysis: RAD and 2b-RAD Sequencing

Table 19.1 Summary of available software for RAD and 2b-RAD data analysis

Chapter 20: Bioinformatics Considerations and Approaches for High-Density Linkage Mapping in Aquaculture

Table 20.1 The experimental population types supported by popular packages for the construction of linkage maps in aquaculture species. Abbreviations: DH, double haploid; HAP, haploid population; BC, backcross; RIL, recombinant inbred line; and F2, F2 intercross

Table 20.2 Some examples of linkage mapping studies in aquaculture species using various software packages

Table 20.3 A sample input data file for OneMap. A mapping family of 94 samples (hyb220M) was used for the illustration. A total of 17,401 SNPs segregating in the male were used for the linkage analysis. Here, only genotypes for five SNPs are shown. Starting from the second row, the first column contains the SNP IDs starting with the “*”; the second column contains the code indicating segregation types (e.g., B3.7 indicates ABxAB, while D2.15 indicates AA/BBxAB); and the third column contains genotypes of each SNP for the 94 individuals, separated by commas

Chapter 23: Genome-wide Association Studies of Performance Traits

Table 23.1 The relationship between the haplotype frequencies, allele frequencies, and D

Table 23.2 Examples of commonly used software packages for GWAS

Table 23.3 A comparison of association analysis with linkage analysis

Chapter 24: Gene Set Analysis of SNP Data from Genome-wide Association Studies

Table 24.1 Methods used for GSA of GWAS data

Table 24.2 A list of top 30 significant SNPs from the GWAS analysis of ADNI's AD data based on single-SNP association using the logistic model in Plink

Table 24.3 A list of top 37 significant SNP sets (p < 2.00E-03) resulting from set-based tests using Plink

Table 24.4 A list of genes from the LKM-based SNP-set analysis of the ADNI's AD GWAS data (p < 0.001)

Table 24.5 A list of genes from the ARTP-based SNP-set analysis of the ADNI's AD GWAS data

Table 24.6 A list of pathways associated with AD identified by SNP-set analysis based on GSEA (p < 0.01); p-values are computed for the pathway ES using 10000 permutations

Table 24.7 The comparison of results from GSA methods

Chapter 25: Comparative Genomics Using CoGe, Hook, Line, and Sinker: Using CoGe Tools for Catching Fish Genome Evolution

Table 25.1 A list of fish genomes currently available in CoGe (as of March 2015)

Table 25.2 Links of fish genomes organized as notebooks in CoGe

Chapter 26: NCBI Resources Useful for Informatics Issues in Aquaculture

Table 26.1 Basic NCBI-BLAST tools

Table 26.2 A list of all the specialized BLAST tools provided by NCBI

Chapter 27: Resources and Bioinformatics Tools in Ensembl

Table 27.1 Genomes available in Ensembl as of December 2015 (Version 83), along with data and source of the “gene build” and the assembly used. Species marked by the asterisk are those genomes in the process of being annotated

Chapter 28: iAnimal: Cyberinfrastructure to Support Data-driven Science

Table 28.1 Distribution of publicly available sequencing data by taxon. Data accessed February 2015 (http://www.ddbj.nig.ac.jp/breakdown_stats/org1000/top100-e.html)

Table 28.2 A comparison of iAnimal resources' strengths and considerations

Bioinformatics in Aquaculture

Principles and Methods

Edited by Zhanjiang (John) Liu

This edition first published 2017 © 2017 John Wiley & Sons Ltd

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Zhanjiang (John) Liu to be identified as the author of the editorial material in this work has been asserted in accordance with law.

Registered Offices

John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Editorial Office

111 River Street, Hoboken, NJ 07030, USA

9600 Garsington Road, Oxford, OX4 2DQ, UK

The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK

Boschstr. 12, 69469 Weinheim, Germany

For details of our global editorial offices, customer services, and more information about Wiley products, visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty

The publisher and the authors make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of fitness for a particular purpose. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for every situation. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. The fact that an organization or web site is referred to in this work as a citation and/or potential source of further information does not mean that the author or the publisher endorses the information the organization or web site may provide or recommendations it may make. Further, readers should be aware that web sites listed in this work may have changed or disappeared between when this work was written and when it is read. No warranty may be created or extended by any promotional statements for this work. Neither the publisher nor the author shall be liable for any damages arising therefrom.

Library of Congress Cataloging-in-Publication Data

Names: Liu, Zhanjiang, editor.

Title: Bioinformatics in aquaculture : principles and methods / edited by Zhanjiang (John) Liu.

Description: Hoboken, NJ : John Wiley & Sons, 2017. | Includes bibliographical references and index.

Identifiers: LCCN 2016045878 (print) | LCCN 2016057071 (ebook) | ISBN 9781118782354 (cloth : alk. paper) | ISBN 9781118782385 (Adobe PDF) | ISBN 9781118782378 (ePub)

Subjects: LCSH: Bioinformatics. | Aquaculture.

Classification: LCC QH324.2 B5488 2017 (print) | LCC QH324.2 (ebook) | DDC 572/.330285--dc23

LC record available at https://lccn.loc.gov/2016045878

Cover image: Jack fish © wildestanimal/Getty Images, Inc.;

Digital DNA strands © deliormanli/iStockphoto;

DNA illustration © enot-poloskun/iStockphoto

Cover design: Wiley

About the Editor

Zhanjiang (John) Liu is currently the associate provost and associate vice president for research at Auburn University, and a professor in the School of Fisheries, Aquaculture and Aquatic Sciences. He received his BS in 1981 from the Northwest Agricultural University (Yangling, China), and both his MS in 1985 and PhD in 1989 from the University of Minnesota (Minnesota, United States). Liu is a fellow of the American Association for the Advancement of Science (AAAS). He is presently serving as the aquaculture coordinator for the USDA National Animal Genome Project; the editor for Marine Biotechnology; associate editor for BMC Genomics; and associate editor for BMC Genetics. He has also served on the editorial boards of a number of journals, including Aquaculture, Animal Biotechnology, Reviews in Aquaculture, and Frontiers of Agricultural Science and Engineering. Liu has also served on over 100 graduate committees, including as a major professor for over 50 PhD students. He has trained over 50 postdoctoral fellows and visiting scholars from all over the world. Liu has published over 300 peer-reviewed journal articles and book chapters, and this book is his fourth after Aquaculture Genome Technologies (2007), Next Generation Sequencing and Whole Genome Selection in Aquaculture (2011), and Functional Genomics in Aquaculture (2012), all published by Wiley and Blackwell.

List of Contributors

Asher Baltzell

Arizona Biological and Biomedical Sciences

University of Arizona

Tucson, Arizona

United States

Lisui Bao

The Fish Molecular Genetics and Biotechnology Laboratory

School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Zhenmin Bao

Key Lab of Marine Genetics and Breeding

College of Marine Life Science

Ocean University of China

Qingdao

China

Matt Bomhoff

The School of Plant Sciences

iPlant Collaborative

University of Arizona

Tucson, Arizona

United States

Ailu Chen

The Fish Molecular Genetics and Biotechnology Laboratory

School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Jinzhuang Dou

Key Lab of Marine Genetics and Breeding

College of Marine Life Science

Ocean University of China

Qingdao

China

Qiang Fu

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Sen Gao

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Xin Geng

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Alejandro P. Gutierrez

The Roslin Institute, and the Royal (Dick) School of Veterinary Studies

University of Edinburgh

Edinburgh

United Kingdom

Yanghua He

Department of Animal & Avian Sciences

University of Maryland

College Park, Maryland

United States

Ross D. Houston

The Roslin Institute, and the Royal (Dick) School of Veterinary Studies

University of Edinburgh

Edinburgh

United Kingdom

Chen Jiang

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Yanliang Jiang

CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Centre for Applied Aquatic Genomics

Chinese Academy of Fishery Sciences

Beijing

China

Yulin Jin

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Blake Joyce

The School of Plant Sciences, iPlant Collaborative

University of Arizona

Tucson, Arizona

United States

Mehar S. Khatkar

Faculty of Veterinary Science

University of Sydney

New South Wales

Australia

Chao Li

College of Marine Sciences and Technology

Qingdao Agricultural University

Qingdao

China

Jiongtang Li

CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Centre for Applied Aquatic Genomics

Chinese Academy of Fishery Sciences

Beijing

China

Ning Li

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Yun Li

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Shikai Liu

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Zhanjiang Liu

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Qianyun Lu

Key Lab of Marine Genetics and Breeding, College of Marine Life Science

Ocean University of China

Qingdao

China

Jia Lv

Key Lab of Marine Genetics and Breeding, College of Marine Life Science

Ocean University of China

Qingdao

China

Eric Lyons

The School of Plant Sciences, iPlant Collaborative

University of Arizona

Tucson, Arizona

United States

Fiona McCarthy

Department of Veterinary Science and Microbiology

University of Arizona

Tucson, Arizona

United States

Zhenkui Qin

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Jiuzhou Song

Department of Animal & Avian Sciences

University of Maryland

College Park, Maryland

United States

Luyang Sun

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Xiaowen Sun

CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Centre for Applied Aquatic Genomics

Chinese Academy of Fishery Sciences

Beijing

China

Suxu Tan

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Ruijia Wang

Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences

Ocean University of China

Qingdao

China

Shaolin Wang

Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Veterinary Medicine

China Agricultural University

Beijing

China

Shi Wang

Key Lab of Marine Genetics and Breeding, College of Marine Life Science

Ocean University of China

Qingdao

China

Xiaozhu Wang

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Peng Xu

CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Centre for Applied Aquatic Genomics

Chinese Academy of Fishery Sciences

Beijing

China

Yujia Yang

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Jun Yao

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Zihao Yuan

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Peng Zeng

Department of Mathematics and Statistics

Auburn University

Alabama

United States

Qifan Zeng

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Jiaren Zhang

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Lingling Zhang

Key Lab of Marine Genetics and Breeding, College of Marine Life Science

Ocean University of China

Qingdao

China

Degui Zhi

School of Biomedical Informatics and School of Public Health

The University of Texas Health Science Center at Houston

Texas

United States

Tao Zhou

The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences

Auburn University

Alabama

United States

Preface

Genomic sciences have made drastic advances in the last 10 years, largely because of the application of next-generation sequencing technologies. It is not just the high throughput that has revolutionized the way science is conducted; the rapidly falling cost of sequencing has made these technologies applicable to all aspects of molecular biological research, as well as to all organisms, including aquaculture and fisheries species. About 20 years ago, Francis S. Collins, currently the director of the National Institutes of Health, had a vision of achieving the sequencing of one genome for US$1000, and we are almost there now. From the billion-dollar human genome project, to those genome projects of livestock with a budget of about US$1 million (down from US$10 million just a few years ago), to the current cost level of just tens of thousands of dollars for a de novo sequencing project, the potential for research using genomic approaches has become unlimited. Today, commercial services are available worldwide for projects, whether they are new sequencing projects for a species, or re-sequencing projects for many individuals. The key issue is to achieve a balance of quality and quantity with minimal costs.

The rapid technological advances provide huge opportunities to apply modern genomics to enhance aquaculture production and performance traits. However, we are facing a number of new challenges, especially in the area of bioinformatics. These challenges may be paramount for aquaculture researchers and educators. Aquaculture students may be well acquainted with aquaculture, but may have no background in computer science and may not be prepared for bioinformatics analysis of large datasets. The large datasets (at the tera-scale) themselves pose great computational challenges. Therefore, new ways of thinking about the education and training of the next generation of scientists are required. For instance, a few laboratories may be sufficient for the worldwide production of data, but several orders of magnitude more laboratories may be required for the data analysis or bioinformatics data mining needed to link the data with biology. In the last several years, we have provided training with special problem-solving approaches on various bioinformatics topics. However, I find that the training of graduate students by special topics is no longer efficient enough. All graduate students in the life sciences need some level of bioinformatics training. This book is an expansion of those training materials, and has been designed to provide the basic principles as well as hands-on experience in bioinformatics analysis. While the book is titled Bioinformatics in Aquaculture, it is not the intention of the editor or the book chapter contributors to provide bioinformatics guidance on topics such as programming. Rather, the focus is on providing a basic framework for understanding the need for informatics analysis, and then on providing guidance on the practical applications of existing bioinformatics tools for aquaculture problems.

This book has 28 chapters, arranged in five parts. Part 1 focuses on issues of dealing with DNA sequences: basic command lines (Chapter 1); how to determine sequence identities (