Bioinformatics derives knowledge from computer analysis of biological data. In particular, genomic and transcriptomic datasets are processed, analysed and, whenever possible, associated with experimental results from various sources, to draw structural, organizational, and functional information relevant to biology. Research in bioinformatics includes method development for storage, retrieval, and analysis of the data.
Bioinformatics in Aquaculture provides up-to-date reviews of next-generation sequencing technologies, their applications in aquaculture, and the principles and methodologies for analysing large genomic and transcriptomic datasets using bioinformatic methods, algorithms, and databases. The book is unique in providing guidance on the software packages best suited to various analyses, with detailed examples of using bioinformatic software and command lines in the context of real-world experiments.
This book is a vital tool for all those working in genomics, molecular biology, biochemistry and genetics related to aquaculture, and computational and biological sciences.
Page count: 1188
Year of publication: 2017
Cover
Title Page
Copyright
About the Editor
List of Contributors
Preface
Part I: Bioinformatics Analysis of Genomic Sequences
Chapter 1: Introduction to Linux and Command Line Tools for Bioinformatics
Introduction
Overview of Linux
Directories, Files, and Processes
Environment Variables
Basic Linux Commands
Installing Software Packages
Accessing a Remote Linux Supercomputer System
Demonstration of Command Lines
Further Reading
Chapter 2: Determining Sequence Identities: BLAST, Phylogenetic Analysis, and Syntenic Analyses
Introduction
Determining Sequence Identities through BLAST Searches
Web-based BLAST
UNIX-based BLAST
Determining Sequence Identities through Phylogenetic Analysis
Procedures of Phylogenetic Analysis
Determining Sequence Identities through Syntenic Analysis
References
Chapter 3: Next-Generation Sequencing Technologies and the Assembly of Short Reads into Reference Genome Sequences
Introduction
Understanding of DNA Sequencing Technologies
Preprocessing of Sequences
Sequence Assembly
Scaffolding
Gap Filling (Gap Closing)
Evaluation of Assembly Quality
References
Chapter 4: Genome Annotation: Determination of the Coding Potential of the Genome
Introduction
Methods Used in Gene Prediction
Case Study: Genome Annotation Examples: Gene Annotation of Chromosome 1 of Zebrafish using FGENESH and AUGUSTUS
Discussion
References
Chapter 5: Analysis of Repetitive Elements in the Genome
Introduction
Methods Used in Repeat Analysis
Software for Repeat Identification
Using the Command-line Version of RepeatModeler to Identify Repetitive Elements in Genomic Sequences
References
Chapter 6: Analysis of Duplicated Genes and Multi-Gene Families
Introduction
Pipeline Installations
Identification of Duplicated Genes and Multi-Member Gene Family
Downstream Analysis
Perspectives
References
Chapter 7: Dealing with Complex Polyploidy Genomes: Considerations for Assembly and Annotation of the Common Carp Genome
Introduction
Properties of the Common Carp Genome
Genome Assembly: Strategies for Reducing Problems Caused by Allotetraploidy
Annotation of Tetraploidy Genome
Conclusions
References
Part II: Bioinformatics Analysis of Transcriptomic Sequences
Chapter 8: Assembly of RNA-Seq Short Reads into Transcriptome Sequences
Introduction
RNA-Seq Procedures
Reference-Guided Transcriptome Assembly
De novo Transcriptome Assembly
Assessment of RNA-Seq Assembly
Conclusions
Acknowledgments
References
Chapter 9: Analysis of Differentially Expressed Genes and Co-expressed Genes Using RNA-Seq Datasets
Introduction
Analysis of Differentially Expressed Genes Using CLC Genomics Workbench
Analysis of Differentially Expressed Genes Using Trinity
Analysis of Co-Expressed Genes
Computational Challenges
Acknowledgments
References
Chapter 10: Gene Ontology, Enrichment Analysis, and Pathway Analysis
Introduction
GO and the GO Project
Enrichment Analysis
Gene Pathway Analysis
References
Chapter 11: Genetic Analysis Using RNA-Seq: Bulk Segregant RNA-Seq
Introduction
BSR-Seq: Basic Considerations
BSR-Seq Procedures
Acknowledgments
References
Chapter 12: Analysis of Long Non-coding RNAs
Introduction
Data Required for the Analysis of lncRNAs
Assembly of RNA-Seq Sequences
Identification of lncRNAs
Analysis of lncRNA Expression
Analysis and Prediction of lncRNA Functions
Future Perspectives
References
Chapter 13: Analysis of MicroRNAs and Their Target Genes
Introduction
miRNA Biogenesis and Function
Tools for miRNA Data Analysis
miRNA Analysis: Computational Identification from Genome Sequences
miRNA Analysis: Empirical Identification by Small RNA-Seq
Prediction of miRNA Targets
Conclusions
References
Chapter 14: Analysis of Allele-Specific Expression
Introduction
Genome-wide Approaches for ASE Analysis
Applications of ASE Analysis
Considerations of ASE Analysis by RNA-Seq
Step-by-Step Illustration of ASE Analysis by RNA-Seq
References
Chapter 15: Bioinformatics Analysis of Epigenetics
Introduction
Mastering Epigenetic Data
Histone Modifications
Genomic Data Manipulation
DNA Methylation and Bioinformatics
Histone Modifications and Bioinformatics
Perspectives
References
Part III: Bioinformatics Mining and Genetic Analysis of DNA Markers
Chapter 16: Bioinformatics Mining of Microsatellite Markers from Genomic and Transcriptomic Sequences
Introduction
Bioinformatics Mining of Microsatellite Markers
Primer Design for Microsatellite Markers
Conclusions
References
Chapter 17: SNP Identification from Next-Generation Sequencing Datasets
Introduction
SNP Identification and Analysis
Detailed Protocols of SNP Identification
References
Chapter 18: SNP Array Development, Genotyping, Data Analysis, and Applications
Introduction
Development of High-density SNP Array
SNP Genotyping: Biochemistry and Workflow
SNP Genotyping: Analysis of Axiom Genotyping Array Data
SNP Analysis After Genotype Calling
Applications of SNP Arrays
Conclusion
Further Readings
References
Chapter 19: Genotyping by Sequencing and Data Analysis: RAD and 2b-RAD Sequencing
Introduction
Methodology Principles
The Experimental Procedure of 2b-RAD
Bioinformatics Analysis of RAD and 2b-RAD Data
Example for Running a Linkage Mapping Analysis
The Benefits and Pitfalls of RAD and 2b-RAD Applications
References
Chapter 20: Bioinformatics Considerations and Approaches for High-Density Linkage Mapping in Aquaculture
Introduction
Basic Concepts
Requirements for Genetic Mapping
Linkage Mapping Process
Step-by-Step Illustration of Linkage Mapping
Pros and Cons of Linkage Mapping Software Packages
References
Chapter 21: Genomic Selection in Aquaculture Breeding Programs
Introduction
Genomic Selection
Steps in GS
Models for Genomic Prediction
An Example of Implementation of Genomic Prediction
Some Important Considerations for GS
GS in Aquaculture
Acknowledgment
References
Chapter 22: Quantitative Trait Locus Mapping in Aquaculture Species: Principles and Practice
Introduction
DNA Markers and Genotyping
Linkage Maps
Quantitative Trait Loci (QTL) Mapping
References
Chapter 23: Genome-wide Association Studies of Performance Traits
Introduction
Study Population
Phenotype Design
Power of Association Test and Sample Size
Quality Control Procedures
LD Analysis
Association Test
Significance Level for Multiple Testing
Step-by-Step Procedures: A Case Study in Catfish
Follow-up Work after GWAS
Pitfalls of GWAS with Aquaculture Species
Comparison of GWAS with Alternative Designs
Conclusions
References
Chapter 24: Gene Set Analysis of SNP Data from Genome-wide Association Studies
Introduction
GSA in GWAS
Statistical Methods
Demonstration Using Alzheimer's Disease Neuroimaging Initiative's AD Data
Conclusion
References
Part IV: Comparative Genome Analysis
Chapter 25: Comparative Genomics Using CoGe, Hook, Line, and Sinker: Using CoGe Tools for Catching Fish Genome Evolution
Introduction
Getting Hooked into the CoGe Platform
Casting the Line: Analyses for Comparing Genomes
Sinkers to Cast Further and Deeper: Adding Weight to Genomes with Additional Data Types
Conclusions
References
Part V: Bioinformatics Resources, Databases, and Genome Browsers
Chapter 26: NCBI Resources Useful for Informatics Issues in Aquaculture
Introduction
Popularly Used Databases in NCBI
Popularly Used Tools in NCBI
Submit Data to NCBI
References
Chapter 27: Resources and Bioinformatics Tools in Ensembl
Introduction
Ensembl Resources
Comparative Genomics
Ensembl Regulation
Ensembl Tools
Assembly Converter
ID History Converter
BioMart
References
Chapter 28: iAnimal: Cyberinfrastructure to Support Data-driven Science
Introduction
Background
iAnimal Resources
The Data Store
DE
Atmosphere: Accessible Cloud Computing for Researchers
The Agave API
Bio-Image Semantic Query User Environment: Analysis of Images
Selecting the Right Tool for the Job
Reaching Across the Aisle: Federated Third-Party Platforms
AgBase
VCmap
CoGe
How to Find Help Using iAnimal
Coming Soon to CyVerse
Conclusion
Acknowledgments
References
Index
End User License Agreement
Chapter 1: Introduction to Linux and Command Line Tools for Bioinformatics
Table 1.1 The options of chmod command
Table 1.2 List of octal numbers for file permissions
Table 1.3 A list of examples of environment variables
Table 1.4 A list of frequently used tar options
Chapter 2: Determining Sequence Identities: BLAST, Phylogenetic Analysis, and Syntenic Analyses
Table 2.1 Basic types of BLAST
Table 2.2 Commonly used multiple sequence alignment software
Table 2.3 The phylogenetic tree construction method
Table 2.4 Commonly used synteny analysis software and their characteristics
Chapter 4: Genome Annotation: Determination of the Coding Potential of the Genome
Table 4.1 A list of gene prediction software packages, modified from http://en.wikipedia.org/wiki/List_of_gene_prediction_software
Table 4.2 Accuracy of annotation of C. elegans using various gene-prediction programs. Accuracy values are from Coghlan et al. (2008), including sensitivity (Sn) and specificity (Sp) for the prediction of base, exon, transcript, and gene
Chapter 5: Analysis of Repetitive Elements in the Genome
Table 5.1 Search tools used for STR detection. Modified from Merkel and Gemmell (2008)
Table 5.2 Summary of programs for finding interspersed repeats. Modified from Saha et al. (2008b)
Chapter 8: Assembly of RNA-Seq Short Reads into Transcriptome Sequences
Table 8.1 Examples of fish species whose reference genome sequences are available
Chapter 9: Analysis of Differentially Expressed Genes and Co-expressed Genes Using RNA-Seq Datasets
Table 9.1 Summary of the comparisons between microarray and RNA-Seq
Chapter 12: Analysis of Long Non-coding RNAs
Table 12.1 The availability of RNA-Seq datasets from fish and aquaculture species in the SRA database as of December 2014
Chapter 13: Analysis of MicroRNAs and Their Target Genes
Table 13.1 A list of tools for in-silico miRNA prediction
Table 13.2 A list of tools for miRNA analysis of deep sequencing data
Table 13.3 List of tools for prediction of RNA secondary structures
Table 13.4 List of tools for prediction of miRNA targets
Table 13.5 A list of tools for miRNA–mRNA integrated analysis
Table 13.6 Options of MapMi (version 1.5.9-Build32)
Table 13.7 Example of MapMi output. The example is one miRNA identification extracted from this output of miRNA identification from the channel catfish genome (Ipg1.fa) using all vertebrate mature miRNAs as queries
Table 13.8 A list of major scripts and options in the miRDeep2 package
Table 13.9 A list of miRanda options for miRNA target prediction
Chapter 14: Analysis of Allele-Specific Expression
Table 14.1 Example of cis- and trans-regulatory analysis. ID is gene ID; PA1 is the allele count in parent 1; PA2 is the allele count in parent 2; F1A1 is the count of the first allele in F1; F1A2 is the count of the second allele in F1
Table 14.2 Example of genes showing imprinting patterns. ID is gene ID; Hyb_reads1: reads count for allele 1 in F1 hybrid; Hyb_reads2: reads count for allele 2 in F1 hybrid; Rec_Reads1: reads count of allele 1 in reciprocal hybrid; Rec_Reads2: reads count of allele 2 in reciprocal hybrid; P1 is the expression percentage from the maternal allele in F1 hybrid; and P2 is the expression percentage from the paternal allele in reciprocal hybrid
Chapter 15: Bioinformatics Analysis of Epigenetics
Table 15.1 Tools for mapping high-throughput sequencing data. The “Sequencing platform” column indicates whether the mapper natively supports reads from a specific sequencing platform (I, Illumina; So, ABI Solid; 454, Roche 454; Sa, ABI Sanger; H, Helicos; Ion, Ion torrent; and P, PacBio), or not (N). The “Input” and “Output” columns indicate, respectively, the file formats accepted and produced by the mappers. Input formats: FASTA, FASTQ, CFASTA, CFASTQ, and Illumina sequence and probability files' format. Output formats: SAM, tab-separated values (TSV), BED file format, different versions of GFF, and number of reads mapped to genes/exons (Counts)
Table 15.2 Web-based tools for biological functional prediction
Table 15.3 Web-based tools for bisulfite treatment sequencing
Table 15.4 Public databases for epigenetic information
Chapter 16: Bioinformatics Mining of Microsatellite Markers from Genomic and Transcriptomic Sequences
Table 16.1 Comparisons of selected programs for microsatellite discovery. Plus symbols indicate degrees of user-friendliness
Chapter 17: SNP Identification from Next-Generation Sequencing Datasets
Table 17.1 Identified SNP number and rates in several teleosts with genome references
Chapter 18: SNP Array Development, Genotyping, Data Analysis, and Applications
Table 18.1 Examples of high-density arrays developed for humans and various animal species
Table 18.2 Affymetrix Axiom SNP array configurations
Table 18.3 Axiom Genotyping array reference files using the catfish 250K SNP array as an example
Table 18.4 Metrics that are used for SNP post-filtering
Table 18.5 Example of metrics file. The first 20 rows of “metrics.txt” opened in Microsoft Excel
Chapter 19: Genotyping by Sequencing and Data Analysis: RAD and 2b-RAD Sequencing
Table 19.1 Summary of available software for RAD and 2b-RAD data analysis
Chapter 20: Bioinformatics Considerations and Approaches for High-Density Linkage Mapping in Aquaculture
Table 20.1 The experimental population types supported by popular packages for the construction of linkage maps in aquaculture species. Abbreviations: DH, double haploid; HAP, haploid population; BC, backcross; RIL, recombinant inbred line; and F2, F2 intercross
Table 20.2 Some examples of linkage mapping studies in aquaculture species using various software packages
Table 20.3 A sample input data file for OneMap. A mapping family of 94 samples (hyb220M) was used for the illustration. A total of 17,401 SNPs segregated in the male was used for the linkage analysis. Here, only genotypes for five SNPs are shown. Starting from the second row, the first column contains the SNP IDs starting with the “*”; the second column contains the code indicating segregation types (e.g., B3.7 indicates ABxAB, while D2.15 indicates AA/BBxAB); and the third column contains genotypes of each SNP for the 94 individuals, separated by commas
Chapter 23: Genome-wide Association Studies of Performance Traits
Table 23.1 The relationship between the haplotype frequencies, allele frequencies, and D
Table 23.2 Examples of commonly used software packages for GWAS
Table 23.3 A comparison of association analysis with linkage analysis
Chapter 24: Gene Set Analysis of SNP Data from Genome-wide Association Studies
Table 24.1 Methods used for GSA of GWAS data
Table 24.2 A list of top 30 significant SNPs from the GWAS analysis of ADNI's AD data based on single-SNP association using the logistic model in Plink
Table 24.3 A list of top 37 significant SNP sets (p < 2.00E−03) resulted from set-based tests using Plink
Table 24.4 A list of genes from the LKM-based SNP-set analysis of the ADNI's AD GWAS data (p < 0.001)
Table 24.5 A list of genes from the ARTP-based SNP-set analysis of the ADNI's AD GWAS data
Table 24.6 A list of pathways associated with AD identified by SNP-set analysis based on GSEA (p < 0.01); p-values are computed for the pathway ES using 10000 permutations
Table 24.7 The comparison of results from GSA methods
Chapter 25: Comparative Genomics Using CoGe, Hook, Line, and Sinker: Using CoGe Tools for Catching Fish Genome Evolution
Table 25.1 A list of fish genomes currently available in CoGe (as of March 2015)
Table 25.2 Links of fish genomes organized as notebooks in CoGe
Chapter 26: NCBI Resources Useful for Informatics Issues in Aquaculture
Table 26.1 Basic NCBI-BLAST tools
Table 26.2 A list of all the specialized BLAST tools provided by NCBI
Chapter 27: Resources and Bioinformatics Tools in Ensembl
Table 27.1 Genomes available in Ensembl as of December 2015 (Version 83), along with data and source of the “gene build” and the assembly used. Species marked by the asterisk are those genomes in the process of being annotated
Chapter 28: iAnimal: Cyberinfrastructure to Support Data-driven Science
Table 28.1 Distribution of publicly available sequencing data according to taxons. Data accessed February 2015 (http://www.ddbj.nig.ac.jp/breakdown_stats/org1000/top100-e.html)
Table 28.2 A comparison of iAnimal resources' strengths and considerations
Edited by Zhanjiang (John) Liu
This edition first published 2017 © 2017 John Wiley & Sons Ltd
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
The right of Zhanjiang (John) Liu to be identified as the author of the editorial material in this work has been asserted in accordance with law.
Registered Offices
John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Editorial Office
111 River Street, Hoboken, NJ 07030, USA
9600 Garsington Road, Oxford, OX4 2DQ, UK
The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK
Boschstr. 12, 69469 Weinheim, Germany
For details of our global editorial offices, customer services, and more information about Wiley products, visit us at www.wiley.com.
Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.
Limit of Liability/Disclaimer of Warranty
The publisher and the authors make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of fitness for a particular purpose. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for every situation. In view of ongoing research, equipment modifications, changes in governmental regulations, and the constant flow of information relating to the use of experimental reagents, equipment, and devices, the reader is urged to review and evaluate the information provided in the package insert or instructions for each chemical, piece of equipment, reagent, or device for, among other things, any changes in the instructions or indication of usage and for added warnings and precautions. The fact that an organization or web site is referred to in this work as a citation and/or potential source of further information does not mean that the author or the publisher endorses the information the organization or web site may provide or recommendations it may make. Further, readers should be aware that web sites listed in this work may have changed or disappeared between when this work was written and when it is read. No warranty may be created or extended by any promotional statements for this work. Neither the publisher nor the author shall be liable for any damages arising therefrom.
Library of Congress Cataloging-in-Publication Data
Names: Liu, Zhanjiang, editor.
Title: Bioinformatics in aquaculture : principles and methods / edited by Zhanjiang (John) Liu.
Description: Hoboken, NJ : John Wiley & Sons, 2017. | Includes bibliographical references and index.
Identifiers: LCCN 2016045878 (print) | LCCN 2016057071 (ebook) | ISBN 9781118782354 (cloth : alk. paper) | ISBN 9781118782385 (Adobe PDF) | ISBN 9781118782378 (ePub)
Subjects: LCSH: Bioinformatics. | Aquaculture.
Classification: LCC QH324.2 B5488 2017 (print) | LCC QH324.2 (ebook) | DDC 572/.330285--dc23
LC record available at https://lccn.loc.gov/2016045878
Cover image: Jack fish © wildestanimal/Getty Images, Inc.;
Digital DNA strands © deliormanli/iStockphoto;
DNA illustration © enot-poloskun/iStockphoto
Cover design: Wiley
Zhanjiang (John) Liu is currently the associate provost and associate vice president for research at Auburn University, and a professor in the School of Fisheries, Aquaculture and Aquatic Sciences. He received his BS in 1981 from the Northwest Agricultural University (Yangling, China), and both his MS in 1985 and PhD in 1989 from the University of Minnesota (Minnesota, United States). Liu is a fellow of the American Association for the Advancement of Science (AAAS). He is presently serving as the aquaculture coordinator for the USDA National Animal Genome Project; the editor for Marine Biotechnology; associate editor for BMC Genomics; and associate editor for BMC Genetics. He has also served on the editorial board for a number of journals, including Aquaculture, Animal Biotechnology, Reviews in Aquaculture, and Frontiers of Agricultural Science and Engineering. Liu has also served on over 100 graduate committees, including as a major professor for over 50 PhD students. He has trained over 50 postdoctoral fellows and visiting scholars from all over the world. Liu has published over 300 peer-reviewed journal articles and book chapters, and this book is his fourth after Aquaculture Genome Technologies (2007), Next Generation Sequencing and Whole Genome Selection in Aquaculture (2011), and Functional Genomics in Aquaculture (2012), all published by Wiley and Blackwell.
Asher Baltzell
Arizona Biological and Biomedical Sciences
University of Arizona
Tucson, Arizona
United States
Lisui Bao
The Fish Molecular Genetics and Biotechnology Laboratory
School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Zhenmin Bao
Key Lab of Marine Genetics and Breeding
College of Marine Life Science
Ocean University of China
Qingdao
China
Matt Bomhoff
The School of Plant Sciences
iPlant Collaborative
University of Arizona
Tucson, Arizona
United States
Ailu Chen
The Fish Molecular Genetics and Biotechnology Laboratory
School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Jinzhuang Dou
Key Lab of Marine Genetics and Breeding
College of Marine Life Science
Ocean University of China
Qingdao
China
Qiang Fu
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Sen Gao
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Xin Geng
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Alejandro P. Gutierrez
The Roslin Institute, and the Royal (Dick) School of Veterinary Studies
University of Edinburgh
Edinburgh
United Kingdom
Yanghua He
Department of Animal & Avian Sciences
University of Maryland
College Park, Maryland
United States
Ross D. Houston
The Roslin Institute, and the Royal (Dick) School of Veterinary Studies
University of Edinburgh
Edinburgh
United Kingdom
Chen Jiang
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Yanliang Jiang
CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Centre for Applied Aquatic Genomics
Chinese Academy of Fishery Sciences
Beijing
China
Yulin Jin
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Blake Joyce
The School of Plant Sciences, iPlant Collaborative
University of Arizona
Tucson, Arizona
United States
Mehar S. Khatkar
Faculty of Veterinary Science
University of Sydney
New South Wales
Australia
Chao Li
College of Marine Sciences and Technology
Qingdao Agricultural University
Qingdao
China
Jiongtang Li
CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Centre for Applied Aquatic Genomics
Chinese Academy of Fishery Sciences
Beijing
China
Ning Li
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Yun Li
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Shikai Liu
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Zhanjiang Liu
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Qianyun Lu
Key Lab of Marine Genetics and Breeding, College of Marine Life Science
Ocean University of China
Qingdao
China
Jia Lv
Key Lab of Marine Genetics and Breeding, College of Marine Life Science
Ocean University of China
Qingdao
China
Eric Lyons
The School of Plant Sciences, iPlant Collaborative
University of Arizona
Tucson, Arizona
United States
Fiona McCarthy
Department of Veterinary Science and Microbiology
University of Arizona
Tucson, Arizona
United States
Zhenkui Qin
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Jiuzhou Song
Department of Animal & Avian Sciences
University of Maryland
College Park, Maryland
United States
Luyang Sun
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Xiaowen Sun
CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Centre for Applied Aquatic Genomics
Chinese Academy of Fishery Sciences
Beijing
China
Suxu Tan
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Ruijia Wang
Ministry of Education Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences
Ocean University of China
Qingdao
China
Shaolin Wang
Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Veterinary Medicine
China Agricultural University
Beijing
China
Shi Wang
Key Lab of Marine Genetics and Breeding, College of Marine Life Science
Ocean University of China
Qingdao
China
Xiaozhu Wang
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Peng Xu
CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Centre for Applied Aquatic Genomics
Chinese Academy of Fishery Sciences
Beijing
China
Yujia Yang
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Jun Yao
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Zihao Yuan
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Peng Zeng
Department of Mathematics and Statistics
Auburn University
Alabama
United States
Qifan Zeng
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Jiaren Zhang
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Lingling Zhang
Key Lab of Marine Genetics and Breeding, College of Marine Life Science
Ocean University of China
Qingdao
China
Degui Zhi
School of Biomedical Informatics and School of Public Health
The University of Texas Health Science Center at Houston
Texas
United States
Tao Zhou
The Fish Molecular Genetics and Biotechnology Laboratory, School of Fisheries, Aquaculture and Aquatic Sciences and Program of Cell and Molecular Biosciences
Auburn University
Alabama
United States
Genomic sciences have made drastic advances in the last 10 years, largely because of the application of next-generation sequencing technologies. It is not just the high throughput that has revolutionized the way science is conducted; the rapidly reducing cost of sequencing has made these technologies applicable to all aspects of molecular biological research, as well as to all organisms, including aquaculture and fisheries species. About 20 years ago, Francis S. Collins, currently the director of the National Institutes of Health, had a vision of achieving the sequencing of one genome for US$1000, and we are almost there now. From the billion-dollar human genome project, to those genome projects of livestock with a budget of about US$1 million (down from US$10 million just a few years ago), to the current cost level of just tens of thousands of dollars for a de novo sequencing project, the potential for research using genomic approaches has become unlimited. Today, commercial services are available worldwide for projects, whether they are new sequencing projects for a species, or re-sequencing projects for many individuals. The key issue is to achieve a balance of quality and quantity with minimal costs.
The rapid technological advances provide huge opportunities to apply modern genomics to enhance aquaculture production and performance traits. However, we are facing a number of new challenges, especially in the area of bioinformatics. This challenge may be paramount for aquaculture researchers and educators. Aquaculture students may be well acquainted with aquaculture, but may have no background in computer science and may not be sufficiently prepared for bioinformatics analysis of large datasets. The large datasets (on the tera-scale) themselves pose great computational challenges. Therefore, new ways of thinking about the education and training of the next generation of scientists are required. For instance, a few laboratories may be sufficient for the worldwide production of data, but several orders of magnitude more laboratories may be required for the data analysis or bioinformatics data mining needed to link the data with biology. In the last several years, we have provided training with special problem-solving approaches on various bioinformatics topics. However, I find that the training of graduate students by special topics is no longer efficient enough. All graduate students in the life sciences need some level of bioinformatics training. This book is an expansion of those training materials, and has been designed to provide the basic principles as well as hands-on experience of bioinformatics analysis. While the book is titled Bioinformatics in Aquaculture, it is not the intention of the editor or the book chapter contributors to provide bioinformatics guidance on topics such as programming. Rather, the focus is on providing a basic framework for understanding the need for informatics analysis, and then on providing guidance on the practical applications of existing bioinformatics tools for aquaculture problems.
This book has 28 chapters, arranged in five parts. Part 1 focuses on issues of dealing with DNA sequences: basic command lines (Chapter 1); how to determine sequence identities (
