PRIVACY PRESERVATION of GENOMIC and MEDICAL DATA
Discusses topics concerning the privacy preservation of genomic data in the digital era, including data security, data standards, and privacy laws, so that researchers in biomedical informatics, computer privacy, and ELSI can assess the latest advances in privacy-preserving techniques for the protection of human genomic data.
Privacy Preservation of Genomic and Medical Data focuses on genomic data sources, analytical tools, and the importance of privacy preservation. Topics discussed include TensorFlow and Bio-Weka, privacy laws, HIPAA, and other emerging technologies like the Internet of Things, IoT-based cloud environments, cloud computing, edge computing, and blockchain technology for smart applications. The book starts with an introduction to genomes, genomics, genetics, transcriptomes, proteomes, and other basic concepts of modern molecular biology. DNA sequencing methodology, DNA-binding proteins, other related terms concerning genomes and genetics, and the associated privacy issues are discussed in detail. It concludes with future predictions for genomics and genomic privacy, emerging technologies, and applications.
Audience: Researchers in information technology, data mining, health informatics and health technologies, clinical informatics, bioinformatics, and security and privacy in healthcare, as well as health policy developers in public and private health departments and public health.
Page count: 822
Year of publication: 2023
Cover
Series Page
Title Page
Copyright Page
Preface
Acknowledgements
Part 1: Fundamentals
1 Introduction to Genomics and Genetics
1.1 Introduction
1.2 Hub of Genomics
1.3 Genome Sequencing Methods
1.4 Variation of Genome Sequencing
1.5 Diseases and Disorders
1.6 Future Prospects
1.7 Conclusion
References
2 An Overview of Genomics and Frontiers in Genetics for Smart Era
2.1 Introduction
2.2 Application of Genomes—The Frontiers in Genetics
2.3 Genomics in Military
2.4 Genomics in Medicine
2.5 International Projects
2.6 Case Study
2.7 Conclusion
References
3 Technical Trends in Public Healthcare and Medical Engineering
3.1 Introduction
3.2 Background Work
3.3 Current Scenario of Public Healthcare System and Medical Engineering
3.4 Role of AI in Healthcare
3.5 Technological Analysis for Healthcare System and Medical Engineering
3.6 Future Aspects of AI in Healthcare and Medical Engineering
3.7 Conclusion
3.8 Future Aspect
References
4 Role of Genomics in Smart Era and Its Application in COVID-19
4.1 Introduction
4.2 Basics of Genomics
4.3 Evolution of Genomics
4.4 Characteristics of Genomics and DNA Computing
4.5 Types of Genomics
4.6 Application Area of Genomics
4.7 Application of DNA Computing
4.8 Genomics and DNA Computing in COVID-19 Epidemic
4.9 Issues in Genomics and DNA Computing
4.10 Tools and Technology Used in Genomics Systems
4.11 Role of AI Technology Used in Genomics Systems
4.12 Related Works and Challenges
4.13 Future Research Dimension
4.14 Conclusion
References
Part 2: Methods and Applications
5 Novel Cutting-Edge Security Tools for Medical and Genomic Data With Privacy Preservation Techniques
5.1 Introduction
5.2 Background of Genomic
5.3 Literature Review
5.4 Highlights of the Proposed Methodologies
5.5 Results and Discussion
5.6 Conclusion
References
6 Genomic Data Analysis With Optimized Convolutional Neural Network (CNN) for Edge Applications
6.1 Introduction
6.2 Related Work
6.3 Proposed Methodology
6.4 Conclusion
References
7 Real-World Estimation of Malaria Prevalence From Genome of Vectors and Climate Analysis
7.1 Introduction
7.2 Significance of Estimation of Malaria Prevalence From Genome of Vectors and Climate Analysis
7.3 Proposed Methodology
7.4 Conclusion
References
8 Revolutionalizing Internet of Medical Things for Blockchain-Based 5G Healthcare Security and Privacy for Genomic Data
8.1 Introduction
8.2 Internet of Things in Healthcare and Medical Applications (IoMT)
8.3 Limitations of Existing Technologies in Healthcare Applications
8.4 Solving Healthcare Problems Using BCT
8.5 Proposed Model for 5G Healthcare
8.6 Experimental Setup and Discussion
8.7 Results
8.8 Conclusion
References
9 Preserve Privacy-HD: A Privacy-Preserving Distributed Framework for Health Data
9.1 Introduction
9.2 Organization of Chapter
9.3 Related Work
9.4 Proposed Methodology
9.5 Complexity Analysis
9.6 Conclusion
References
Part 3: Future-Based Applications
10 Decision and Recommendation System Services for Patient Using Artificial Intelligence
10.1 Introduction
10.2 Literature Review
10.3 Proposed Methodology
10.4 Implementation and Results Analysis
10.5 Results and Discussion
10.6 Conclusion
References
11 MPHDRDNN: Meticulous Presaging of Heart Disease by Regularized DNN Through GUI
11.1 Introduction
11.2 Literature Study
11.3 Proposed Methodology
11.4 Performance Evaluation Metrics
11.5 Results and Implementation Process for Building a GUI
11.6 Conclusion and Future Scope
Acknowledgment
References
12 Techniques for Removing Hair From Dermoscopic Images: A Survey of Current Approaches
12.1 Introduction
12.2 Inpainting Techniques Used for Hair Removal From Dermoscopic Images
12.3 Hair Removal Approaches From Dermoscopic Images
12.4 Results and Discussion
12.5 Conclusion
Acknowledgment
References
13 The Emergence of Blockchain Technology in Industrial Revolution 5.0
13.1 Introduction
13.2 Literature Survey
13.3 Evolution of Web Technology in Association With Blockchain
13.4 Understanding of Basic Key Terminologies
13.5 Industrial Components Associated With Blockchain
13.6 Contribution of Blockchain for Revolutionizing Industrial Aspects
13.7 Transformation of Industrial Sectors by Blockchain
13.8 Economical Impact Analysis of Blockchain Over Each Industry
13.9 Conclusion
References
14 Cervical Cancer Detection Using Big Data Analytics and Their Comparative Analysis
14.1 Introduction
14.2 Related Literature
14.3 Machine Learning Algorithms Used
14.4 Description of the Dataset
14.5 Parameter Specification on the ANNs
14.6 Analysis of Results
14.7 Conclusions
References
15 Smart Walking Stick for Visually Impaired People
15.1 Introduction
15.2 Related Work
15.3 System Architecture
15.4 Conclusion
References
Part 4: Issues and Challenges
16 Enhanced Security Measures in Genomic Data Management
16.1 Introduction
16.2 Literature Survey
16.3 Background
16.4 Proposed Model
16.5 Analysis of the Work
16.6 Future Work
16.7 Conclusion
References
17 Industry 5.0: Potentials, Issues, Opportunities, and Challenges for Society 5.0
17.1 Introduction
17.2 Literature Review
17.3 Role of Robots in Industry 5.0 and Society 5.0
17.4 Potentials of Industry 5.0 and Society 5.0
17.5 Open Issues Toward Industry 5.0 and Society 5.0
17.6 Opportunities and Challenges Toward Industry 5.0 and Society 5.0
17.7 Conclusion
References
18 Artificial Intelligence—Blockchain Enabled Technology for Internet of Things: Research Statements, Open Issues, and Possible Applications in the Near Future
18.1 Introduction: Artificial Intelligence, Machine Learning, Internet of Things, and Blockchain Concepts
18.2 Frameworks, Architectures, and Models for the Convergence of Machine Learning, IoTs, and Blockchain Technologies
18.3 Machine Learning Techniques for the Optimization of IoT-Based Services
18.4 Machine Learning Techniques for Exchanging Data in a Blockchain
18.5 Machine Learning-Based Blockchain Transactions
18.6 IoT-Enabled Security Using Artificial Intelligence and Blockchain Technologies
18.7 Security, Privacy, and Trust Related Areas in Artificial Intelligence and Blockchain-Based IoT Applications
18.8 Blockchain-Based Learning Automated Analytics Platforms
18.9 Blockchain- and Machine Learning-Based Solutions for Big Data Challenges
18.10 Machine Learning Techniques for the Analysis of Sensor Records for Healthcare Applications
18.11 Blockchain-Enabled IoT Platforms for Automation in Intelligent Transportation Systems
18.12 Conclusion
References
19 Blockchain-Empowered Decentralized Applications: Current Trends and Challenges
19.1 Introduction
19.2 Literature Survey
19.3 Key Characteristics of Blockchain
19.4 Challenges Faced by Blockchain Technology
19.5 The Use of Smart Contracts in Decentralized Autonomous Organizations
19.6 Smart Contracts: An Overview
19.7 Smart Contract Platforms
19.8 Applications of Blockchain Technology
19.9 Future Data Storage Issues in Blockchain and Its Solution
19.10 Conclusion
References
20 Privacy of Data, Privacy Laws, and Privacy by Design
20.1 Introduction
20.2 Privacy Issues
20.3 Privacy Laws and Regulations
20.4 Techniques for Enforcing Privacy
20.5 Privacy by Design
20.6 Conclusion
References
Index
End User License Agreement
Chapter 4
Table 4.1 Characteristics and relative performance by a previous method on COV...
Chapter 7
Table 7.1 Classifiers for the given problem statement.
Table 7.2 Nonlinear regression for the given problem statement.
Chapter 9
Table 9.1 Existing works.
Chapter 10
Table 10.1 Description of one-hot encoding transform.
Table 10.2 Examined datasets.
Table 10.3 Comparison of LSTM over SVM and CNN.
Table 10.4 Comparisons of all metrics.
Chapter 11
Table 11.1 Nodes presents in DNN.
Table 11.2 Accuracy results.
Table 11.3 Comparison of the Reg-DNN with existing systems.
Chapter 12
Table 12.1 Comparison of filtering-based techniques for hair removal in dermos...
Table 12.2 Comparison of morphological techniques for hair removal in dermosco...
Table 12.3 An overview of edge-based approaches for hair removal in dermoscopy...
Table 12.4 Comparison of supervised techniques for hair removal in dermoscopy.
Table 12.5 An overview of various hair removal techniques in dermoscopy.
Chapter 13
Table 13.1 Summarizing literature work.
Table 13.2 Comparison of functionality and mining processes in different conse...
Chapter 14
Table 14.1 Dataset attributes.
Table 14.2 Mean/mode/median values for various data with data splitting of 80...
Table 14.3 Mean/mode/median values for various data with data splitting of 70:...
Table 14.4 Mean/mode/median values for various data with data splitting of 50:...
Chapter 16
Table 16.1 Symbolisations employed in the work.
Table 16.2 Procedure to make hash key.
Table 16.3 Parameters used in the work.
Chapter 17
Table 17.1 Industry 5.0 challenges and future directions.
Chapter 18
Table 18.1 Blockchain-internet of thing: merits and security issues.
Table 18.2 Blockchain-enabled IoV.
Chapter 20
Table 20.1 Example database.
Table 20.2 Sample database with sensitive and non-sensitive data.
Table 20.3 Table showing total result.
Table 20.4 Table showing counts.
Chapter 1
Figure 1.5.1 Signs and symptoms of Klinefelter’s syndrome [30].
Figure 1.5.2 Classification of polycystic ovarian syndrome (PCOS) based differ...
Chapter 2
Figure 2.1 Insight into DNA: gene and protein.
Figure 2.2 Major application arenas of genomics.
Figure 2.3 Process of gene therapy to cure the diseases.
Figure 2.4 Mapping of Human Gene Project (HGP).
Chapter 3
Figure 3.1 Internet of medical things in medical engineering.
Figure 3.2 Application of blockchain in healthcare.
Figure 3.3 Telemedicine applications.
Figure 3.4 Various AI technologies in healthcare.
Chapter 4
Figure 4.1 Graphical representation of the human genome.
Figure 4.2 A conceptual overview of the usages of genomics in healthcare.
Figure 4.3 A conceptual overview of the replication process.
Figure 4.4 A brief overview of the development of genomics.
Figure 4.5 Graphical representation of gene data using Homo sapiens genome.
Figure 4.6 A conceptual overview of the types of genomics.
Figure 4.7 Structural genomics overview.
Figure 4.8 A conceptual overview of function genomics.
Figure 4.9 A conceptual overview of metagenomics.
Figure 4.10 A conceptual overview of epigenomics.
Figure 4.11 Primary application areas of genomics.
Figure 4.12 Essential usage of biotechnology.
Figure 4.13 Using genomics for social science application.
Figure 4.14 A conceptual overview of incorporating genomic information into Me...
Figure 4.15 Application areas of DNA computing.
Figure 4.16 Nextstrain SARS-CoV-2 biological connections.
Figure 4.17 Nextstrain SARS-CoV-2 biological clade variants.
Figure 4.18 Relationship of artificial intelligence with genomics and genetics...
Chapter 5
Figure 5.1 Single nucleotide polymorphism.
Chapter 6
Figure 6.1 Flow of CNN analytics.
Figure 6.2 Genomic data analysis with CNN.
Figure 6.3 Multivariate splits of DNA pair sequencing.
Chapter 7
Figure 7.1 Number of rainy days in each month.
Figure 7.2 Amount of precipitation in a month.
Figure 7.3 Average temperature in a month.
Figure 7.4 Percentage of humidity per month.
Figure 7.5 Number of malaria occurrences per month.
Chapter 8
Figure 8.1 Applications of blockchain technology.
Figure 8.2 Architecture of blockchain technology.
Figure 8.3 (a) Data tampering attack. (b) Server and client-based blockchain t...
Figure 8.4 Initial screen of the proposed model.
Figure 8.5 Setting up path.
Figure 8.6 Hash code generated for human genome data.
Figure 8.7 Initializing 5G server.
Figure 8.8 Setting up of input data.
Figure 8.9 Final results.
Chapter 9
Figure 9.1 Problem in medical data sharing.
Figure 9.2 Secure weight sharing.
Figure 9.3 Proposed model.
Figure 9.4 Parameter server logic.
Figure 9.5 Synchronous data parallel model.
Chapter 10
Figure 10.1 Proposal methodology for our work.
Figure 10.2 Description of Modbus format.
Figure 10.3 LSTM detection model.
Figure 10.4 Performance analysis of web attack detection.
Chapter 11
Figure 11.1 DNN neuron architecture.
Figure 11.2 Proposed workflow.
Figure 11.3 Regularized capacity. https://medium.com/analytics-vidhya/regulari...
Figure 11.4 Implementation results for loss, dropout, loss.
Figure 11.5 GUI interface for heart disease prediction.
Chapter 12
Figure 12.1 (a) Original image, (b) corresponding grey image, (c) output image...
Figure 12.2 (a) Original image, (b) output images after hair removal technique...
Chapter 13
Figure 13.1 Financial transactions stored in blockchain.
Figure 13.2 Difference between centralized and decentralized application.
Figure 13.3 Application of smart contract in real estate dealing.
Figure 13.4 Application of DLT and smart contract in real estate.
Figure 13.5 Application of distributed ledger and smart contract in supply cha...
Figure 13.6 Major division of O&G Industry.
Figure 13.7 Use case of blockchain in O&G Upstream Industry.
Figure 13.8 Percentage of total expenditure spend over blockchain by each sect...
Chapter 14
Figure 14.1 Female reproductive system.
Figure 14.2 Architectural structure of the MLP.
Figure 14.3 Pseudocode for MLP implementation.
Figure 14.4 Architectural structure of the backpropagation.
Figure 14.5 Pseudocode for Backpropagation algorithm implementation.
Figure 14.6 Architectural structure of RNN.
Figure 14.7 Pseudocode for RNN implementation.
Figure 14.8 Target values analysis pie chart.
Figure 14.9 Scatter matrix and area plot of the targets.
Figure 14.10 Histogram of the age attribute.
Figure 14.11 Confusion matrix.
Figure 14.12 Model accuracy for biopsy BPN.
Figure 14.13 Confusion matrix for Hinselmann BPN.
Figure 14.14 Model accuracy for Hinselmann BPN.
Figure 14.15 Confusion matrix for citology BPN.
Figure 14.16 Model loss for citology BPN.
Figure 14.17 Confusion matrix for Schiller BPN.
Figure 14.18 Model loss for Schiller BPN.
Figure 14.19 Model accuracy for Hinselmann MLP.
Figure 14.20 Confusion matrix for Hinselmann MLP.
Figure 14.21 Accuracy graph for citology MLP.
Figure 14.22 Confusion matrix for citology results MLP.
Figure 14.23 Model accuracy for biopsy MLP.
Figure 14.24 Confusion matrix for biopsy results MLP.
Figure 14.25 Model accuracy for Schiller MLP.
Figure 14.26 Confusion matrix for Schiller results MLP.
Figure 14.27 Model accuracy for Hinselmann RNN.
Figure 14.28 Confusion matrix for Hinselmann RNN results.
Figure 14.29 Model accuracy for citology RNN.
Figure 14.30 Confusion matrix for citology RNN results.
Figure 14.31 Model accuracy for biopsy RNN.
Figure 14.32 Confusion matrix for biopsy RNN results.
Figure 14.33 Model accuracy for Schiller RNN.
Figure 14.34 Confusion matrix for the RNN Schiller results.
Chapter 15
Figure 15.1 Implementation block diagram.
Figure 15.2 Model of the smart walking stick.
Figure 15.3 System architecture of the software application.
Chapter 16
Figure 16.1 Summary of the genomic-data-based well-being administration [6].
Figure 16.2 E-health framework in stockpile [17].
Figure 16.3 IoHT architecture classification [18].
Figure 16.4 Overlay network [20].
Figure 16.5 System architecture [25].
Figure 16.6 Biomedical security system with blockchain [26].
Figure 16.7 Overview of a patient’s body with IoT medical sensors [27].
Figure 16.8 Framework of suggested system [28].
Figure 16.9 Smart healthcare system framework [29].
Figure 16.10 Merkle tree authentication for data [32].
Figure 16.11 System model [7].
Figure 16.12 Security analysis.
Chapter 17
Figure 17.1 Key-enabling technologies of Industry 5.0.
Figure 17.2 Evaluation history of Industry 5.0.
Figure 17.3 Challenges and future directions toward Industry 5.0.
Chapter 18
Figure 18.1 Artificial intelligence (AI) framework.
Figure 18.2 Internet of Things (IoT) components.
Figure 18.3 Machine learning adoption in blockchain.
Figure 18.4 Internet of Things and Artificial Intelligence as together.
Figure 18.5 Future envision on battlefield using artificial intelligence and m...
Chapter 19
Figure 19.1 Blockchain validations on artificial intelligence (AI) for retail.
Scrivener Publishing, 100 Cummings Center, Suite 541J, Beverly, MA 01915-6106
Publishers at Scrivener: Martin Scrivener and Phillip Carmical
Edited by
Amit Kumar Tyagi
National Institute of Fashion Technology, New Delhi, India
This edition first published 2024 by John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA and Scrivener Publishing LLC, 100 Cummings Center, Suite 541J, Beverly, MA 01915, USA. © 2024 Scrivener Publishing LLC. For more information about Scrivener publications please visit www.scrivenerpublishing.com.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.
Wiley Global Headquarters: 111 River Street, Hoboken, NJ 07030, USA
For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.
Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials, or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read.
Library of Congress Cataloging-in-Publication Data
ISBN 978-1-394-21262-0
Cover image: Pixabay.Com
Cover design by Russell Richardson
Data science is a broad field encompassing some of the fastest-growing subjects in interdisciplinary statistics, mathematics, and computer science. It encompasses a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision making. Data analysis has multiple facets and approaches, including diverse techniques under a variety of names, in different business, science, and social science domains. Similarly, data analytics is now required in the medical field for analyzing genomic and genetic data.
Genomics is a branch of genetics; the term was coined by Tom Roderick in 1986. Genetics is the study of single genes, whereas genomics refers to the study of an organism's complete set of genes, its genome. A genome can be considered an instruction manual for human life. Originally, the analysis of genomic data was very costly. However, due to advancements in technology, the cost of sequencing has come down significantly, so that genomic analysis can even be included in daily medical routines. The more we explore our genomes, the easier it will be to make medical decisions and cure diseases.
Genomic data includes not only personal information but also data about a person's family and ancestors. Any leakage of this type of information could cause very serious issues, so data protection is critical. Privacy laws such as GINA (Genetic Information Nondiscrimination Act), HIPAA (Health Insurance Portability and Accountability Act of 1996), and GDPR (General Data Protection Regulation) help users protect their privacy by restricting the sharing of patients' sensitive information. Even so, privacy issues demand continued attention in an era of such rapid development in the healthcare sector. The main categories of privacy in healthcare include data, location, identity, and genomic privacy. Existing tools are insufficient to handle genomic data because of the large size of the datasets.
This book focuses on genomic data sources, analytical tools, and the importance of privacy preservation. Topics discussed include TensorFlow and Bio-Weka, privacy laws, HIPAA, and other emerging technologies like the Internet of Things, IoT-based cloud environments, cloud computing, edge computing, and blockchain technology for smart applications. The book starts with a basic introduction to genomes, genomics, genetics, transcriptomes, proteomes, and other basic concepts of modern molecular biology. It concludes with future predictions for genomics and genomic privacy, emerging technologies, and applications.
Amit Kumar Tyagi
First, we extend our gratitude to our family members, friends, and supervisors who stood by us as advisors during the completion of this book. Also, we thank our almighty God who inspired us to write this book. Furthermore, we thank Wiley and Scrivener Publishing, who have provided continuous support; and our colleagues with whom we have worked inside the college and university system, as well as those outside of academia who have provided their endless support toward completing this book.
Finally, we wish to thank our Respected Madam, Prof. G Aghila, Prof. Siva Sathya, our Respected Sir Prof. N Sreenath, and Prof. Aswani Kumar Cherukuri, for their valuable input and help in completing this book.
Amit Kumar Tyagi
Mahreen Fatima1, Sana Zia2, Maheen Murtaza3, Asyia Shafique4, Afshan Muneer3, Junaid Sattar5, Muhammad Ashir Nabeel6 and Amjad Islam Aqib7*
1Faculty of Biosciences, Cholistan University of Veterinary and Animal Sciences, Bahawalpur, Pakistan
2Department of Zoology, Government Sadiq College Women University, Bahawalpur, Pakistan
3Department of Zoology, Cholistan University of Veterinary and Animal Sciences, Bahawalpur, Pakistan
4Department of Clinical Medicine and Surgery, University of Agriculture, Faisalabad, Pakistan
5Faculty of Veterinary Sciences, Cholistan University of Veterinary and Animal Sciences, Bahawalpur, Pakistan
6Animal Sciences, University of Illinois Urbana-Champaign, Urbana, United States of America
7Department of Medicine, Cholistan University of Veterinary and Animal Sciences, Bahawalpur, Pakistan
Abstract
Genomic research is a relatively new field in biotechnology, with DNA sequencing as its essential technology. It is progressing quickly because advanced technologies are now accessible, enabling genome-wide sequencing to address biological questions. During the last decade, genomic studies have evolved into potential tools for understanding the genetics of human disease. After the Human Genome Project, it became essential to sequence the 3 billion letters of the genome in a cost-effective manner. By producing large amounts of sequencing data at low cost, this breakthrough enabled a wide variety of biomedical applications to emerge after the completion of the project. To aid interpretation of the human genome, these technological advances have also enabled the sequencing of various vertebrate genomes. In addition to allowing the study of vertebrate genome evolution, this sequencing will benefit human medicine and comparative genomics. The focus of this chapter is to introduce and review the basic aspects of genomics, as well as its role in the pharmaceutical industry.
Keywords: Genomics, genetics, technology, biomedical application, pharmaceutical industry
There are slightly more than 20,000 human protein-coding genes, but each of them typically codes for several proteins thanks to mechanisms such as alternative splicing, transcripts from differing strands, and others. According to some evidence, there may be up to five distinct transcripts per gene. Only about two percent of the human genome's DNA contains actual protein-coding sequences. It is now generally accepted that tens of thousands of genomic regions encode "noncoding RNA transcripts." These RNAs play a role in regulating messenger RNA (mRNA) translation and gene expression [1]. Given how the chromatin state affects gene expression, it is clear that epigenomic changes to histone chemistry and DNA methylation levels can have a significant impact on transcription. This, too, is a cutting-edge field of cellular biology in which the potential role of activity and inactivity still needs to be widely explored. If genomics is the study of the properties of the genome, genetics is the study of how traits or phenotypes are passed down through generations [2]. The identification of genetic differences related to neuropsychiatric disorders and treatment outcomes has increased confidence that these findings will soon be applicable in the clinic to improve diagnosis, disease risk prediction, and patient response to drug therapy. DNA segments can be sequenced using the shotgun, clone-by-clone, whole-genome, Maxam-Gilbert, and Sanger sequencing methods. Sanger chemistry, the "original" sequencing method, reads through a DNA template formed during DNA synthesis using specially labeled nucleotides [3]. Thanks to a number of practical developments, the Sanger method can read 1000 to 1200 base pairs (bp), but it is still limited to about 2 kilobase pairs (kb) [4].
This chapter focuses on the hub of genomics, genome sequencing methods, variation of genome sequencing, diseases and disorders, and future prospects.
There are slightly more than 20,000 human protein-coding genes. Still, each of them typically codes for several proteins thanks to mechanisms such as alternative splicing, transcripts from differing strands, and others. According to some evidence, there may be up to five distinct transcripts per gene. Only about two percent of the human genome's DNA contains actual protein-coding sequences. It is now generally accepted that tens of thousands of genomic regions encode "noncoding RNA transcripts." These RNAs play a role in regulating messenger RNA (mRNA) translation and gene expression [5]. Given how the chromatin state affects gene expression, it is clear that epigenomic changes to histone chemistry and DNA methylation levels can significantly affect transcription. This, too, is a cutting-edge field of cellular biology in which the potential role of activity and inactivity still needs to be widely explored. If genomics is the study of the properties of the genome, genetics is the study of how traits or phenotypes are passed down through generations [6]. Identifying genetic differences related to neuropsychiatric disorders and treatment outcomes has increased confidence that these findings will soon be applicable in the clinic to improve diagnosis, disease risk prediction, and patient response to drug therapy. DNA segments can be sequenced using the shotgun, clone-by-clone, whole-genome, Maxam-Gilbert, and Sanger sequencing methods. Sanger chemistry, the "original" sequencing method, reads through a DNA template formed during DNA synthesis using specially labeled nucleotides [7]. Thanks to some practical developments, the Sanger method can read 1000 to 1200 base pairs (bp), but it is still limited to about 2 kb [8]. The chapter aims to review the hub of genomics, genome sequencing methods, variations of genome sequencing, diseases and disorders, and future prospects.
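To give a feel for the scale behind these figures, the following back-of-the-envelope sketch works out how many bases the protein-coding two percent represents and how many reads of roughly 1000 bp an idealized single pass over the genome would need. The genome size of 3 billion bases is taken from the chapter abstract; the function names and the `coverage` parameter are illustrative assumptions, and real projects need many-fold coverage with overlapping reads.

```python
import math

GENOME_BP = 3_000_000_000   # approximate genome size ("3 billion letters" in the abstract)
CODING_FRACTION = 0.02      # about two percent of the DNA is protein-coding
READ_BP = 1_000             # lower end of the 1000-1200 bp Sanger read length


def coding_bases(genome_bp=GENOME_BP, fraction=CODING_FRACTION):
    """Number of bases that fall in protein-coding sequence."""
    return round(genome_bp * fraction)


def reads_for_coverage(genome_bp=GENOME_BP, read_bp=READ_BP, coverage=1.0):
    """Idealized read count: perfectly tiled reads, no overlap, no errors.

    `coverage` is a hypothetical knob: 1.0 means every base is read once.
    """
    return math.ceil(genome_bp * coverage / read_bp)


if __name__ == "__main__":
    print(f"protein-coding bases: {coding_bases():,}")          # 60,000,000
    print(f"reads for 1x coverage: {reads_for_coverage():,}")   # 3,000,000
```

Even under these idealized assumptions, a single pass takes millions of reads, which is why the cost per read has mattered so much for making genome analysis routine.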
A. Phenotype
Organism-level phenotypes such as diseases are accompanied by underlying molecular phenotypes, that is, molecular variations such as the over- or under-expression of specific genes [9]. One of the first steps is therefore to identify the molecular phenotypes that go along with them. Over the past 10 years, gene expression has emerged as a molecular-level trait that is used as a molecular phenotype to classify diseases, identify drug targets, and infer gene-gene interactions [10].
The results are typically more consistent and simpler when changes in gene expression are examined together, as modules, under various conditions. Furthermore, it can be difficult to identify a gene's function outside its context when that function is obscure [11]. The increased statistical power of a module-based method also makes it possible to identify a perturbed module even when the individual genes within it have not undergone a statistically significant perturbation [12].
B. Genotype
The preceding section covered phenotypic modules, whose changes in expression are linked with changes in phenotype. This section focuses on the genetic basis of disease and the networks that it defines. According to recent studies, genomic changes vary greatly in complex diseases such as cancer and neurological disorders. In theory, the altered or mutant genes may be part of the same pathways and collectively dysregulate those pathways. For example, O’Roak et al. [30] found that 39% (49 of 126) of the most severe or disruptive de novo mutations in sporadic autism map to a highly interconnected network of β-catenin and chromatin-remodeling proteins [33]. These strategies focus on identifying genotypic modules, or subnetworks, that are enriched with genes carrying genetic changes linked to disease. This approach is common in research on somatic mutations in cancer, which are the primary causes of the disease [13].
C. Gene Environmental Interaction
In the broadest and perhaps most general sense, the environment may facilitate (or limit) opportunities for the expression of genotype-influenced behavioral tendencies. Some of the previously mentioned examples of latent-variable gene-environment (G-E) interactions imply this. There, the influence of religion, community characteristics (such as urban or rural setting), peer substance use, or parental rules on smoking or drinking varied. Furthermore, it has been recognized that some of these environmental factors can reduce the behavioral effects of specific polymorphisms [14]. Other environmental factors that play a vital role in G-E interactions include traumatic events such as child abuse, which cause intense emotional and physical responses. Recent data suggest that these responses affect the biological pathways that interact with genetic effects, possibly even changing how genes are expressed. Depending on specific genomic variations, these effects can last for most or all of a person’s lifespan, or they can be transient and tied to environmental exposure [15].
D. Epigenetics
Epigenetic changes do not involve changes in the DNA sequence. Nevertheless, they are regularly passed from a cell to its daughter cells during mitosis. It has also been documented that some epigenetic changes can be passed down the germ line from one generation to the next. It is widely accepted that exposure to nutrients, cellular insults, stress, and other factors can lead to epigenetic alterations at any age [16].
As previously stated, a chromosome’s double-stranded DNA is packed very tightly. Each nucleosome is connected to the next by a stretch of linker DNA that is bound by the linker histone, H1. The histones of nucleosomes can undergo a variety of chemical modifications, such as acetylation, methylation, and phosphorylation. The patterns of histone modifications can affect how chromatin is arranged, potentially affecting transcriptional activity. DNA itself can also be modified by methylation of cytosine bases at the 5′ position [15].
1.3 Genome Sequencing Methods
A. Clone by Clone
In this technique, smaller pieces of the genome are copied and inserted into bacteria. Each piece, about 150,000 base pairs long, can be produced in identical copies, or “clones,” by growing the bacteria. The inserted DNA in each clone is then further fragmented into overlapping portions of about 500 base pairs. These smaller inserts are sequenced. After sequencing, the overlapping sections are used to put the clone back together. This method, combined with Sanger sequencing, was used to sequence the first human genome. It works, but it takes a lot of time and money [17].
B. Maxam-Gilbert Sequencing
In the Maxam-Gilbert method, sequencing is carried out by subcloning. The five consecutive steps (base-specific modification, stopping the reactions, ethanol precipitation and centrifugation in sequence, piperidine cleavage, and repeated reduced-pressure evaporation) are exceptionally laborious, though. The possibility of losing the DNA pellet during ethanol precipitation and centrifugation is an additional issue. Moreover, during ethanol precipitation, hydrazine might co-precipitate with the DNA and affect the specificity of the reactions [18].
Advantages and Disadvantages
The DNA template used in the Maxam-Gilbert sequencing method can be single-stranded or double-stranded, which is its chief advantage. At one point, the Maxam-Gilbert approach was preferred over the Sanger method because the latter required cloning single-stranded DNA for each read start. Moreover, DNA-protein interactions and epigenetic DNA modifications can be studied using the Maxam-Gilbert approach. Maxam-Gilbert sequencing was limited primarily by its use of hazardous substances and methods such as radiolabeling and X-rays. The method fell out of favor because it required hydrazine, a known neurotoxin, and was hard to scale up and manage [1].
C. Sanger Sequencing Method
The Sanger technique for DNA sequencing has found major application in the field of veterinary diagnostics.
Traditional Sanger sequencing continues to be the most popular sequencing approach used in veterinary diagnostic laboratories (VDLs) for sequence verification, assay monitoring, and as the foundation for many phylogenetic analyses. It also serves as the basis for newer, automated procedures. In this method, an oligonucleotide primer is annealed to the template, and the complementary DNA strand is then extended by the DNA polymerase enzyme using a mixture of the four deoxynucleotide triphosphates (dNTPs: dATP, dGTP, dCTP, and dTTP) and chain-terminating dideoxynucleotide triphosphates (ddNTPs). When limiting amounts of the ddNTPs are incorporated, elongation terminates, producing distinct DNA fragments of different lengths. Sanger sequencing technology is widely used and can currently produce DNA sequences of up to 800 to 1,000 bp. The method’s most significant disadvantages typically include poor-quality sequence in the first 15 to 40 bp, caused by primer binding, and the inability to detect single base pair variations in longer segments. Given these limitations, both commercial and open-source sequence analysis software is emerging to assist users in automatically identifying and removing low-quality data. The capillary gel electrophoresis method used in Sanger sequencing is the focus of the VDL guidelines described here [19].
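The chain-termination principle described above can be illustrated with a small simulation. This is only a minimal sketch under simplifying assumptions: the function names, the `ddntp_fraction` parameter, and the toy template are hypothetical, and real base calling works on fluorescence traces from capillary electrophoresis rather than on exact strings.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template, ddntp_fraction=0.2, molecules=5000, seed=0):
    """Simulate chain-terminated extension products for one template.

    ddntp_fraction is an assumed per-base probability that a
    chain-terminating ddNTP is incorporated instead of a normal dNTP.
    Returns the set of distinct terminated fragments.
    """
    rng = random.Random(seed)
    # The newly synthesized strand is the reverse complement of the template.
    product = "".join(COMPLEMENT[base] for base in reversed(template))
    fragments = set()
    for _ in range(molecules):
        for i in range(len(product)):
            if rng.random() < ddntp_fraction:
                # Extension stops here; the fragment ends in a labeled ddNTP.
                fragments.add(product[: i + 1])
                break
        # Molecules that never incorporate a ddNTP carry no label and are ignored.
    return fragments


def read_sequence(fragments):
    """Sort fragments by length and read the terminal base of each one."""
    last_base_by_length = {len(f): f[-1] for f in fragments}
    return "".join(last_base_by_length[n] for n in sorted(last_base_by_length))


fragments = sanger_fragments("ATGCGTAC")
print(read_sequence(fragments))  # sequence of the synthesized strand
```

Sorting the fragments by length plays the role of electrophoresis here: each length contributes exactly one base, which is why the read is the sequence of the synthesized strand rather than of the template itself.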
Advantages and Disadvantages
With the aid of Sanger sequencing, researchers were able to find mutations and the underlying causes of genetic illnesses. It is the most effective technique for locating short tandem repeats and for sequencing a single gene. However, the method’s major drawback is the length of time it takes, which results in low throughput. Only short DNA sequences (between 300 and 1,000 base pairs) can be processed with this method, and only one at a time [19].
Principle of Genome Sequencing and Assembly
Currently, a shotgun sequencing approach is used for the majority of genome projects. Genomic DNA is first cut into many small, random fragments. These are independently sequenced to a specific length, depending on the technology. The resulting sequence reads are then put back together into longer uninterrupted sequence stretches (contigs) using powerful computer algorithms, a procedure known as de novo assembly. High sequencing coverage (or read depth) is necessary to ensure that the sequence reads at each position in the genome overlap sufficiently for proper assembly. Naturally, more overlap can be expected for longer sequence reads, which lowers the necessary raw read depth. Longer fragments (a few hundred base pairs) are typically sequenced from both ends (paired-end sequencing) to provide more information on where the reads should be placed in the assembled sequence [20]. When the two reads of a pair fall in different contigs, those contigs can be linked into a scaffold. The library’s expected fragment length tells us how far apart the two contigs are physically, and the gap between them is filled with the placeholder base character “N.” The missing base-pair information is filled in by later gap-closing techniques, ideally using long reads that span repetitive sequences. The final step frequently involves joining the scaffolds into linkage groups or placing them on chromosomes. The best method for ordering and orienting scaffolds into longer sequence blocks is to create genetic maps from crosses or pedigree data [21].
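The idea of merging overlapping reads into contigs can be sketched with a toy greedy assembler. This is an illustrative simplification, not the algorithm used by any particular assembler: the function names, the `min_overlap` threshold, and the three example reads are assumptions, and real assemblers must also handle sequencing errors, reverse-complement reads, and repeats.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the two sequences that share the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_overlap)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


def coverage(reads, genome_length):
    """Average read depth: total sequenced bases divided by genome length."""
    return sum(len(r) for r in reads) / genome_length


reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(reads))
print(round(coverage(reads, 15), 2))
```

With these three reads every neighboring pair shares five bases, so the greedy merges recover a single 15-base contig. With lower coverage some neighbors would no longer overlap and the result would stay split into several contigs, which is the situation that scaffolding with paired-end reads is meant to address.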
1.4 Variation of Genome Sequencing
A. Single Nucleotide Polymorphism
A single nucleotide polymorphism (SNP) is a difference at a specific nucleotide site [22]. The DNA molecules in a population may carry different nucleotide pairs at the same site, such as a T-A base pair in some molecules and a C-G base pair in others. Such differences are referred to as SNPs. There are two “alleles” associated with the SNP, for which the population may have three genotypes: homozygous with T-A at the site on both chromosomes, homozygous with C-G on both chromosomes, or heterozygous, with T-A on one chromosome and C-G on the homologous chromosome [23]. The word alleles is enclosed in quotation marks because SNPs do not need to be in coding sequences or even in genes [24]. Genetic variants found in less than 1% of the DNA molecules in a population are excluded from the SNP definition, which requires that molecules differ at the nucleotide site frequently. A genetic variant that occurs too rarely in a population is not as useful for genetic analysis as a variant that occurs more frequently [25].
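The three genotypes and the 1% frequency threshold can be made concrete with a short calculation. The sketch below is illustrative only: the function names, the two-letter genotype encoding (one letter per chromosome), and the example population are assumptions, not data from the studies cited here.

```python
from collections import Counter


def possible_genotypes(allele_a, allele_b):
    """The three genotypes for a site with two alleles."""
    return [allele_a + allele_a, allele_a + allele_b, allele_b + allele_b]


def allele_frequencies(genotypes):
    """Fraction of chromosomes carrying each allele.

    Each genotype is a two-character string such as 'TT', 'TC', or 'CC'.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def qualifies_as_snp(genotypes, min_frequency=0.01):
    """True if the second most common allele reaches the frequency threshold."""
    ranked = sorted(allele_frequencies(genotypes).values(), reverse=True)
    return len(ranked) >= 2 and ranked[1] >= min_frequency


# Hypothetical sample of 100 individuals (200 chromosomes).
population = ["TT"] * 90 + ["TC"] * 8 + ["CC"] * 2
print(possible_genotypes("T", "C"))
print(allele_frequencies(population))
print(qualifies_as_snp(population))
```

In this made-up sample the C allele sits on 12 of 200 chromosomes (6%), so the site passes the 1% threshold. A variant seen on only one chromosome out of 400 would not.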
B. INDELs (Insertion and Deletion)
SNPs have been studied far more extensively than other forms of genetic variation in humans [36]. The discovery of INDELs has lagged far behind the discovery of SNPs, and only a small number of them have been found. INDELs can be classified into five major categories: (1) insertions and deletions of a single base pair, (2) monomeric expansions of two to five base pairs, (3) multiple expansions of a repeat unit of between two and fifteen bases, (4) transposon insertions, and (5) INDELs with random DNA sequences [23].
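The basic distinction between a SNP and an INDEL can be sketched by comparing the lengths of the reference and alternate alleles. This is a simplified illustration that assumes a VCF-style representation of each variant as a pair of allele strings; the function name and labels are hypothetical, and the sketch does not attempt the five-way INDEL classification listed above, which would require inspecting the surrounding sequence.

```python
def classify_variant(ref, alt):
    """Label a variant from its reference and alternate allele strings."""
    if len(ref) == len(alt):
        if len(ref) == 1 and ref != alt:
            return "SNP"
        return "other"  # identical alleles or a multi-base substitution
    size = abs(len(alt) - len(ref))
    kind = "insertion" if len(alt) > len(ref) else "deletion"
    scale = "single-base" if size == 1 else f"{size}-base"
    return f"{scale} {kind}"


for ref, alt in [("A", "G"), ("A", "AT"), ("ATCG", "A")]:
    print(ref, alt, classify_variant(ref, alt))
```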
C. Copy Number Variation
Genetic association studies often evaluate SNPs, which are differences between individuals of the same species at a particular genomic location. Although copy number variations (CNVs) in DNA sequences are common in natural populations and have functional significance, that significance is not yet fully understood. Recombination and replication processes, as well as a higher rate of de novo locus-specific mutation than for SNPs, all contribute to the production of CNVs. Hastings et al. [13] described the mechanisms of change, involving deletions and duplications of chromosomal segments, that result in CNV evolution in humans.
1.5 Diseases and Disorders
A. Multiple Sclerosis
Multiple sclerosis (MS) is a non-traumatic disabling disease that most commonly affects young adults. The incidence and prevalence of MS are increasing in both developed and developing countries, with no clear cause identified. Ascherio [26] describes MS as a complex disease caused by many genes as well as several well-described environmental factors, including vitamin D exposure. MS has long been regarded as a T-cell autoimmune disease; in recent years, however, B-cell targeted therapies have challenged this conventional dogma. Traditionally, MS has been considered a two-stage disease involving relapsing–remitting disease and secondary or primary progressive disease caused by delayed neurodegeneration [27]. Myelin provides neural insulation and saltatory conduction, and its loss leads to neurodegeneration and conduction failure. Several genes have been identified as responsible for Charcot-Marie-Tooth disease (CMT), which is a form of inherited peripheral neuropathy. Part of the age-related progression of MS can be explained by comorbid factors, such as smoking and vascular disease. Comorbid disease makes patients more likely to progress rapidly [28].
B. Klinefelter Syndrome
Klinefelter syndrome (KS) is a common sex chromosome disorder and genetic abnormality [29]. The traditional phenotype of the syndrome, defined as tall stature, small testes, gynecomastia, gynoid hips, sparse body hair, primary hypogonadism, and intellectual disability, has been shown to be rarely observed in the clinical setting (see Figure 1.5.1).
Figure 1.5.1 Signs and symptoms of Klinefelter’s syndrome [30].
The final KS phenotype may depend in part on the severity of the genetic defect. Phenotypes vary widely and include hypergonadotropic hypogonadism, infertility, neurocognitive deficits, and severe comorbidities [30].
C. Down Syndrome
Around 1 in every 600 to 700 live births has Down syndrome (DS), or trisomy 21, which causes intellectual disability. DS is caused by a complete or partial trisomy of the autosomal chromosome 21. Life expectancy in the DS population has increased significantly over the past several decades. Compared with other populations, however, individuals with DS still experience higher mortality rates. Characteristics present in all individuals with DS include craniofacial abnormalities and hypotonia in early infancy, and numerous other features are also conserved [31]. A baby with DS may have a widely spaced big toe, an irregular fingerprint pattern, and short fingers. Moreover, Robertsonian translocations and isochromosomes or ring chromosomes can cause the condition. An isochromosome arises when, during the development of an egg or sperm, the two long arms of the chromosome separate together rather than the long arm and the short arm separating together. Mosaicism results from an error in cell division that happens at some point after fertilization. In mosaic DS, two lineages of cells contribute to tissues and organs: one lineage has the normal number of chromosomes, while the other has an extra chromosome 21 [32].
D. Polycystic Ovarian Syndrome
Polycystic ovary syndrome has an extremely high incidence and prevalence. It affects women in the early to late reproductive stages worldwide, but its prevalence varies widely according to race and ethnicity. For example, South Asians are more likely to suffer from it than Caucasians. Asian women have a higher incidence of polycystic ovarian syndrome (PCOS) than Western Caucasian women (52% versus 20%–25%). As reported by the World Health Organization (WHO), PCOS affected 116 million women globally in 2012 (4%–12%) and was predicted to reach 26% by 2020 [33]. Rather than being a disease, PCOS is a disorder characterized by enlarged ovaries containing more than ten cysts. These cysts are thought to be the remnants of follicles that have not matured. As the disorder develops, the ovarian wall thickens and prevents mature follicles from being released. Anovulation, infertility, and disturbances in the menstrual cycle are all considered symptoms of PCOS [34].
PCOS can be classified into four types (see Figure 1.5.2): (1) Insulin-resistant PCOS: high insulin levels are the main cause of PCOS. (2) Adrenal PCOS: this condition is caused by the stimulation of adrenal secretions during puberty; patients with this disorder usually experience more stress due to excess levels of DHEAS. (3) Inflammatory PCOS: patients tend to have low-grade chronic inflammation. (4) Post-pill PCOS: this arises, for example, as a result of hormonal imbalances and contraceptive pills. Presently, PCOS can be managed with allopathic therapy, herbal therapy, lifestyle changes, and dietary changes [35].
Figure 1.5.2 Classification of polycystic ovarian syndrome (PCOS) based on different factors [35].
1.6 Future Prospects
One reference genome cannot accurately reflect the diversity of human genetics, given the complexity of genetic variation. Normal genomes from diverse human populations, but especially those of African origin, who have the most genetic variety, must be sequenced and assembled [36].
Several new reference genomes are being produced with the help of long-read DNA sequencing in combination with short-read error correction. At the current discovery rate, sequencing 300 human genomes in this manner will double the number of structural variants identified at the DNA sequence level and will, in theory, identify the majority of common structural variants [37]. In the millions of Illumina genomes that have already been generated, sequence resolution of structural variation allows better genotyping of such alleles and thus the discovery of new associations [38].
A number of pathogenic variants are located outside of coding regions, where they can have a negative effect on gene expression and translation. Although exome sequencing is less expensive and allows greater sample sizes and power, it provides little information on regulatory changes and hampers the discovery of small structural variants even within coding sequences [39]. It is challenging to identify harmful variants linked with noncoding mutations. It may, however, be possible to understand noncoding regulatory mutations and their contribution to common and rare genetic diseases if structural variants are detected systematically, as they are more likely than SNVs to be dysfunctional and to alter gene expression [40]. Whole-genome sequencing is crucial for detecting structural variations. Fully phased long-read genome sequences are expected to yield 2.8 times as many structural variations as Illumina whole-genome sequencing, plus a 30% boost over long-read callers that do not phase. In other words, rather than a 3-Gbp genome, we should consider a 6-Gbp genome, where both parental haplotypes are perfectly sequenced and assembled [41].
It is essential to understand the genetic basis of de novo mutations, DNA sequence conservation between species, and selection at a given locus before one can interpret variants. Methods that use these factors have been successful in identifying potentially pathogenic variants [42]. They require, however, that variation both within and between species be equally well described. Non-human primate, mammalian, and vertebrate genomes must be sequenced with the same rigor as human genomes because orthologous DNA sequences have historically been poorly aligned and have mutation rates that differ by orders of magnitude [43].
Conventional human genome analysis using short-read sequencing data captures only around 85% of the genome and ignores some of the most variable regions, excluding them from association testing [44, 45]. The goal is simple to state: to describe all human chromosomes from telomere to telomere, including acrocentric, telomeric, centromeric, and segmentally duplicated DNA, using sequencing platforms for long reads and ultralong reads [46].
1.7 Conclusion
This chapter gives a detailed review of genomics and genetics. It describes the relevance of genomics and genetics to the diagnosis and treatment of disease. In the future, genetic information may allow a disease to be diagnosed at a very early stage and also to be cured. Researchers need to become active and skilled in interpreting genetic information that could be used in medicine and public health. Researchers, academic professionals, and organizations should work together to conduct the necessary research.
References
1. Jain, M., Olsen, H.E., Turner, D.J., Stoddart, D., Bulazel, K.V., Paten, B., Miga, K.H., Linear assembly of a human centromere on the Y chromosome. Nat. Biotechnol., 36, 4, 321–323, 2018.
2. Vollger, M.R., Dishuck, P.C., Sorensen, M., Welch, A.E., Dang, V., Dougherty, M.L., Eichler, E.E., Long-read sequence and assembly of segmental duplications. Nat. Methods, 16, 1, 88–94, 2019.
3. Kronenberg, Z.N., Fiddes, I.T., Gordon, D., Murali, S., Cantsilieris, S., Meyerson, O.S., Eichler, E.E., High-resolution comparative analysis of great ape genomes. Science, 360, 6393, eaar6343, 2018.
4. Kircher, M., Witten, D.M., Jain, P., O’Roak, B.J., Cooper, G.M., Shendure, J., A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet., 46, 3, 310–315, 2014.
5. Ebert, P., Audano, P.A., Zhu, Q., Rodriguez-Martin, B., Porubsky, D., Bonder, M.J., Eichler, E.E., Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science, 372, 6537, eabf7117, 2021.
6. Koren, S., Rhie, A., Walenz, B.P., Dilthey, A.T., Bickhart, D.M., Kingan, S.B., Phillippy, A.M., De novo assembly of haplotype-resolved genomes with trio binning. Nat. Biotechnol., 36, 12, 1174–1182, 2018.
7. Turner, T.N., Coe, B.P., Dickel, D.E., Hoekzema, K., Nelson, B.J., Zody, M.C., Eichler, E.E., Genomic patterns of de novo mutation in simplex autism. Cell, 171, 3, 710–722, 2017.
8. Audano, P.A., Sulovari, A., Graves-Lindsay, T.A., Cantsilieris, S., Sorensen, M., Welch, A.E., Eichler, E.E., Characterizing the major structural variant alleles of the human genome. Cell, 176, 3, 663–675, 2019.
9. McClellan, J.M., Lehner, T., King, M.C., Gene discovery for complex traits: Lessons from Africa. Cell, 171, 2, 261–264, 2017.
10. Huddleston, J., Chaisson, M.J., Steinberg, K.M., Warren, W., Hoekzema, K., Gordon, D., Eichler, E.E., Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Res., 27, 5, 677–685, 2017.
11. Conrad, D.F., Pinto, D., Redon, R., Feuk, L., Gokcumen, O., Zhang, Y., Hurles, M.E., Origins and functional impact of copy number variation in the human genome. Nature, 464, 7289, 704–712, 2010.
12. Stankiewicz, P. and Lupski, J.R., Structural variation in the human genome and its role in disease. Annu. Rev. Med., 61, 437–455, 2010.
13. Hastings, P.J., Lupski, J.R., Rosenberg, S.M., Ira, G., Mechanisms of change in gene copy number. Nat. Rev. Genet., 10, 8, 551–564, 2009.
14. Belsky, D.W., Moffitt, T.E., Corcoran, D.L., Domingue, B., Harrington, H., Hogan, S., Caspi, A., The genetics of success: How single-nucleotide polymorphisms associated with educational attainment relate to life-course development. Psychol. Sci., 27, 7, 957–972, 2016.
15. Belsky, D.W., Moffitt, T.E., Corcoran, D.L., Domingue, B., Harrington, H., Hogan, S., Caspi, A., The genetics of success: How single-nucleotide polymorphisms associated with educational attainment relate to life-course development. Psychol. Sci., 27, 7, 957–972, 2016.
16. Michailidou, K., Lindström, S., Dennis, J., Beesley, J., Hui, S., Kar, S., Humphreys, K., Association analysis identifies 65 new breast cancer risk loci. Nature, 551, 7678, 92–94, 2017.
17. Berger, J., Suzuki, T., Senti, K.A., Stubbs, J., Schaffner, G., Dickson, B.J., Genetic mapping with SNP markers in Drosophila. Nat. Genet., 29, 4, 475–481, 2001.
18. Wicks, S.R., Yeh, R.T., Gish, W.R., Waterston, R.H., Plasterk, R.H., Rapid gene mapping in Caenorhabditis elegans using a high density polymorphism map. Nat. Genet., 28, 2, 160–164, 2001.
19. Dawson, E., Chen, Y., Hunt, S., Smink, L.J., Hunt, A., Rice, K., Dunham, I., A SNP resource for human chromosome 22: Extracting dense clusters of SNPs from the genomic sequence. Genome Res., 11, 1, 170–178, 2001.
20. Hendriksen, R.S., Bortolaia, V., Tate, H., Tyson, G.H., Aarestrup, F.M., McDermott, P.F., Using genomics to track global antimicrobial resistance. Front. Public Health, 7, 242, 2019.
21. Malla, M.A., Dubey, A., Kumar, A., Yadav, S., Hashem, A., Abd_Allah, E.F., Exploring the human microbiome: The potential future role of next-generation sequencing in disease diagnosis and treatment. Front. Immunol., 9, 2868, 2019.
22. Shokralla, S., Spall, J.L., Gibson, J.F., Hajibabaei, M., Next-generation sequencing technologies for environmental DNA research. Mol. Ecol., 21, 8, 1794–1805, 2012.
23. Mahmoud, A.M., An overview of epigenetics in obesity: The role of lifestyle and therapeutic interventions. Int. J. Mol. Sci., 23, 3, 1341, 2022.
24. Agustí, A., Melén, E., DeMeo, D.L., Breyer-Kohansal, R., Faner, R., Pathogenesis of chronic obstructive pulmonary disease: Understanding the contributions of gene–environment interactions across the lifespan. Lancet Respir. Med., 10, 512–524, 2022.
25. Chaste, P. and Leboyer, M., Autism risk factors: Genes, environment, and gene-environment interactions. Dialogues Clin. Neurosci., 14, 281–292, 2022.
26. Ascherio, A., Environmental factors in multiple sclerosis. Expert Rev. Neurother., 13(sup2), 3–9, 2013.
27. Zahid, G., Aka Kaçar, Y., Dönmez, D., Küden, A., Giordani, T., Perspectives and recent progress of genome-wide association studies (GWAS) in fruits. Mol. Biol. Rep., 49, 6, 5341–5352, 2022.
28. McArthur, E., Rinker, D.C., Gilbertson, E.N., Fudenberg, G., Pittman, M., Keough, ... K., Capra, J.A., Reconstructing the 3D genome organization of Neanderthals reveals that chromatin folding shaped phenotypic and sequence divergence, bioRxiv, 2022, 2022-02.
29. Pudlo, N.A., Urs, K., Crawford, R., Pirani, A., Atherly, T., Jimenez, R., Martens, E.C., Phenotypic and genomic diversification in complex carbohydrate-degrading human gut bacteria. mSystems, 7, 1, e00947–21, 2022.
30. O’Roak, B.J., Vives, L., Girirajan, S., Karakoc, E., Krumm, N., Coe, B.P., Eichler, E.E., Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature, 485, 7397, 246–250, 2012.
31. Hendriksen, R.S., Bortolaia, V., Tate, H., Tyson, G.H., Aarestrup, F.M., McDermott, P.F., Using genomics to track global antimicrobial resistance. Front. Public Health, 7, 242, 2019.
32. Koren, S., Rhie, A., Walenz, B.P. et al., De novo assembly of haplotype-resolved genomes with trio binning. Nat. Biotechnol., 36, 12, 1174–1182, 2018.
33. Gilman, S.R., Iossifov, I., Levy, D., Ronemus, M., Wigler, M., Vitkup, D., Rare de novo variants associated with autism implicate a large functional network of genes involved in formation and function of synapses. Neuron, 70, 5, 898–907, 2011.
34. Greenfield, A.L. and Hauser, S.L., B-cell therapy for multiple sclerosis: Entering an era. Ann. Neurol., 83, 1, 13–26, 2018.
35. Coles, A.J., Cox, A., Le Page, E. et al., The window of therapeutic opportunity in multiple sclerosis: Evidence from monoclonal antibody therapy. J. Neurol., 253, 98–108, 2006.
36. Handel, A.E., Williamson, A.J., Disanto, G., Dobson, R., Giovannoni, G., Ramagopalan, S.V., Smoking and multiple sclerosis: An updated meta-analysis. PLoS One, 6, 1, e16149, 2011.
37. Gravholt, C.H., Chang, S., Wallentin, M., Fedder, J., Moore, P., Skakkebaek, A., Klinefelter syndrome - integrating genetics, neuropsychology and endocrinology. Endocr. Rev., 39, 389–423, 2018.
38. Bonomi, M., Rochira, V., Pasquali, D., Balercia, G., Jannini, E.A., Ferlin, A., Klinefelter syndrome (KS): Genetics, clinical phenotype and hypogonadism. J. Endocrinol. Invest., 40, 123–134, 2017.
39. De Graaf, G., Buckley, F., Skotko, B.G., Estimation of the number of people with Down syndrome in the United States. Genet. Med., 19, 4, 439–447, 2017.
40. Ellegren, H., Smeds, L., Burri, R., Olason, P.I., Backström, N., Kawakami, T., Wolf, J.B., The genomic landscape of species divergence in Ficedula flycatchers. Nature, 491, 7426, 756–760, 2012.
41. Ellegren, H., Genome sequencing and population genomics in non-model organisms. Trends Ecol. Evol., 29, 1, 51–63, 2014.
42. Correale, J. and Gaitan, M.I., Multiple sclerosis and environmental factors: The role of vitamin D, parasites, and Epstein–Barr virus infection. Acta Neurol. Scand., 132, 46–55, 2015.
43. Santos, F., Gómez-Manzo, S., Sierra-Palacios, E., González-Valdez, A., Castillo-Villanueva, A., Reyes-Vivas, H., Marcial-Quino, J., Purification, concentration and recovery of small fragments of DNA from Giardia lamblia and their use for other molecular techniques. MethodsX, 4, 289–296, 2017.
44. Vandin, F., Upfal, E., Raphael, B.J., Algorithms for detecting significantly mutated pathways in cancer. J. Comput. Biol., 18, 507–522, 2011.
45. Crossley, B.M., Bai, J., Glaser, A., Maes, R., Porter, E., Killian, M.L., Clement, T., Toohey-Kurth, K., Guidelines for Sanger sequencing and molecular assay monitoring. J. Vet. Diagn. Invest.: Off. Publ. Am. Assoc. Veterinary Lab. Diagnosticians Inc., 32, 6, 767–775, 2020. https://doi.org/10.1177/1040638720905833.
46. Hoehe, M.R. and Morris-Rosendahl, D.J., The role of genetics and genomics in clinical psychiatry. Dialogues Clin. Neurosci., 20, 169–177, 2022.