Privacy Preservation of Genomic and Medical Data E-Book

Publisher: John Wiley & Sons
Category: Science and New Technologies
Language: English

Description

PRIVACY PRESERVATION OF GENOMIC AND MEDICAL DATA

Discusses topics concerning the privacy preservation of genomic data in the digital era, including data security, data standards, and privacy laws, so that researchers in biomedical informatics, computer privacy, and ELSI can assess the latest advances in privacy-preserving techniques for the protection of human genomic data.

Privacy Preservation of Genomic and Medical Data focuses on genomic data sources, analytical tools, and the importance of privacy preservation. Topics discussed include TensorFlow and Bio-Weka, privacy laws, HIPAA, and other emerging technologies like the Internet of Things, IoT-based cloud environments, cloud computing, edge computing, and blockchain technology for smart applications. The book starts with an introduction to genomes, genomics, genetics, transcriptomes, proteomes, and other basic concepts of modern molecular biology. DNA sequencing methodology, DNA-binding proteins, other related terms concerning genomes and genetics, and the associated privacy issues are discussed in detail. It concludes with future predictions for genomics and genomic privacy, emerging technologies, and applications.

Audience

Researchers in information technology, data mining, health informatics and health technologies, clinical informatics, bioinformatics, and security and privacy in healthcare, as well as health policy developers in public and private health departments and public health.

Details

Page count: 822

Year of publication: 2023

Sample

Cover

Series Page

Title Page

Preface

Acknowledgements

Part 1: Fundamentals

1 Introduction to Genomics and Genetics

1.1 Introduction

1.2 Hub of Genomics

1.3 Genome Sequencing Methods

1.4 Variation of Genome Sequencing

1.5 Diseases and Disorders

1.6 Future Prospects

1.7 Conclusion

References

2 An Overview of Genomics and Frontiers in Genetics for Smart Era

2.1 Introduction

2.2 Application of Genomes—The Frontiers in Genetics

2.3 Genomics in Military

2.4 Genomics in Medicine

2.5 International Projects

2.6 Case Study

2.7 Conclusion

References

3 Technical Trends in Public Healthcare and Medical Engineering

3.1 Introduction

3.2 Background Work

3.3 Current Scenario of Public Healthcare System and Medical Engineering

3.4 Role of AI in Healthcare

3.5 Technological Analysis for Healthcare System and Medical Engineering

3.6 Future Aspects of AI in Healthcare and Medical Engineering

3.7 Conclusion

3.8 Future Aspect

References

4 Role of Genomics in Smart Era and Its Application in COVID-19

4.1 Introduction

4.2 Basics of Genomics

4.3 Evolution of Genomics

4.4 Characteristics of Genomics and DNA Computing

4.5 Types of Genomics

4.6 Application Area of Genomics

4.7 Application of DNA Computing

4.8 Genomics and DNA Computing in COVID-19 Epidemic

4.9 Issues in Genomics and DNA Computing

4.10 Tools and Technology Used in Genomics Systems

4.11 Role of AI Technology Used in Genomics Systems

4.12 Related Works and Challenges

4.13 Future Research Dimension

4.14 Conclusion

References

Part 2: Methods and Applications

5 Novel Cutting-Edge Security Tools for Medical and Genomic Data With Privacy Preservation Techniques

5.1 Introduction

5.2 Background of Genomic

5.3 Literature Review

5.4 Highlights of the Proposed Methodologies

5.5 Results and Discussion

5.6 Conclusion

References

6 Genomic Data Analysis With Optimized Convolutional Neural Network (CNN) for Edge Applications

6.1 Introduction

6.2 Related Work

6.3 Proposed Methodology

6.4 Conclusion

References

7 Real-World Estimation of Malaria Prevalence From Genome of Vectors and Climate Analysis

7.1 Introduction

7.2 Significance of Estimation of Malaria Prevalence From Genome of Vectors and Climate Analysis

7.3 Proposed Methodology

7.4 Conclusion

References

8 Revolutionalizing Internet of Medical Things for Blockchain-Based 5G Healthcare Security and Privacy for Genomic Data

8.1 Introduction

8.2 Internet of Things in Healthcare and Medical Applications (IoMT)

8.3 Limitations of Existing Technologies in Healthcare Applications

8.4 Solving Healthcare Problems Using BCT

8.5 Proposed Model for 5G Healthcare

8.6 Experimental Setup and Discussion

8.7 Results

8.8 Conclusion

References

9 Preserve Privacy-HD: A Privacy-Preserving Distributed Framework for Health Data

9.1 Introduction

9.2 Organization of Chapter

9.3 Related Work

9.4 Proposed Methodology

9.5 Complexity Analysis

9.6 Conclusion

References

Part 3: Future-Based Applications

10 Decision and Recommendation System Services for Patient Using Artificial Intelligence

10.1 Introduction

10.2 Literature Review

10.3 Proposed Methodology

10.4 Implementation and Results Analysis

10.5 Results and Discussion

10.6 Conclusion

References

11 MPHDRDNN: Meticulous Presaging of Heart Disease by Regularized DNN Through GUI

11.1 Introduction

11.2 Literature Study

11.3 Proposed Methodology

11.4 Performance Evaluation Metrics

11.5 Results and Implementation Process for Building a GUI

11.6 Conclusion and Future Scope

Acknowledgment

References

12 Techniques for Removing Hair From Dermoscopic Images: A Survey of Current Approaches

12.1 Introduction

12.2 Inpainting Techniques Used for Hair Removal From Dermoscopic Images

12.3 Hair Removal Approaches From Dermoscopic Images

12.4 Results and Discussion

12.5 Conclusion

Acknowledgment

References

13 The Emergence of Blockchain Technology in Industrial Revolution 5.0

13.1 Introduction

13.2 Literature Survey

13.3 Evolution of Web Technology in Association With Blockchain

13.4 Understanding of Basic Key Terminologies

13.5 Industrial Components Associated With Blockchain

13.6 Contribution of Blockchain for Revolutionizing Industrial Aspects

13.7 Transformation of Industrial Sectors by Blockchain

13.8 Economical Impact Analysis of Blockchain Over Each Industry

13.9 Conclusion

References

14 Cervical Cancer Detection Using Big Data Analytics and Their Comparative Analysis

14.1 Introduction

14.2 Related Literature

14.3 Machine Learning Algorithms Used

14.4 Description of the Dataset

14.5 Parameter Specification on the ANNs

14.6 Analysis of Results

14.7 Conclusions

References

15 Smart Walking Stick for Visually Impaired People

15.1 Introduction

15.2 Related Work

15.3 System Architecture

15.4 Conclusion

References

Part 4: Issues and Challenges

16 Enhanced Security Measures in Genomic Data Management

16.1 Introduction

16.2 Literature Survey

16.3 Background

16.4 Proposed Model

16.5 Analysis of the Work

16.6 Future Work

16.7 Conclusion

References

17 Industry 5.0: Potentials, Issues, Opportunities, and Challenges for Society 5.0

17.1 Introduction

17.2 Literature Review

17.3 Role of Robots in Industry 5.0 and Society 5.0

17.4 Potentials of Industry 5.0 and Society 5.0

17.5 Open Issues Toward Industry 5.0 and Society 5.0

17.6 Opportunities and Challenges Toward Industry 5.0 and Society 5.0

17.7 Conclusion

References

18 Artificial Intelligence—Blockchain Enabled Technology for Internet of Things: Research Statements, Open Issues, and Possible Applications in the Near Future

18.1 Introduction: Artificial Intelligence, Machine Learning, Internet of Things, and Blockchain Concepts

18.2 Frameworks, Architectures, and Models for the Convergence of Machine Learning, IoTs, and Blockchain Technologies

18.3 Machine Learning Techniques for the Optimization of IoT-Based Services

18.4 Machine Learning Techniques for Exchanging Data in a Blockchain

18.5 Machine Learning-Based Blockchain Transactions

18.6 IoT-Enabled Security Using Artificial Intelligence and Blockchain Technologies

18.7 Security, Privacy, and Trust Related Areas in Artificial Intelligence and Blockchain-Based IoT Applications

18.8 Blockchain-Based Learning Automated Analytics Platforms

18.9 Blockchain- and Machine Learning-Based Solutions for Big Data Challenges

18.10 Machine Learning Techniques for the Analysis of Sensor Records for Healthcare Applications

18.11 Blockchain-Enabled IoT Platforms for Automation in Intelligent Transportation Systems

18.12 Conclusion

References

19 Blockchain-Empowered Decentralized Applications: Current Trends and Challenges

19.1 Introduction

19.2 Literature Survey

19.3 Key Characteristics of Blockchain

19.4 Challenges Faced by Blockchain Technology

19.5 The Use of Smart Contracts in Decentralized Autonomous Organizations

19.6 Smart Contracts: An Overview

19.7 Smart Contract Platforms

19.8 Applications of Blockchain Technology

19.9 Future Data Storage Issues in Blockchain and Its Solution

19.10 Conclusion

References

20 Privacy of Data, Privacy Laws, and Privacy by Design

20.1 Introduction

20.2 Privacy Issues

20.3 Privacy Laws and Regulations

20.4 Techniques for Enforcing Privacy

20.5 Privacy by Design

20.6 Conclusion

References

Index

End User License Agreement

List of Tables

Chapter 4

Table 4.1 Characteristics and relative performance by a previous method on COV...

Chapter 7

Table 7.1 Classifiers for the given problem statement.

Table 7.2 Nonlinear regression for the given problem statement.

Chapter 9

Table 9.1 Existing works.

Chapter 10

Table 10.1 Description of one-hot encoding transform.

Table 10.2 Examined datasets.

Table 10.3 Comparison of LSTM over SVM and CNN.

Table 10.4 Comparisons of all metrics.

Chapter 11

Table 11.1 Nodes presents in DNN.

Table 11.2 Accuracy results.

Table 11.3 Comparison of the Reg-DNN with existing systems.

Chapter 12

Table 12.1 Comparison of filtering-based techniques for hair removal in dermos...

Table 12.2 Comparison of morphological techniques for hair removal in dermosco...

Table 12.3 An overview of edge-based approaches for hair removal in dermoscopy...

Table 12.4 Comparison of supervised techniques for hair removal in dermoscopy.

Table 12.5 An overview of various hair removal techniques in dermoscopy.

Chapter 13

Table 13.1 Summarizing literature work.

Table 13.2 Comparison of functionality and mining processes in different conse...

Chapter 14

Table 14.1 Dataset attributes.

Table 14.2 Mean/mode/median values for various data with data splitting of 80...

Table 14.3 Mean/mode/median values for various data with data splitting of 70:...

Table 14.4 Mean/mode/median values for various data with data splitting of 50:...

Chapter 16

Table 16.1 Symbolisations employed in the work.

Table 16.2 Procedure to make hash key.

Table 16.3 Parameters used in the work.

Chapter 17

Table 17.1 Industry 5.0 challenges and future directions.

Chapter 18

Table 18.1 Blockchain-internet of thing: merits and security issues.

Table 18.2 Blockchain-enabled IoV.

Chapter 20

Table 20.1 Example database.

Table 20.2 Sample database with sensitive and non-sensitive data.

Table 20.3 Table showing total result.

Table 20.4 Table showing counts.

List of Illustrations

Chapter 1

Figure 1.5.1 Signs and symptoms of Klinefelter’s syndrome [30].

Figure 1.5.2 Classification of polycystic ovarian syndrome (PCOS) based differ...

Chapter 2

Figure 2.1 Insight into DNA: gene and protein.

Figure 2.2 Major application arenas of genomics.

Figure 2.3 Process of gene therapy to cure the diseases.

Figure 2.4 Mapping of Human Gene Project (HGP).

Chapter 3

Figure 3.1 Internet of medical things in medical engineering.

Figure 3.2 Application of blockchain in healthcare.

Figure 3.3 Telemedicine applications.

Figure 3.4 Various AI technologies in healthcare.

Chapter 4

Figure 4.1 Graphical representation of the human genome.

Figure 4.2 A conceptual overview of the usages of genomics in healthcare.

Figure 4.3 A conceptual overview of the replication process.

Figure 4.4 A brief overview of the development of genomics.

Figure 4.5 Graphical representation of gene data using Homo sapiens genome.

Figure 4.6 A conceptual overview of the types of genomics.

Figure 4.7 Structural genomics overview.

Figure 4.8 A conceptual overview of function genomics.

Figure 4.9 A conceptual overview of metagenomics.

Figure 4.10 A conceptual overview of epigenomics.

Figure 4.11 Primary application areas of genomics.

Figure 4.12 Essential usage of biotechnology.

Figure 4.13 Using genomics for social science application.

Figure 4.14 A conceptual overview of incorporating genomic information into Me...

Figure 4.15 Application areas of DNA computing.

Figure 4.16 Nextstrain SARS-CoV-2 biological connections.

Figure 4.17 Nextstrain SARS-CoV-2 biological clade variants.

Figure 4.18 Relationship of artificial intelligence with genomics and genetics...

Chapter 5

Figure 5.1 Single nucleotide polymorphism.

Chapter 6

Figure 6.1 Flow of CNN analytics.

Figure 6.2 Genomic data analysis with CNN.

Figure 6.3 Multivariate splits of DNA pair sequencing.

Chapter 7

Figure 7.1 Number of rainy days in each month.

Figure 7.2 Amount of precipitation in a month.

Figure 7.3 Average temperature in a month.

Figure 7.4 Percentage of humidity per month.

Figure 7.5 Number of malaria occurrences per month.

Chapter 8

Figure 8.1 Applications of blockchain technology.

Figure 8.2 Architecture of blockchain technology.

Figure 8.3 (a) Data tampering attack. (b) Server and client-based blockchain t...

Figure 8.4 Initial screen of the proposed model.

Figure 8.5 Setting up path.

Figure 8.6 Hash code generated for human genome data.

Figure 8.7 Initializing 5G server.

Figure 8.8 Setting up of input data.

Figure 8.9 Final results.

Chapter 9

Figure 9.1 Problem in medical data sharing.

Figure 9.2 Secure weight sharing.

Figure 9.3 Proposed model.

Figure 9.4 Parameter server logic.

Figure 9.5 Synchronous data parallel model.

Chapter 10

Figure 10.1 Proposal methodology for our work.

Figure 10.2 Description of Modbus format.

Figure 10.3 LSTM detection model.

Figure 10.4 Performance analysis of web attack detection.

Chapter 11

Figure 11.1 DNN neuron architecture.

Figure 11.2 Proposed workflow.

Figure 11.3 Regularized capacity. https://medium.com/analytics-vidhya/regulari...

Figure 11.4 Implementation results for loss, dropout, loss.

Figure 11.5 GUI interface for heart disease prediction.

Chapter 12

Figure 12.1 (a) Original image, (b) corresponding grey image, (c) output image...

Figure 12.2 (a) Original image, (b) output images after hair removal technique...

Chapter 13

Figure 13.1 Financial transactions stored in blockchain.

Figure 13.2 Difference between centralized and decentralized application.

Figure 13.3 Application of smart contract in real estate dealing.

Figure 13.4 Application of DLT and smart contract in real estate.

Figure 13.5 Application of distributed ledger and smart contract in supply cha...

Figure 13.6 Major division of O&G Industry.

Figure 13.7 Use case of blockchain in O&G Upstream Industry.

Figure 13.8 Percentage of total expenditure spend over blockchain by each sect...

Chapter 14

Figure 14.1 Female reproductive system.

Figure 14.2 Architectural structure of the MLP.

Figure 14.3 Pseudocode for MLP implementation.

Figure 14.4 Architectural structure of the backpropagation.

Figure 14.5 Pseudocode for Backpropagation algorithm implementation.

Figure 14.6 Architectural structure of RNN.

Figure 14.7 Pseudocode for RNN implementation.

Figure 14.8 Target values analysis pie chart.

Figure 14.9 Scatter matrix and area plot of the targets.

Figure 14.10 Histogram of the age attribute.

Figure 14.11 Confusion matrix.

Figure 14.12 Model accuracy for biopsy BPN.

Figure 14.13 Confusion matrix for Hinselmann BPN.

Figure 14.14 Model accuracy for Hinselmann BPN.

Figure 14.15 Confusion matrix for citology BPN.

Figure 14.16 Model loss for citology BPN.

Figure 14.17 Confusion matrix for Schiller BPN.

Figure 14.18 Model loss for Schiller BPN.

Figure 14.19 Model accuracy for Hinselmann MLP.

Figure 14.20 Confusion matrix for Hinselmann MLP.

Figure 14.21 Accuracy graph for citology MLP.

Figure 14.22 Confusion matrix for citology results MLP.

Figure 14.23 Model accuracy for biopsy MLP.

Figure 14.24 Confusion matrix for biopsy results MLP.

Figure 14.25 Model accuracy for Schiller MLP.

Figure 14.26 Confusion matrix for Schiller results MLP.

Figure 14.27 Model accuracy for Hinselmann RNN.

Figure 14.28 Confusion matrix for Hinselmann RNN results.

Figure 14.29 Model accuracy for citology RNN.

Figure 14.30 Confusion matrix for citology RNN results.

Figure 14.31 Model accuracy for biopsy RNN.

Figure 14.32 Confusion matrix for biopsy RNN results.

Figure 14.33 Model accuracy for Schiller RNN.

Figure 14.34 Confusion matrix for the RNN Schiller results.

Chapter 15

Figure 15.1 Implementation block diagram.

Figure 15.2 Model of the smart walking stick.

Figure 15.3 System architecture of the software application.

Chapter 16

Figure 16.1 Summary of the genomic-data-based well-being administration [6].

Figure 16.2 E-health framework in stockpile [17].

Figure 16.3 IoHT architecture classification [18].

Figure 16.4 Overlay network [20].

Figure 16.5 System architecture [25].

Figure 16.6 Biomedical security system with blockchain [26].

Figure 16.7 Overview of a patient’s body with IoT medical sensors [27].

Figure 16.8 Framework of suggested system [28].

Figure 16.9 Smart healthcare system framework [29].

Figure 16.10 Merkle tree authentication for data [32].

Figure 16.11 System model [7].

Figure 16.12 Security analysis.

Chapter 17

Figure 17.1 Key-enabling technologies of Industry 5.0.

Figure 17.2 Evaluation history of Industry 5.0.

Figure 17.3 Challenges and future directions toward Industry 5.0.

Chapter 18

Figure 18.1 Artificial intelligence (AI) framework.

Figure 18.2 Internet of Things (IoT) components.

Figure 18.3 Machine learning adoption in blockchain.

Figure 18.4 Internet of Things and Artificial Intelligence as together.

Figure 18.5 Future envision on battlefield using artificial intelligence and m...

Chapter 19

Figure 19.1 Blockchain validations on artificial intelligence (AI) for retail.

Scrivener Publishing
100 Cummings Center, Suite 541J
Beverly, MA 01915-6106

Publishers at Scrivener
Martin Scrivener ([email protected])
Phillip Carmical ([email protected])

Privacy Preservation of Genomic and Medical Data

Edited by

Amit Kumar Tyagi

National Institute of Fashion Technology, New Delhi, India

This edition first published 2024 by John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA and Scrivener Publishing LLC, 100 Cummings Center, Suite 541J, Beverly, MA 01915, USA
© 2024 Scrivener Publishing LLC
For more information about Scrivener publications please visit www.scrivenerpublishing.com.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

Wiley Global Headquarters
111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Limit of Liability/Disclaimer of Warranty

While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials, or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read.

Library of Congress Cataloging-in-Publication Data

ISBN 978-1-394-21262-0

Cover image: Pixabay.Com
Cover design by Russell Richardson

Preface

Data science is a broad field encompassing some of the fastest-growing subjects in interdisciplinary statistics, mathematics, and computer science. It encompasses a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision making. Data analysis has multiple facets and approaches, including diverse techniques under a variety of names, in different business, science, and social science domains. Similarly, data analytics is now required in the medical field for analyzing genomic and genetic data.

Genomics is a branch of genetics; the term was coined by Tom Roderick in 1986. Genetics is the study of a single gene, whereas genomics refers to the study of a group of genes called genomes. Genomes can be considered an instruction manual for human life. Originally, the analysis of genomic data was very costly. However, due to advancements in technology, the cost of sequencing has come down significantly, so that genomic analysis can even be included in daily medical routines. The more we explore our genomes, the easier it will be to make medical decisions and cure diseases.

Genomic data includes not only an individual's personal information but also data about their family and ancestors. Any leakage of this type of information could cause very serious issues, so data protection is critical. Privacy laws such as GINA (Genetic Information Non-discrimination Act), HIPAA (Health Insurance Portability and Accountability Act of 1996), and GDPR (General Data Protection Regulation) help users protect their privacy by restricting the sharing of patients’ sensitive information. However, we must focus on privacy issues in an era of such rapid developments in the healthcare sectors. The main categories of privacy in healthcare include data, location, identity, and genomic privacy. Existing tools are insufficient to handle genomic data because of the large size of the datasets.

This book focuses on genomic data sources, analytical tools, and the importance of privacy preservation. Topics discussed include TensorFlow and Bio-Weka, privacy laws, HIPAA, and other emerging technologies like the Internet of Things, IoT-based cloud environments, cloud computing, edge computing, and blockchain technology for smart applications. The book starts with a basic introduction to genomes, genomics, genetics, transcriptomes, proteomes, and other basic concepts of modern molecular biology. It concludes with future predictions for genomics and genomic privacy, emerging technologies, and applications.

Amit Kumar Tyagi

Acknowledgements

First, we extend our gratitude to our family members, friends, and supervisors who stood by us as advisors during the completion of this book. Also, we thank our almighty God who inspired us to write this book. Furthermore, we thank Wiley and Scrivener Publishing, who have provided continuous support; and our colleagues with whom we have worked inside the college and university system, as well as those outside of academia who have provided their endless support toward completing this book.

Finally, we wish to thank our Respected Madam, Prof. G Aghila, Prof. Siva Sathya, our Respected Sir Prof. N Sreenath, and Prof. Aswani Kumar Cherukuri, for their valuable input and help in completing this book.

Amit Kumar Tyagi

Part 1: Fundamentals

1 Introduction to Genomics and Genetics

Mahreen Fatima¹, Sana Zia², Maheen Murtaza³, Asyia Shafique⁴, Afshan Muneer³, Junaid Sattar⁵, Muhammad Ashir Nabeel⁶ and Amjad Islam Aqib⁷*

¹ Faculty of Biosciences, Cholistan University of Veterinary and Animal Sciences, Bahawalpur, Pakistan

² Department of Zoology, Government Sadiq College Women University, Bahawalpur, Pakistan

³ Department of Zoology, Cholistan University of Veterinary and Animal Sciences, Bahawalpur, Pakistan

⁴ Department of Clinical Medicine and Surgery, University of Agriculture, Faisalabad, Pakistan

⁵ Faculty of Veterinary Sciences, Cholistan University of Veterinary and Animal Sciences, Bahawalpur, Pakistan

⁶ Animal Sciences, University of Illinois Urbana-Champaign, Urbana, United States of America

⁷ Department of Medicine, Cholistan University of Veterinary and Animal Sciences, Bahawalpur, Pakistan

Abstract

Genomic research is a relatively new field in biotechnology, with DNA sequencing as its essential technology. It is progressing quickly because advanced technologies are now widely accessible, enabling genome-wide sequencing to address biological questions. During the last decade, genomic studies have evolved into potential tools for understanding the genetics of human disease. After the Human Genome Project, it became essential to sequence the 3-billion-letter code in a cost-effective manner. By producing large amounts of sequencing data at low cost, this breakthrough enabled a wide variety of biomedical applications to emerge after the completion of the project. These technological advancements have also enabled the sequencing of various vertebrate genomes, which helps in interpreting the human genome. In addition to allowing the study of vertebrate genome evolution, this sequencing will benefit human medicine and comparative genomics. This chapter introduces and reviews the basic aspects of genomics, as well as its role in the pharmaceutical industry.

Keywords: Genomic, genetic, technology, biomedical application, pharmaceutical industry

1.1 Introduction

There are slightly more than 20,000 human protein-coding genes, but each of these sequences typically codes for multiple proteins through mechanisms such as alternative splicing, opposite-strand transcripts, and others. According to some evidence, there may be up to five distinct transcripts per gene sequence. Only about two percent of the human genome’s DNA consists of actual protein-coding sequences. It is now generally acknowledged that there are tens of thousands of genomic regions that encode “noncoding RNA transcripts.” These RNAs play a role in the control of messenger RNA (mRNA) translation and gene expression [1]. Given how the chromatin state affects gene expression, it is clear that epigenomic changes to histone chemistry and DNA methylation levels can have a significant impact on transcription. Once more, this is a cutting-edge field of cellular biology in which the potential states of activity and inactivity need to be widely explored. If genomics is the study of the properties of the genome, genetics is the study of how traits or phenotypes are passed down through generations [2]. The identification of genetic differences related to neuropsychiatric disorders and treatment outcomes has thus increased confidence that these findings will soon be applied in the clinic to improve diagnosis, disease risk prediction, and patient response to drug therapy. DNA segments can be sequenced using the shotgun, clone-by-clone, whole-genome, Maxam-Gilbert, and Sanger sequencing methods. Sanger chemistry, the “original” sequencing method, reads through a DNA template formed during DNA synthesis using specially labeled nucleotides [3]. Thanks to a number of practical developments, the Sanger method can read 1000 to 1200 base pairs (bp), but it is still limited to 2 kilobase pairs (kbp) [4].
This chapter focuses on the hub of genomics, genome sequencing methods, variation of genome sequencing, diseases and disorders, and future prospects.

1.2 Hub of Genomics

There are slightly more than 20,000 human protein-coding genes. Still, each of these genes typically codes for several proteins through mechanisms such as alternative splicing, opposite-strand transcripts, and others. There is some evidence for up to five distinct transcripts per gene sequence. Only about two percent of the human genome’s DNA content consists of actual protein-encoding sequences. It is now generally acknowledged that tens of thousands of genomic regions encode “noncoding RNA transcripts.” These RNAs play a role in controlling gene expression and the translation of messenger RNAs (mRNAs) [5]. Given how the chromatin state affects gene expression, it is clear that epigenomic changes to histone chemistry and DNA methylation levels can significantly affect transcription. Once more, this is a cutting-edge field of cellular biology in which the potential role of activation and inactivation needs to be widely explored. If genomics is the study of the properties of the genome, genetics is the study of how traits or phenotypes are passed down through generations [6]. Identifying genetic variants related to neuropsychiatric disorders and treatment outcomes has thus increased confidence that these findings will soon be applicable in the clinic to improve diagnosis, disease risk prediction, and patient response to drug therapy. DNA segments can be sequenced using the shotgun, clone-by-clone, whole-genome, Maxam-Gilbert, and Sanger sequencing methods. Sanger chemistry, the “original” sequencing method, reads through a DNA template during DNA synthesis using specially labeled nucleotides [7]. Thanks to some practical developments, the Sanger method can read 1000 to 1200 base pairs (bp), but it is still limited to about 2 kbp [8]. The chapter aims to review the hub of genomics, genome sequencing methods, variations of genome sequencing, diseases and disorders, and future prospects.

A. Phenotype

Phenotypes comprise organism-level phenotypes, such as diseases, as well as molecular-level changes, for example the over- or under-expression of specific genes [9]. One of the first steps is therefore to identify the molecular phenotypes that accompany organism-level phenotypes. Over the past 10 years, gene expression has become established as a molecular-level trait that is used as a molecular phenotype to classify diseases, identify drug targets, and infer gene-gene interactions [10].

The results are typically more consistent and simpler to interpret when variations in gene expression are examined at the level of modules across various conditions. Furthermore, it can be difficult to identify a gene’s function outside its context when its function is obscure [11]. The increased statistical power of a module-based method also makes it possible to identify a perturbed module even when individual genes within the module have not undergone a statistically significant perturbation [12].
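The gain in statistical power from pooling genes can be illustrated with a minimal Python sketch. It combines per-gene z-scores into one module-level score using Stouffer's method, which is one common aggregation technique and not necessarily the one used in the cited work. The z-scores are made-up values, and the calculation assumes the genes are independent, which co-expressed genes usually are not.

```python
import math

def one_sided_p(z):
    """Upper-tail p-value of a standard normal z-score."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def stouffer_z(z_scores):
    """Combine independent z-scores: sum(z) / sqrt(n)."""
    return sum(z_scores) / math.sqrt(len(z_scores))

# Hypothetical expression z-scores for six genes in one module.
gene_z = [1.2, 1.4, 1.1, 1.5, 1.3, 1.0]

gene_p = [one_sided_p(z) for z in gene_z]   # evidence per gene
module_z = stouffer_z(gene_z)               # pooled evidence
module_p = one_sided_p(module_z)

print("per-gene p-values:", [round(p, 3) for p in gene_p])
print("module z = %.2f, module p = %.4f" % (module_z, module_p))
```

In this toy case no single gene passes a 0.05 threshold, yet the module as a whole does, because consistent modest shifts add up.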

B. Genotype

The previous section covered the understanding of phenotypic modules, whose changes in expression are linked with changes in phenotype. This section focuses on the genetic basis of disease and the subnetworks that this basis defines. According to recent studies, genomic changes vary greatly in complex diseases such as cancer and neurological disorders. Theoretically, the altered or mutated genes may belong to the same pathways and collectively dysregulate those pathways. For example, O’Roak et al. [30] discovered that 39% (49 of 126) of the most severe or disruptive de novo mutations map to a highly interconnected network of β-catenin and chromatin remodeling proteins in sporadic autism [33]. These strategies focus on identifying genotypic modules, or subnetworks, that are enriched with genes carrying genetic changes linked to disease. This approach is common in research on somatic mutations in cancer, which are the primary causes of the disease [13].
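Whether a module is "enriched" with mutated genes can be checked with a hypergeometric tail probability. The sketch below is only an illustration of that idea, not the procedure of the cited studies. The function name `enrichment_p` and all numbers (20,000 background genes, a 400-gene module, 100 mutated genes, 8 of them in the module) are hypothetical.

```python
from math import comb

def enrichment_p(total_genes, module_size, mutated_genes, overlap):
    """P(X >= overlap) when mutated genes are drawn at random
    from the background (hypergeometric null model)."""
    upper = min(module_size, mutated_genes)
    tail = sum(
        comb(module_size, k) * comb(total_genes - module_size, mutated_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total_genes, mutated_genes)

# Hypothetical example: 8 of 100 mutated genes fall in a 400-gene module.
expected = 100 * 400 / 20000
p = enrichment_p(total_genes=20000, module_size=400, mutated_genes=100, overlap=8)

print(f"expected overlap by chance: {expected:.1f} genes")
print(f"P(overlap >= 8) = {p:.2e}")
```

A small tail probability indicates that the module holds more mutated genes than chance alone would explain; a real analysis would also correct for testing many modules.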

C. Gene Environmental Interaction

In the broadest and perhaps most general sense, the environment may facilitate (or limit) opportunities for the expression of genotype-influenced behavioral tendencies. Some previously reported examples of latent-variable G-E interactions imply this: in those studies, the influence of religion, community characteristics (such as urban/rural setting), peer substance use, or parental rules on smoking or drinking varied. Furthermore, it has been recognized that some of these environmental factors can reduce the behavioral effects of specific polymorphisms [14]. Other environmental factors that play a vital role in G-E interactions include traumatic events, such as child abuse, that cause intense emotional and physical responses. Recent data suggest that these responses affect the biological pathways that interact with genetic effects, perhaps even changing how the genes are expressed. These effects can last for most or all of a person’s lifespan due to specific genomic variations, or they can be transmitted and linked with environmental exposure [15].
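A gene-environment interaction means that the effect of a genotype depends on the environment. The following sketch shows the simplest additive-scale version of that idea as a difference of differences in a two-by-two table. The group labels and mean outcome scores are invented for illustration and do not come from any study cited here.

```python
# Hypothetical mean outcome scores (for example, a substance-use scale)
# for each combination of genotype and environment.
mean_outcome = {
    ("non-carrier", "low-risk env"): 2.0,
    ("carrier", "low-risk env"): 2.2,
    ("non-carrier", "high-risk env"): 2.5,
    ("carrier", "high-risk env"): 4.0,
}

def genotype_effect(env):
    """Difference between carriers and non-carriers within one environment."""
    return mean_outcome[("carrier", env)] - mean_outcome[("non-carrier", env)]

effect_low = genotype_effect("low-risk env")
effect_high = genotype_effect("high-risk env")

# If the genotype effect were the same in both environments,
# this contrast would be zero (no interaction on the additive scale).
interaction = effect_high - effect_low

print(f"genotype effect, low-risk environment:  {effect_low:.1f}")
print(f"genotype effect, high-risk environment: {effect_high:.1f}")
print(f"interaction contrast: {interaction:.1f}")
```

Here the genotype barely matters in the low-risk environment but has a large effect in the high-risk one, which is the pattern the paragraph describes.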

D. Epigenetics

Epigenetic changes do not involve differences in DNA sequence, although they are regularly passed from a cell to its daughter cells during mitosis. It has also been documented that some epigenetic changes can be passed down the germ line from one generation to the next. It is widely accepted that exposure to nutrients, cellular insults, stress, etc., can lead to epigenetic alterations at any age [16].

A chromosome’s double-stranded DNA is packed very tightly. Each nucleosome is connected to the next by a stretch of free DNA that is bound by the linker histone, H1. The histones of nucleosomes can undergo a variety of chemical modifications, such as acetylation, methylation, and phosphorylation. The patterns of histone modifications can affect how chromatin is arranged, potentially affecting transcriptional activity. DNA itself can also be modified by methylation of a cytosine base at the 5 position [15].

1.3 Genome Sequencing Methods

A. Clone by Clone

In this technique, pieces of the genome must be copied and inserted into bacteria. Each piece of the genome under study, about 150,000 base pairs long, is produced in identical copies, or “clones,” by growing the bacteria. The inserted DNA in each clone is then further fragmented into overlapping portions of about 500 base pairs. These shorter inserts are then sequenced. After sequencing, the overlapping sections are used to put the clone back together. This method, combined with Sanger sequencing, was used to sequence the first human genome. Although it takes a lot of time and money, this method works [17].
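The reassembly step, in which overlapping sections are used to put a clone back together, can be sketched in a few lines of Python. This is a toy greedy assembler under simplifying assumptions: the "clone" is 30 bases rather than 150,000, the fragments are 12 bases rather than 500, reads are error-free, and the sequence contains no repeats. All names and sequences are illustrative.

```python
def fragment(seq, size=12, step=6):
    """Cut a sequence into overlapping windows (toy stand-in for subcloning)."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]

def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def assemble(fragments, min_len=3):
    """Greedy merge: repeatedly join the pair with the largest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing overlaps any more; leave the pieces unjoined
        merged = frags[i] + frags[j][n:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)] + [merged]
    return frags

clone = "ATGGCGTACCTGAAGTCCGATTGACCAGTA"   # hypothetical 30-base "clone"
pieces = fragment(clone)
scrambled = [pieces[2], pieces[0], pieces[3], pieces[1]]  # order is unknown in practice
rebuilt = assemble(scrambled)

print(rebuilt[0] == clone)
```

Real assemblers must also cope with sequencing errors and repeated sequences, which is where a simple greedy strategy like this one breaks down.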

B. Maxam-Gilbert Sequencing

In the Maxam-Gilbert method, the DNA to be sequenced is obtained by subcloning. The five consecutive steps (base-specific modification, stopping the reactions, ethanol precipitation and centrifugation in sequence, piperidine cleavage, and repeated reduced-pressure evaporation) are exceptionally laborious, though. The possibility of losing the DNA pellet during ethanol precipitation and centrifugation is an additional issue. Moreover, during ethanol precipitation, hydrazine might co-precipitate with the DNA and affect the specificity of the reactions [18].

Advantages and Disadvantages

The chief advantage of the Maxam-Gilbert sequencing method is that the DNA template can be single-stranded or double-stranded. At one point, the Maxam-Gilbert approach was preferred over the Sanger method because the latter required cloning single-stranded DNA for each read start. Moreover, DNA-protein interactions and epigenetic DNA modifications can be studied using the Maxam-Gilbert approach. Maxam-Gilbert sequencing was limited primarily by its use of dangerous substances and methods, such as radiolabeling and X-rays. The method fell out of favor because it required the use of hydrazine, a known neurotoxin, and was hard to scale up and manage [1].

C. Sanger Sequencing Method

The Sanger technique for DNA sequencing has found major application in the field of veterinary diagnostics.

Traditional Sanger sequencing continues to be the most popular sequencing approach used in veterinary diagnostic laboratories (VDLs) for sequence confirmation, assay monitoring, and as the foundation for many phylogenetic analyses. It also serves as the basis for newer, automated procedures. In this method, an oligonucleotide primer is annealed to the template, and the complementary DNA strand is extended by the DNA polymerase enzyme using any combination of the four deoxynucleotide triphosphates (dNTPs: dATP, dGTP, dCTP, and dTTP) or chain-terminating dideoxynucleotide triphosphates (ddNTPs). When rate-limiting amounts of the ddNTPs are added, elongation terminates and produces distinct DNA fragments of different lengths. Sanger sequencing technology can currently produce nucleic acid sequences of up to 800 to 1,000 bp. The method’s most significant disadvantages typically include low-quality sequence in the first 15 to 40 bp, caused by primer binding, and the inability to detect single-base-pair variations in longer segments. Given these limits, both commercial and open-source sequence analysis software has emerged to assist users in automatically identifying and removing low-quality data. The capillary gel electrophoresis method used in Sanger sequencing is the focus of the VDL guidelines described here [19].
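The logic of chain termination, and of discarding the poor-quality start of a read, can be illustrated with a short Python sketch. It is an idealized model under stated assumptions: exactly one terminated fragment exists for every position, fragment sizes are measured perfectly, and the quality scores are invented Phred-like values. The function names and the cutoff of 20 are illustrative choices, not part of any guideline cited here.

```python
def chain_terminated_fragments(synthesized):
    """Idealized ddNTP reaction: one fragment ending at every position.
    Each fragment is recorded as (length, labeled terminal base)."""
    return [(length, synthesized[length - 1])
            for length in range(1, len(synthesized) + 1)]

def read_by_size(fragments):
    """Electrophoresis orders fragments by length; reading the terminal
    base from shortest to longest gives the synthesized sequence."""
    return "".join(base for _, base in sorted(fragments))

def trim_low_quality_start(read, phred, min_q=20):
    """Drop leading bases until the first base with quality >= min_q."""
    for i, q in enumerate(phred):
        if q >= min_q:
            return read[i:]
    return ""

synthesized = "GATTACAGGCT"                                # hypothetical new strand
fragments = chain_terminated_fragments(synthesized)[::-1]  # arrive unordered
read = read_by_size(fragments)

# Hypothetical per-base qualities: the first bases after the primer are poor.
phred = [8, 10, 12, 30, 35, 38, 40, 40, 39, 37, 36]
trimmed = trim_low_quality_start(read, phred)

print(read)
print(trimmed)
```

Sorting by fragment length plays the role of capillary electrophoresis, and the trimming step mirrors what sequence analysis software does with the low-quality bases near the primer.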

Advantages and Disadvantages

Researchers were able to find mutations and the underlying cause of genetic illnesses with the aid of Sanger sequencing. It is the most effective technique for locating short tandem repeats and sequencing a single gene. However, this method’s major drawback is the length of time it takes, which results in low throughput. Only short DNA sequences (between 300 and 1000 base pairs) can be processed with this method, one at a time [19].

Principle of Genome Sequencing and Assembly

Currently, a shotgun sequencing approach is used for the majority of genome projects. Genomic DNA is first cut into a large number of small, random fragments. These are independently sequenced to a specific length, depending on the technology. The resulting sequence reads are then put back together into longer uninterrupted sequence stretches (contigs) using powerful computer algorithms, a procedure known as de novo assembly. High sequencing coverage (or read depth) is necessary to ensure that the sequence reads at each position in the genome overlap sufficiently for proper assembly. Naturally, more overlap can be expected for longer sequence reads, which lowers the necessary raw read depth. Longer fragments (a few hundred base pairs) are typically sequenced from both ends (paired-end sequencing) to provide more information on where the reads should be placed in the assembled sequence [20]. When the two reads of a pair fall in different contigs, those contigs can be linked into a scaffold. The library’s expected fragment length indicates how far apart the two contigs are physically, and the gap between them is filled with the placeholder base character “N.” The missing base-pair information is filled in by later gap-closing techniques, ideally using long reads that span repetitive sequences. The final step frequently involves joining the scaffolds into linkage groups or placing them on chromosomes. Arguably, the best method for ordering and orienting scaffolds into longer sequence blocks is to create genetic maps from crosses or pedigree data [21].
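Two of the quantities mentioned above lend themselves to a small worked example: average coverage, which follows the standard relation coverage = number of reads × read length / genome size, and the gap between two contigs linked by a read pair. The sketch below is illustrative only. The read length of 150 bp, the 30-fold target, the 500 bp fragment length, and the function names are assumptions made for the example.

```python
import math

def mean_coverage(num_reads, read_length, genome_size):
    """Average number of reads covering each base."""
    return num_reads * read_length / genome_size

def reads_needed(target_coverage, read_length, genome_size):
    """Reads required to reach a target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)

def estimate_gap(fragment_length, bases_in_contig_a, bases_in_contig_b):
    """Gap size = expected fragment length minus the parts of the
    fragment that lie inside each of the two linked contigs."""
    return max(0, fragment_length - bases_in_contig_a - bases_in_contig_b)

def scaffold(contig_a, contig_b, gap):
    """Join two contigs, filling the unknown stretch with 'N'."""
    return contig_a + "N" * gap + contig_b

genome_size = 3_000_000_000          # roughly the human genome
n_reads = reads_needed(30, 150, genome_size)
print(n_reads, mean_coverage(n_reads, 150, genome_size))

# A read pair from a 500 bp fragment: 180 bp lie at the end of contig A
# and 220 bp at the start of contig B (hypothetical values).
gap = estimate_gap(500, 180, 220)
joined = scaffold("ACGTAC", "TTGCAA", gap)
print(gap, len(joined))
```

The gap estimate inherits the spread of fragment lengths in the library, so in practice it is an approximation that later gap-closing steps refine.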

1.4 Variation of Genome Sequencing

A. Single Nucleotide Polymorphism

A single nucleotide polymorphism (SNP) is present at a specific nucleotide site when the DNA molecules in a population frequently differ at that site [22]. For example, some DNA molecules may have a T-A base pair at a nucleotide site, while other DNA molecules in the same population have a C-G base pair at the same site. Such a difference constitutes an SNP. The SNP defines two “alleles,” for which there may be three genotypes in the population: homozygous with T-A on both homologous chromosomes, homozygous with C-G on both homologous chromosomes, or heterozygous with T-A on one chromosome and C-G on the homologous chromosome [23]. The word alleles is enclosed in quotation marks because an SNP need not lie in a coding sequence or even in a gene [24]. Nucleotide sites at which fewer than 1% of the DNA molecules in a population differ are excluded from the SNP definition, because a genetic variant that occurs too rarely in a population is not as useful for genetic analysis as a variant that occurs more frequently [25].
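The three genotypes and the 1% frequency threshold can be made concrete with a short Python sketch. Each genotype is written as a pair of bases, one per homologous chromosome, with T standing for the T-A pair and C for the C-G pair. The sample sizes and function names are hypothetical.

```python
from collections import Counter

def genotype_class(genotype):
    """Homozygous if both chromosomes carry the same allele."""
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"

def allele_frequencies(genotypes):
    """Fraction of chromosomes carrying each allele at this site."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def is_snp(genotypes, min_minor_freq=0.01):
    """A site counts as an SNP only if the rarer allele reaches 1%."""
    freqs = allele_frequencies(genotypes)
    return len(freqs) >= 2 and min(freqs.values()) >= min_minor_freq

# Hypothetical sample of 100 individuals at one site.
common_site = [("T", "T")] * 90 + [("T", "C")] * 9 + [("C", "C")] * 1
# Hypothetical sample of 200 individuals with a single rare variant copy.
rare_site = [("T", "T")] * 199 + [("T", "C")] * 1

print(allele_frequencies(common_site), is_snp(common_site))
print(allele_frequencies(rare_site), is_snp(rare_site))
```

In the first sample the C allele is on 11 of 200 chromosomes (5.5%), so the site qualifies; in the second it is on 1 of 400 (0.25%), so it falls below the 1% threshold.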

B. INDELs (Insertion and Deletion)

SNPs have received extensive research attention compared with other forms of genetic variation in humans [36]. The discovery of INDELs has lagged far behind the discovery of SNPs, and only a small number of them have been found. INDELs can be classified into five major categories: (1) insertions and deletions of a single base pair, (2) monomeric expansions of two to five base pairs, (3) multi-base-pair expansions of a repeat unit of between two and fifteen bases, (4) transposon insertions, and (5) INDELs with random DNA sequences [23].
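A first step toward sorting INDELs into categories like these is to determine whether a variant inserts or deletes bases, how many, and whether the affected bases are copies of a short repeat unit. The sketch below does only that and does not attempt to recognize transposon insertions. It assumes, as an illustrative convention, that the reference and alternate alleles are written left-anchored, so that the shorter allele is a prefix of the longer one. The function names are hypothetical.

```python
def repeat_unit(seq):
    """Smallest unit that, repeated, reproduces seq (e.g. 'CACA' -> 'CA')."""
    for size in range(1, len(seq) + 1):
        if len(seq) % size == 0 and seq[:size] * (len(seq) // size) == seq:
            return seq[:size]

def describe_indel(ref, alt):
    """Return (kind, repeat unit of the changed bases, number of bases)."""
    if len(ref) == len(alt):
        return ("not an INDEL", "", 0)
    if alt.startswith(ref):
        kind, changed = "insertion", alt[len(ref):]
    elif ref.startswith(alt):
        kind, changed = "deletion", ref[len(alt):]
    else:
        # Alleles not left-anchored as assumed; report size only.
        return ("complex", "", abs(len(alt) - len(ref)))
    return (kind, repeat_unit(changed), len(changed))

print(describe_indel("G", "GA"))        # single-base insertion
print(describe_indel("T", "TAAAA"))     # expansion of a one-base unit
print(describe_indel("GCACACA", "GCA")) # loss of two copies of a 'CA' unit
print(describe_indel("A", "G"))         # substitution, not an INDEL
```

A one-base change corresponds to category (1), a changed stretch built from a single repeated base to a monomeric expansion, and a stretch built from a longer unit to a multi-base-pair repeat expansion; anything else would need further analysis.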

C. Copy Number Variation

Genetic association studies often evaluate SNPs, which are differences between individuals of the same species at a particular genomic location. Copy number variations (CNVs) in DNA sequences are also common in natural populations and have functional significance, although that significance is not yet fully understood. Recombination and replication processes, as well as a higher rate of de novo locus-specific mutation than for SNPs, all contribute to the production of CNVs. Hastings et al. [13] described the mechanisms of change, acting through deletions and duplications of chromosomal segments, that result in CNV evolution in humans.

1.5 Diseases and Disorders

A. Multiple Sclerosis

Multiple sclerosis (MS) is the most common non-traumatic disabling disease affecting young adults. Both developed and developing countries are experiencing an increase in MS incidence and prevalence, with no clear cause identified. Ascherio [26] describes MS as a complex disease caused by many genes, as well as several well-described environmental factors, including vitamin D exposure. MS has conventionally been regarded as a T-cell autoimmune disease; in recent years, however, B-cell targeted therapies have challenged this dogma. Traditionally, MS has been considered a two-stage disease involving relapsing–remitting disease and secondary or primary progressive disease caused by delayed neurodegeneration [27]. Myelin provides neural insulation and saltatory conduction, and its loss contributes to neurodegeneration and conduction failure. Several genes have been identified as responsible for Charcot-Marie-Tooth disease (CMT), which is a form of inherited peripheral neuropathy. A component of the age-related progression of MS can be explained by comorbid factors, such as smoking and vascular disease. Patients with such comorbidities are more likely to progress rapidly [28].

B. Klinefelter Syndrome

Klinefelter syndrome (KS) is a common sex chromosome disorder and genetic abnormality [29]. The traditional phenotype of the syndrome, defined as tall stature, small testes, gynecomastia, gynoid hips, sparse body hair, primary hypogonadism, and intellectual disability, has been shown to be rarely observed in the clinical setting (see Figure 1.5.1).

Figure 1.5.1 Signs and symptoms of Klinefelter’s syndrome [30].

The final KS phenotype may develop through a combination of factors, including the severity of the genetic defects. The phenotype varies widely and includes hypergonadotropic hypogonadism, infertility, neurocognitive deficits, and severe comorbidities [30].

C. Down Syndrome

Around 1 in every 600 to 700 live births has Down syndrome (DS), or trisomy 21, which causes intellectual disability. DS is produced by a complete or partial trisomy of the autosomal chromosome 21. The life expectancy of the DS population has increased significantly over the past several decades. Compared with other populations, however, individuals with DS still experience higher mortality rates. The characteristics present in all DS populations are craniofacial abnormalities and hypotonia in early infancy, and there are numerous other conserved features [31]. A baby with DS may have a wide gap between the big toe and the second toe, an abnormal fingerprint pattern, and short fingers. Moreover, Robertsonian translocations and isochromosome or ring chromosome abnormalities can cause the condition. An isochromosome arises during the development of an egg or sperm when the two long arms of a chromosome separate together, rather than the long arm and the short arm separating together. Mosaicism results from an error in cell division that occurs at some point after fertilization. In mosaic DS, two lineages of cells contribute to tissues and organs; one lineage has the usual number of chromosomes, while the other has an extra chromosome 21 [32].

D. Polycystic Ovarian Syndrome

Polycystic ovarian syndrome (PCOS) has an extremely high incidence and prevalence. It affects women worldwide in the early to late reproductive stages, but its prevalence varies widely according to race and ethnicity. For example, South Asians are more likely to suffer from it than Caucasians. Asian women have a higher incidence of PCOS (52% versus 20%–25%) than Western Caucasian women. As reported by the World Health Organization (WHO), PCOS affected 116 million women globally in 2012 (4%–12%) and was predicted to reach 26% by 2020 [33]. Rather than being a disease, PCOS is a disorder characterized by enlarged ovaries containing more than ten cysts. These cysts are thought to be the remnants of follicles that have not matured. As the disorder develops, the wall of the ovary thickens and prevents mature follicles from being released. Anovulation, infertility, and disturbances in the menstrual cycle are all considered symptoms of PCOS [34].

PCOS can be classified into four types (see Figure 1.5.2): (1) Insulin-resistant PCOS: high insulin levels are the main cause of this type. (2) Adrenal PCOS: this condition is caused by adrenal secretions being stimulated during puberty; patients with this type usually experience more stress due to elevated levels of DHEAS. (3) Inflammatory PCOS: patients with this type tend to have low-grade chronic inflammation. (4) Post-pill PCOS: this type arises, for example, as a result of hormonal imbalances and contraceptive pills. Presently, PCOS can be managed with allopathic therapy, herbal therapy, lifestyle changes, and dietary modifications [35].

Figure 1.5.2 Classification of polycystic ovarian syndrome (PCOS) based on different factors [35].

1.6 Future Prospects

One reference genome cannot accurately reflect the diversity of human genetics, given the complexity of genetic variation. Normal genomes from diverse human populations, but especially those of African origin, who have the most genetic variety, must be sequenced and assembled [36].

Several new reference genomes are being produced with the help of long-read DNA sequencing in combination with short-read error correction. At the current discovery rate, sequencing 300 human genomes in this manner will double the number of structural variants identified at the DNA sequence level and will, in theory, capture the majority of common structural variants [37]. Sequence resolution of structural variation allows better genotyping of such alleles in the millions of Illumina genomes that have already been generated, enabling the discovery of new associations [38].

A number of pathogenic variants are located outside coding regions, where they can have a negative effect on gene expression and translation. Although exome sequencing is less expensive and allows greater sample sizes and power, it provides little information on regulatory changes and hampers the discovery of smaller structural variants, even within coding sequences [39]. It is challenging to identify harmful variants among noncoding mutations. It may, however, be possible to understand noncoding regulatory mutations and their contribution to common and rare genetic diseases if structural variants are detected systematically, as they are more likely than single-nucleotide variants (SNVs) to disrupt function and alter gene expression [40]. Whole-genome sequencing is crucial for detecting structural variation. Fully phased long-read genome sequences are expected to yield 2.8 times as many structural variants as Illumina whole-genome sequencing, plus a 30% boost over long-read callers that do not phase. In other words, rather than a 3-Gbp genome, we should consider a 6-Gbp genome, in which both parental haplotypes are fully sequenced and assembled [41].

Before variants can be interpreted, it is essential to understand the genetic basis of de novo mutations, DNA sequence conservation between species, and selection at a given locus. Methods that use these factors have been successful in identifying potentially pathogenic variants [42]. They require, however, that variation both within and between species be equally well described. Non-human primate, mammalian, and vertebrate genomes must be sequenced with the same rigor as human genomes because orthologous DNA sequences have historically been poorly aligned and have mutation rates that differ by orders of magnitude [43].

Conventional human genome analysis using short-read sequencing data captures only around 85% of the genome and ignores some of the most variable regions, excluding them from association testing [44, 45]. The goal is simple: to describe all human chromosomes from telomere to telomere, including acrocentric, telomeric, centromeric, and segmentally duplicated DNA. Sequencing platforms for long reads and ultralong reads are bringing this goal within reach [46].

1.7 Conclusion

This chapter gives a detailed review of genomics and genetics. It describes the continuum between genomics and genetics in the diagnosis and treatment of disease. In the future, genetic information may allow a disease to be diagnosed at a very early stage and treated accordingly. Researchers need to become active and skilled in interpreting genetic information that could be used in medicine and public health. Researchers, academic professionals, and organizations should work together to conduct the necessary research.

References

1. Jain, M., Olsen, H.E., Turner, D.J., Stoddart, D., Bulazel, K.V., Paten, B., Miga, K.H., Linear assembly of a human centromere on the Y chromosome. Nat. Biotechnol., 36, 4, 321–323, 2018.

2. Vollger, M.R., Dishuck, P.C., Sorensen, M., Welch, A.E., Dang, V., Dougherty, M.L., Eichler, E.E., Long-read sequence and assembly of segmental duplications. Nat. Methods, 16, 1, 88–94, 2019.

3. Kronenberg, Z.N., Fiddes, I.T., Gordon, D., Murali, S., Cantsilieris, S., Meyerson, O.S., Eichler, E.E., High-resolution comparative analysis of great ape genomes. Science, 360, 6393, eaar6343, 2018.

4. Kircher, M., Witten, D.M., Jain, P., O’Roak, B.J., Cooper, G.M., Shendure, J., A general framework for estimating the relative pathogenicity of human genetic variants. Nat. Genet., 46, 3, 310–315, 2014.

5. Ebert, P., Audano, P.A., Zhu, Q., Rodriguez-Martin, B., Porubsky, D., Bonder, M.J., Eichler, E.E., Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science, 372, 6537, eabf7117, 2021.

6. Koren, S., Rhie, A., Walenz, B.P., Dilthey, A.T., Bickhart, D.M., Kingan, S.B., Phillippy, A.M., De novo assembly of haplotype-resolved genomes with trio binning. Nat. Biotechnol., 36, 12, 1174–1182, 2018.

7. Turner, T.N., Coe, B.P., Dickel, D.E., Hoekzema, K., Nelson, B.J., Zody, M.C., Eichler, E.E., Genomic patterns of de novo mutation in simplex autism. Cell, 171, 3, 710–722, 2017.

8. Audano, P.A., Sulovari, A., Graves-Lindsay, T.A., Cantsilieris, S., Sorensen, M., Welch, A.E., Eichler, E.E., Characterizing the major structural variant alleles of the human genome. Cell, 176, 3, 663–675, 2019.

9. McClellan, J.M., Lehner, T., King, M.C., Gene discovery for complex traits: Lessons from Africa. Cell, 171, 2, 261–264, 2017.

10. Huddleston, J., Chaisson, M.J., Steinberg, K.M., Warren, W., Hoekzema, K., Gordon, D., Eichler, E.E., Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Res., 27, 5, 677–685, 2017.

11. Conrad, D.F., Pinto, D., Redon, R., Feuk, L., Gokcumen, O., Zhang, Y., Hurles, M.E., Origins and functional impact of copy number variation in the human genome. Nature, 464, 7289, 704–712, 2010.

12. Stankiewicz, P. and Lupski, J.R., Structural variation in the human genome and its role in disease. Annu. Rev. Med., 61, 437–455, 2010.

13. Hastings, P.J., Lupski, J.R., Rosenberg, S.M., Ira, G., Mechanisms of change in gene copy number. Nat. Rev. Genet., 10, 8, 551–564, 2009.

14. Belsky, D.W., Moffitt, T.E., Corcoran, D.L., Domingue, B., Harrington, H., Hogan, S., Caspi, A., The genetics of success: How single-nucleotide polymorphisms associated with educational attainment relate to life-course development. Psychol. Sci., 27, 7, 957–972, 2016.

15. Belsky, D.W., Moffitt, T.E., Corcoran, D.L., Domingue, B., Harrington, H., Hogan, S., Caspi, A., The genetics of success: How single-nucleotide polymorphisms associated with educational attainment relate to life-course development. Psychol. Sci., 27, 7, 957–972, 2016.

16. Michailidou, K., Lindström, S., Dennis, J., Beesley, J., Hui, S., Kar, S., Humphreys, K., Association analysis identifies 65 new breast cancer risk loci. Nature, 551, 7678, 92–94, 2017.

17. Berger, J., Suzuki, T., Senti, K.A., Stubbs, J., Schaffner, G., Dickson, B.J., Genetic mapping with SNP markers in Drosophila. Nat. Genet., 29, 4, 475–481, 2001.

18. Wicks, S.R., Yeh, R.T., Gish, W.R., Waterston, R.H., Plasterk, R.H., Rapid gene mapping in Caenorhabditis elegans using a high density polymorphism map. Nat. Genet., 28, 2, 160–164, 2001.

19. Dawson, E., Chen, Y., Hunt, S., Smink, L.J., Hunt, A., Rice, K., Dunham, I., A SNP resource for human chromosome 22: Extracting dense clusters of SNPs from the genomic sequence. Genome Res., 11, 1, 170–178, 2001.

20. Hendriksen, R.S., Bortolaia, V., Tate, H., Tyson, G.H., Aarestrup, F.M., McDermott, P.F., Using genomics to track global antimicrobial resistance. Front. Public Health, 7, 242, 2019.

21. Malla, M.A., Dubey, A., Kumar, A., Yadav, S., Hashem, A., Abd_Allah, E.F., Exploring the human microbiome: The potential future role of next-generation sequencing in disease diagnosis and treatment. Front. Immunol., 9, 2868, 2019.

22. Shokralla, S., Spall, J.L., Gibson, J.F., Hajibabaei, M., Next-generation sequencing technologies for environmental DNA research. Mol. Ecol., 21, 8, 1794–1805, 2012.

23. Mahmoud, A.M., An overview of epigenetics in obesity: The role of lifestyle and therapeutic interventions. Int. J. Mol. Sci., 23, 3, 1341, 2022.

24. Agustí, A., Melén, E., DeMeo, D.L., Breyer-Kohansal, R., Faner, R., Pathogenesis of chronic obstructive pulmonary disease: Understanding the contributions of gene–environment interactions across the lifespan. Lancet Respir. Med., 10, 512–524, 2022.

25. Chaste, P. and Leboyer, M., Autism risk factors: Genes, environment, and gene-environment interactions. Dialogues Clin. Neurosci., 14, 281–292, 2022.

26. Ascherio, A., Environmental factors in multiple sclerosis. Expert Rev. Neurother., 13(sup2), 3–9, 2013.

27. Zahid, G., Aka Kaçar, Y., Dönmez, D., Küden, A., Giordani, T., Perspectives and recent progress of genome-wide association studies (GWAS) in fruits. Mol. Biol. Rep., 49, 6, 5341–5352, 2022.

28. McArthur, E., Rinker, D.C., Gilbertson, E.N., Fudenberg, G., Pittman, M., Keough, ... K., Capra, J.A., Reconstructing the 3D genome organization of Neanderthals reveals that chromatin folding shaped phenotypic and sequence divergence. bioRxiv, 2022, 2022-02.

29. Pudlo, N.A., Urs, K., Crawford, R., Pirani, A., Atherly, T., Jimenez, R., Martens, E.C., Phenotypic and genomic diversification in complex carbohydrate-degrading human gut bacteria. mSystems, 7, 1, e00947–21, 2022.

30. O’Roak, B.J., Vives, L., Girirajan, S., Karakoc, E., Krumm, N., Coe, B.P., Eichler, E.E., Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature, 485, 7397, 246–250, 2012.

31. Hendriksen, R.S., Bortolaia, V., Tate, H., Tyson, G.H., Aarestrup, F.M., McDermott, P.F., Using genomics to track global antimicrobial resistance. Front. Public Health, 7, 242, 2019.

32. Koren, S., Rhie, A., Walenz, B.P. et al., De novo assembly of haplotype-resolved genomes with trio binning. Nat. Biotechnol., 36, 12, 1174–1182, 2018.

33. Gilman, S.R., Iossifov, I., Levy, D., Ronemus, M., Wigler, M., Vitkup, D., Rare de novo variants associated with autism implicate a large functional network of genes involved in formation and function of synapses. Neuron, 70, 5, 898–907, 2011.

34. Greenfield, A.L. and Hauser, S.L., B-cell therapy for multiple sclerosis: Entering an era. Ann. Neurol., 83, 1, 13–26, 2018.

35. Coles, A.J., Cox, A., Le Page, E. et al., The window of therapeutic opportunity in multiple sclerosis: Evidence from monoclonal antibody therapy. J. Neurol., 253, 98–108, 2006.

36. Handel, A.E., Williamson, A.J., Disanto, G., Dobson, R., Giovannoni, G., Ramagopalan, S.V., Smoking and multiple sclerosis: An updated meta-analysis. PLoS One, 6, 1, e16149, 2011.

37. Gravholt, C.H., Chang, S., Wallentin, M., Fedder, J., Moore, P., Skakkebaek, A., Klinefelter syndrome - integrating genetics, neuropsychology and endocrinology. Endocr. Rev., 39, 389–423, 2018.

38. Bonomi, M., Rochira, V., Pasquali, D., Balercia, G., Jannini, E.A., Ferlin, A., Klinefelter syndrome (KS): Genetics, clinical phenotype and hypogonadism. J. Endocrinol. Invest., 40, 123–134, 2017.

39. De Graaf, G., Buckley, F., Skotko, B.G., Estimation of the number of people with Down syndrome in the United States. Genet. Med., 19, 4, 439–447, 2017.

40. Ellegren, H., Smeds, L., Burri, R., Olason, P.I., Backström, N., Kawakami, T., Wolf, J.B., The genomic landscape of species divergence in Ficedula flycatchers. Nature, 491, 7426, 756–760, 2012.

41. Ellegren, H., Genome sequencing and population genomics in non-model organisms. Trends Ecol. Evol., 29, 1, 51–63, 2014.

42. Correale, J. and Gaitan, M.I., Multiple sclerosis and environmental factors: The role of vitamin D, parasites, and Epstein–Barr virus infection. Acta Neurol. Scand., 132, 46–55, 2015.

43. Santos, F., Gómez-Manzo, S., Sierra-Palacios, E., González-Valdez, A., Castillo-Villanueva, A., Reyes-Vivas, H., Marcial-Quino, J., Purification, concentration and recovery of small fragments of DNA from Giardia lamblia and their use for other molecular techniques. MethodsX, 4, 289–296, 2017.

44. Vandin, F., Upfal, E., Raphael, B.J., Algorithms for detecting significantly mutated pathways in cancer. J. Comput. Biol., 18, 507–522, 2011.

45. Crossley, B.M., Bai, J., Glaser, A., Maes, R., Porter, E., Killian, M.L., Clement, T., Toohey-Kurth, K., Guidelines for Sanger sequencing and molecular assay monitoring. J. Vet. Diagn. Invest.: Off. Publ. Am. Assoc. Veterinary Lab. Diagnosticians Inc., 32, 6, 767–775, 2020. https://doi.org/10.1177/1040638720905833

46. Hoehe, M.R. and Morris-Rosendahl, D.J., The role of genetics and genomics in clinical psychiatry. Dialogues Clin. Neurosci., 20, 169–177, 2022.