The only book available in the area of forward-time population genetics simulations, applicable to both biomedical and evolutionary studies.
The rapid increase of the power of personal computers has led to the use of serious forward-time simulation programs in genetic studies. Forward-Time Population Genetics Simulations presents both new and commonly used methods, and introduces simuPOP, a powerful and flexible new program that can be used to simulate arbitrary evolutionary processes with unique features like customized chromosome types, arbitrary nonrandom mating schemes, virtual subpopulations, information fields, and Python operators.
The book begins with an overview of important concepts and models, then goes on to show how simuPOP can simulate a number of standard population genetics models, with the goal of demonstrating the impact of genetic factors such as mutation, selection, and recombination on standard Wright-Fisher models. The rest of the book is devoted to applications of forward-time simulations in various research topics.
Forward-Time Population Genetics Simulations includes:
* An overview of currently available forward-time simulation methods, their advantages, and shortcomings
* An overview and evaluation of currently available software
* A simuPOP tutorial
* Applications in population genetics
* Applications in genetic epidemiology, statistical genetics, and mapping complex human diseases
The only book of its kind in the field today, Forward-Time Population Genetics Simulations will appeal to researchers and students of population and statistical genetics.
Number of pages: 330
Year of publication: 2012
Contents
Cover
Title Page
Copyright
Preface
Acknowledgments
List of Examples
Chapter 1: Basic Concepts and Models
1.1 Biological and genetic concepts
1.2 Population and evolutionary genetics
1.3 Statistical genetics and genetic epidemiology
References
Chapter 2: Simulation of Population Genetics Models
2.1 Random genetic drift
2.2 Demographic models
2.3 Mutation
2.4 Migration
2.5 Recombination and Linkage Disequilibrium
2.6 Natural selection
2.7 Genealogy of forward-time simulations
References
Chapter 3: Ascertainment Bias in Population Genetics
3.1 Introduction
3.2 Methods
3.3 Results
3.4 Discussion and Conclusions
References
Chapter 4: Observing Properties of Evolving Populations
4.1 Introduction
4.2 Simulation of the Evolution of Allele Spectra
4.3 Extensions to the Basic Model
References
Chapter 5: Simulating Populations with Complex Human Diseases
5.1 Introduction
5.2 Controlling disease allele frequencies at the present generation
5.3 Forward-time simulation of realistic samples
5.4 Discussion
References
Chapter 6: Nonrandom Mating and its Applications
6.1 Assortative mating
6.2 More complex nonrandom mating schemes
6.3 Heterogeneous mating schemes
6.4 Simulation of age-structured populations
References
Appendix: Forward-Time Simulations Using SimuPOP
A.1 Introduction
A.2 Population
A.3 Operators
A.4 Evolving one or more populations
A.5 A complete simuPOP script
References
Index
Copyright 2012 by Wiley-Blackwell. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services please contact our Customer Care Department within the U.S. at 877-762-2974, outside the U.S. at 317-572-3993 or fax 317-572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print, however, may not be available in electronic format.
Library of Congress Cataloging-in-Publication Data:
Peng, Bo, 1974-
Forward-time population genetics simulations: methods, implementation, and applications / Bo Peng, Marek Kimmel, Christopher I. Amos.
p.; cm.
Includes bibliographical references and index.
ISBN 978-0-470-50348-5 (pbk.)
I. Kimmel, Marek, 1959- II. Amos, Christopher I. III. Title.
[DNLM: 1. Genetics, Population. 2. Biological Evolution. 3. Computer Simulation. 4. Models, Genetic. QU 450]
LC-classification not assigned
576.5′8–dc23
2011033593
To Zheng, Benjamin, William, and Elena
Preface
Forward-time population genetics simulation is simple in concept. Given a population of individuals with certain genotypes, we evolve the population generation by generation, subject to various demographic and genetic forces such as population size change, mutation, selection, recombination, and migration. Population properties such as allele frequencies can be observed dynamically or studied at the end of the simulation. Because this process mimics the fundamental ways in which human populations evolve, it is not surprising that such simulations have been used for decades and have played an important role in the development and application of population and evolutionary genetics. However, due to the overwhelming demand for computing power in realistic population simulations, the applications of this simulation method have largely been limited to the development and demonstration of theoretical population genetics principles.
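To make the generation-by-generation idea concrete, the following is a minimal sketch in plain Python, not simuPOP code and not taken from the book. It assumes a single diallelic locus in a diploid population of constant size, random mating implemented as sampling alleles with replacement from the parental gene pool, and symmetric mutation between the two alleles. The function name `evolve_wright_fisher` and all of its parameters are hypothetical and chosen only for illustration.

```python
import random


def evolve_wright_fisher(pop_size, init_freq, mutation_rate, generations, seed=None):
    """Evolve one diallelic locus forward in time.

    Returns the frequency of allele 1 at generation 0 and after each
    subsequent generation (a list of length generations + 1).
    """
    rng = random.Random(seed)
    num_copies = 2 * pop_size  # diploid individuals carry two allele copies

    # Initial gene pool: a list of 0/1 alleles with the requested frequency.
    count = round(init_freq * num_copies)
    pool = [1] * count + [0] * (num_copies - count)
    trajectory = [count / num_copies]

    for _ in range(generations):
        # Random mating (genetic drift): every offspring allele is a copy
        # of a randomly chosen parental allele.
        offspring = rng.choices(pool, k=num_copies)
        # Symmetric mutation: each copy flips with probability mutation_rate.
        pool = [1 - a if rng.random() < mutation_rate else a for a in offspring]
        # Observe a population property (allele frequency) every generation.
        trajectory.append(sum(pool) / num_copies)

    return trajectory


if __name__ == "__main__":
    freqs = evolve_wright_fisher(pop_size=500, init_freq=0.2,
                                 mutation_rate=1e-4, generations=100, seed=42)
    print("final frequency of allele 1:", freqs[-1])
```

Other forces mentioned above, such as selection, migration, or population size change, would enter as additional steps inside the loop; this sketch deliberately leaves them out.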
Recent years have witnessed renewed attention to this old subject. Rapid developments in both methodology and software have made forward-time population genetics simulation a promising tool to study complex evolutionary histories of different types of populations, with novel applications in the areas of population and evolutionary genetics, statistical genetics, genetic epidemiology, and even conservation biology. The revival of this simulation method can be largely attributed to two forces. The first is a strong need for a highly flexible simulation method to simulate and study complex evolutionary histories. Although a large number of specialized methods are available, none of them is as flexible as forward-time simulations because forward-time simulations follow the direction in which populations evolve and can, at least in principle, simulate arbitrarily complex evolutionary scenarios. The second driving force is the continuous increase of the power of personal computers, which makes it possible to simulate millions of individuals for extended generations in a reasonable amount of time.
The fundamental advantage of forward-time simulations over other simulation methods is flexibility. Because this method is not restricted by any assumption, it can be used to simulate arbitrarily complex evolutionary scenarios. However, despite the availability of a large number of simulation programs, very few of them can harness the full power of this simulation method. A typical forward-time simulation program is designed to simulate particular evolutionary processes for particular types of studies. Users are usually allowed to choose from a number of stock genetic models and their parameters, but are not allowed to define their own evolutionary processes. For example, none of the existing programs can be used to study the evolution of a disease-predisposing mutant, a process that is of great importance in statistical genetics and genetic epidemiology. Researchers who work on novel evolutionary models or new application areas without existing software are usually forced to write their own software.
The implementation of simuPOP was motivated by studies of the evolutionary history of complex human diseases. Instead of a special-purpose program written for a few publications, this program was designed from the ground up to be a general-purpose population genetics simulation program that can be used to simulate arbitrary evolutionary processes. Using a scripting language design, users of simuPOP can make use of many of its unique features, such as customized chromosome types, arbitrary nonrandom mating schemes, virtual subpopulations, information fields, and Python operators, to construct and study almost arbitrarily complex evolutionary scenarios. This unique design makes simuPOP the best and in many aspects the only software package for the simulation of complex evolutionary scenarios. Although some evolutionary scenarios could be simulated using other software packages, this book uses simuPOP to simulate all examples and lists the source code of most examples so that users can learn how to implement various evolutionary scenarios and write their own simulations based on these examples. Note that although we describe most major features of simuPOP in the appendix of this book, this book is not a complete reference to simuPOP. Readers who would like to write complex scripts in simuPOP should refer to the simuPOP user's guide and reference manual for details.
Chapter 1 of this book gives an overview of important concepts and models that will be used in this book. Because of the sheer number of concepts and models involved, they are introduced in a brief and often casual way. Interested readers should refer to standard textbooks on these subjects for more in-depth descriptions.
Chapter 2 simulates a number of standard population genetics models using a forward-time approach. The goal of these simulations is to demonstrate the impact of genetic factors such as mutation, selection, and recombination on standard Wright–Fisher models and how to use simuPOP to simulate them. Because detailed descriptions of these models are widely available in textbooks such as Principles of Population Genetics [1], we describe these models and their theoretical properties briefly, only as a way to motivate our simulations. Although simulations in this chapter are confirmatory in nature, they could be used to set up more complex evolutionary scenarios in which more than one genetic factor would be applied.
The rest of this book is devoted to applications of forward-time simulations in various research topics. Each chapter starts with a short description of the research topic and why forward-time simulations are used. The simulation processes are then described in detail. Because the primary focus of this book is on simulation techniques and not on particular research topics, we will present and discuss the results of these simulations briefly, leaving in-depth discussions to published papers on these topics. The simuPOP scripts that are used to perform all simulations are listed in the last sections of these chapters. Readers who are not interested in implementation details can safely skip these sections.
With the continued increase of the power of personal computers and the availability of a powerful and flexible simulation engine, a wide range of interesting research topics could be attacked by forward-time population genetics simulations. We hope that this book can help researchers who are interested in such simulations design and implement their own. We welcome any comments and discussions and would appreciate readers alerting us to any errors they discover in this book.
Bo Peng
Houston, Texas
2011
Acknowledgments
The work covered in this book, especially the design and implementation of simuPOP, was done when the first author was a PhD student in the Department of Statistics at Rice University and a postdoctoral fellow in the Department of Epidemiology at the University of Texas, M. D. Anderson Cancer Center. The helpful and supportive comments of faculty and fellow students of the departments are hereby acknowledged.
A number of colleagues and students have helped in the development of simuPOP and in the writing of this book in various ways. Yaji Xu, a graduate research assistant, spent a lot of time on the documentation of simuPOP. His hard work during the summer of 2007 resulted in the first simuPOP release (0.8.0) that has a comprehensive online help system and a complete reference manual. Biao Li, a doctoral candidate in the Department of Bioengineering at Rice University, has helped in the development of allele frequency trajectory simulation functions and pedigree-related features of simuPOP and has written and executed some of the simulations for this book, especially the ones for Chapter 3. He also helped with the preparation of the bibliography and many figures of the book. Jianzhong Ma, PhD, read through the draft of this book and provided many useful suggestions. A high school student, Blake Kushwaha, helped proofread this book. They all deserve our sincere appreciation.
Numerous technical problems were encountered during the design and implementation of simuPOP and we relied on various online forums for help. We would especially like to thank the Python and SWIG (Simplified Wrapper and Interface Generator, http://www.swig.org) user communities, whose prompt replies to many e-mails were essential to the implementation of simuPOP.
User involvement was modest until early 2007, but has since then driven the development of simuPOP. Questions, bug reports, and feature requests from users have greatly enhanced the reliability and usability of this program and have led to the addition of many important features such as information fields and virtual subpopulations. One of the users, Tiago Antão, deserves special thanks for his many bug reports and his contribution to the simuPOP online cookbook.
The development of a large software application such as simuPOP required a huge amount of time, much of which had to be drawn from time I should have spent with my wife Zheng Meng and our three children Benjamin, William, and Elena. Their support during the past several years allowed me to pursue a career that I really enjoy, but that has required many extra hours under the moonlight. I would like to dedicate this book to them.
Part of Bo Peng's research was supported by a training fellowship from the W.M. Keck Foundation to the Gulf Coast Consortia through the Keck Center for Computational and Structural Biology, and a Cancer Prevention Fellowship provided by the Jerry and Maury Rubenstein Foundation through the University of Texas, M.D. Anderson Cancer Center. Related research activities for all authors were partly supported by grant CA75432 from the National Cancer Institute, by grants ES09912 and R01CA133996-01 from the National Institutes of Health, and by grant 3T11F 01029 from Komitet Badań Naukowych (Polish Research Committee). Most of the simulations were performed using the Rice Terascale Cluster, funded by the National Science Foundation under grant EIA-0216467, by Intel, and by HP, and using the High Performance Cluster at the M.D. Anderson Cancer Center.
Bo Peng
List of Examples
2.1 Decay of Homozygosity Due to Random Genetic Drift
2.2 Absorption Time and Time to Fixation
2.3 Demonstration of a Bottleneck Effect
2.4 Diallelic Mutation Model
2.5 k-Allele and Stepwise Mutation Models
2.6 An Island Model of Migration
2.7 Recombination Between Three Loci
2.8 Single-Locus Diallelic Selection Models
2.9 A Two-Locus Symmetric Viability Model of Natural Selection
2.10 Number of Ancestors of a Haploid Simulation
2.11 Number of Ancestors of a Diploid Simulation
3.1 Script to Simulate the Evolution of Microsatellite Marker Using a Scaling Technique
4.1 A Demographic Model with Population Split and Rapid Population Expansion
4.2 A Python Class That Defines Instant and Exponential Population Expansion Models
4.3 A Python Function to Calculate Effective Number of Alleles
4.4 A Self-Defined Operator to Calculate Effective Number of Alleles
4.5 Simulation of Multiple Independent Single-Locus Selection Models
4.6 Evolve a Population Subject to Mutation and Selection
5.1 Straightforward Simulation of the Introduction of a Disease Allele
5.2 Reintroduction of a Disease Allele When It Is Lost
5.3 Simulating Allele Frequency Trajectory Forward in Time
5.4 Simulating Allele Frequency Trajectory Backward in Time
5.5 Using a Controlled Mating Scheme with a Backward-Simulated Trajectory
5.6 Simulation of Populations with Realistic Pattern of Linkage Disequilibrium
5.7 A Single-Locus Penetrance Model of Breast Cancer
5.8 Draw Case–Control Samples
5.9 Generating Case–Control Samples
5.10 Generating Affected Sibpair Samples
5.11 Simulation of a Disease Model with G × G and G × E Interactions
5.12 Generation of Trio Samples from Simulated Population
6.1 A Quantitative Trait Model
6.2 A Sequential Selfing Mating Scheme
6.3 An Example of Assortative Mating
6.4 Simulation of Mating Behaviors of Pilot Whales
6.5 A Mating Scheme with Continuous Habitat
6.6 Use of a Heterogeneous Mating Scheme to Simulate Partial Self-Fertilization
6.7 Simulating an Admixed Population with Recorded Ancestral Values
6.8 Example of the Evolution of Age-Structured Population
6.9 Implementation of the Lung Cancer Disease Model
6.10 Evolution of Lung Cancer
A.1 A Simple Example
A.2 Access Genotype Structure of a Population
A.3 Define and Use Virtual Subpopulations
A.4 Access to Individuals in a Population
A.5 Use of Population Modification and Batch Access Functions
A.6 Keeping Multiple Ancestral Generations During an Evolutionary Process
A.7 Recording Parentship of Individuals During Evolution
A.8 Applicable Generations of an Operator
A.9 Use of Parameter Output of Operators to Redirect Operator Output
A.10 Use of an inheritTagger to Track Individual Ancestry
A.11 An Asymmetric Stepwise Mutation Model with Random Steps
A.12 Use of a Python Operator to Draw Sample at Every 100 Generations
A.13 Control of the Number and Sex of Offspring in a Monogamous Mating Scheme
A.14 Use a Terminator to Terminate an Evolutionary Process Conditionally
A.15 Evolve Several Replicates of a Population Simultaneously
A.16 A Sample simuPOP Script
Chapter 1
Basic concepts and models
The simulation approaches that are described in this book involve knowledge from several disciplines. First, the genes and genomes are the targets of simulations, so some understanding of biology and genetics is needed. Then, the simulations involve the evolution of a collection of individuals over a long period of time, and we are concerned with the dynamics of the properties of the whole population rather than with a small number of individuals. This involves knowledge of population and evolutionary genetics. Finally, as the most important application area, we will simulate the evolution of human diseases and produce populations with affected individuals. Techniques from statistical genetics and genetic epidemiology will be used to locate genes that are responsible for the diseases.
This chapter reviews basic concepts and, more importantly, various mathematical models that will be used in this book, organized by discipline. To target the most essential components, these concepts are often defined in a casual way that may not reflect their full biological or statistical complexity. For more in-depth descriptions and concrete examples, the reader should refer to standard textbooks on these topics [1–4]. Readers who are already familiar with one or more of the disciplines can skip the relevant sections.
1.1 Biological and genetic concepts
1.1.1 Genome and Chromosomes
