Statistical Diagnostics for Cancer (E-Book)

Description

This ready reference discusses different methods for statistically analyzing and validating data created with high-throughput methods. As opposed to other titles, this book focuses on systems approaches, meaning that no single gene or protein forms the basis of the analysis but rather a more or less complex biological network. From a methodological point of view, the well-balanced contributions describe a variety of modern supervised and unsupervised statistical methods applied to various large-scale datasets from genomics and genetics experiments. Furthermore, since the availability of sufficient computer power in recent years has shifted attention from parametric to nonparametric methods, the methods presented here make use of such computer-intensive approaches as the bootstrap, Markov chain Monte Carlo, or general resampling methods. Finally, due to the large amount of information available in public databases, a chapter on Bayesian methods is included, which also provides a systematic means to integrate this information. A welcome guide for mathematicians and the medical and basic research communities.


Number of pages: 527

Year of publication: 2012




Contents

Cover

Titles of the Series: “Quantitative and Network Biology”

Related Titles

Title Page

Copyright

Preface

References

List of Contributors

Part One: General Overview

Chapter 1: Control of Type I Error Rates for Oncology Biomarker Discovery with High-Throughput Platforms

1.1 Brief Summary

1.2 Introduction

1.3 High-Throughput Platforms

1.4 Analysis of Experiments

1.5 Multiple Testing Type I Errors

1.6 Discussion

1.7 Perspective

References

Chapter 2: Overview of Public Cancer Databases, Resources, and Visualization Tools

2.1 Brief Overview

2.2 Introduction

2.3 Different Cancer Types are Genetically Related

2.4 Incidence and Mortality Rates of Cancer

2.5 Cancer and Disorder Databases

2.6 Visualization and Network-Based Analysis Tools

2.7 Conclusions

2.8 Perspective

References

Part Two: Bayesian Methods

Chapter 3: Discovery of Expression Signatures in Chronic Myeloid Leukemia by Bayesian Model Averaging

3.1 Brief Introduction

3.2 Chronic Myeloid Leukemia (CML)

3.3 Variable Selection on Gene Expression Data

3.4 Bayesian Model Averaging (BMA)

3.5 Case Study: CML Progression Data

3.6 The Power of iBMA

3.7 Laboratory Validation

3.8 Conclusions

3.9 Perspective

3.10 Publicly Available Resources

Acknowledgments

References

Chapter 4: Bayesian Ranking and Selection Methods in Microarray Studies

4.1 Brief Summary

4.2 Introduction

4.3 Hierarchical Mixture Modeling and Empirical Bayes Estimation

4.4 Ranking and Selection Methods

4.5 Simulations

4.6 Application

4.7 Concluding Remarks

4.8 Perspective

4.9 Appendix: The EM Algorithm

References

Chapter 5: Multiclass Classification via Bayesian Variable Selection with Gene Expression Data

5.1 Brief Summary

5.2 Introduction

5.3 Matrix Variate Distribution

5.4 Method

5.5 Real Data Analysis

5.6 Discussion

5.7 Perspective

References

Chapter 6: Semisupervised Methods for Analyzing High-dimensional Genomic Data

6.1 Brief Summary

6.2 Motivation

6.3 Existing Approaches

6.4 Data Application: Mesothelioma Cancer Data Set

6.5 Perspective

References

Part Three: Network-Based Approaches

Chapter 7: Colorectal Cancer and Its Molecular Subsystems: Construction, Interpretation, and Validation

7.1 Brief Summary

7.2 Colon Cancer: Etiology

7.3 Colon Cancer: Development

7.4 The Pathway Paradigm

7.5 Cancer Subtypes and Therapies

7.6 Molecular Subsystems: Introduction

7.7 Molecular Subsystems: Construction

7.8 Molecular Subsystems: Interpretation

7.9 Molecular Subsystems: Validation

7.10 Worked Example: Label-Free Proteomics

7.11 Conclusions

7.12 Perspective

References

Chapter 8: Network Medicine: Disease Genes in Molecular Networks

8.1 Brief Summary

8.2 Introduction

8.3 Genetic Architecture of Human Diseases

8.4 Systems Properties of Disease Genes

8.5 Disease Gene Prioritization

8.6 Conclusion

8.7 Perspectives

8.8 Acknowledgments

References

Chapter 9: Inference of Gene Regulatory Networks in Breast and Ovarian Cancer by Integrating Different Genomic Data

9.1 Brief Summary

9.2 Introduction

9.3 Theory and Contents of Gene Regulatory Network

9.4 Inference of Gene Regulatory Networks in Human Cancer

9.5 Conclusions

9.6 Perspective

References

Chapter 10: Network-Module-Based Approaches in Cancer Data Analysis

10.1 Brief Summary

10.2 Introduction

10.3 Notation and Terminology

10.4 Network Modules Containing Functionally Similar Genes or Proteins

10.5 Network Module Searching Methods

10.6 Applications of Network-Module-Based Approaches in Cancer Studies

10.7 The Reactome FI Cytoscape Plug-in

10.8 Conclusions

10.9 Perspective

References

Chapter 11: Discriminant and Network Analysis to Study Origin of Cancer

11.1 Brief Summary

11.2 Introduction

11.3 Overview of Relevant Machine Learning Techniques

11.4 Methods

11.5 Experiments and Results

11.6 Conclusion

11.7 Perspective

References

Chapter 12: Intervention and Control of Gene Regulatory Networks: Theoretical Framework and Application to Human Melanoma Gene Regulation

12.1 Brief Summary

12.2 Gene Regulatory Network Models

12.3 Intervention in Gene Regulatory Networks

12.4 Optimal Perturbation Control of Gene Regulatory Networks

12.5 Human Melanoma Gene Regulatory Network

12.6 Perspective

References

Part Four: Phenotype Influence of DNA Copy Number Aberrations

Chapter 13: Identification of Recurrent DNA Copy Number Aberrations in Tumors

13.1 Introduction

13.2 Genetic Background

13.3 Analyzing DNA Copy Number: Single Sample Methods

13.4 Analyzing DNA Copy Number Data: Multiple Sample Methods to Detect Recurrent CNAs

13.5 Analyzing DNA Copy Number Data with DiNAMIC

13.6 Open Questions

References

Chapter 14: The Cancer Cell, Its Entropy, and High-Dimensional Molecular Data

14.1 Brief Summary

14.2 Introduction

14.3 Background

14.4 Entropy Increase

14.5 Statistical Arguments

14.6 Statistical Methodology

14.7 Simulation

14.8 Application to Cancer Data

14.9 Conclusion

14.10 Perspective

14.11 Software

Acknowledgment

References

Index

Titles of the Series

“Quantitative and Network Biology”

Volume 1

Dehmer, M., Emmert-Streib, F., Graber, A., Salvador, A. (eds.)

Applied Statistics for Network Biology

Methods in Systems Biology

2011

ISBN: 978-3-527-32750-8

Volume 2

Dehmer, M., Varmuza, K., Bonchev, D. (eds.)

Statistical Modelling of Molecular Descriptors in QSAR/QSPR

2012

ISBN: 978-3-527-32434-7

Related Titles

Zhou, X.-H., Obuchowski, N. A., McClish, D. K.

Statistical Methods in Diagnostic Medicine

2011

ISBN: 978-0-470-18314-4

Azuaje, F.

Bioinformatics and Biomarker Discovery

“Omic” Data Analysis for Personalized Medicine

2010

ISBN: 978-0-470-74460-4

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty can be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Card No.: applied for

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library.

Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

©2013 Wiley-VCH Verlag GmbH & Co. KGaA, Boschstr. 12, 69469 Weinheim, Germany

All rights reserved (including those of translation into other languages). No part of this book may be reproduced in any form – by photoprinting, microfilm, or any other means – nor transmitted or translated into a machine language without written permission from the publishers. Registered names, trademarks, etc. used in this book, even when not specifically marked as such, are not to be considered unprotected by law.

Print ISBN: 978-3-527-33262-5

ePDF ISBN: 978-3-527-66544-0

ePub ISBN: 978-3-527-66545-7

mobi ISBN: 978-3-527-66546-4

oBook ISBN: 978-3-527-66547-1

Typesetting Thomson Digital, Noida, India

Cover Design Grafik-Design Schulz, Fußgönheim

Preface

The data revolution in biology and medicine not only provides opportunities to enhance our fundamental understanding of biological processes, patho- and tumorigenesis, and epidemiology but also poses a considerable challenge for the analysis of the resulting data. For this reason, novel statistical and computational approaches are required to unravel the mass data provided by contemporary sequencing and array technologies [4, 5].

The aim of the book Statistical Diagnostics for Cancer: Analyzing High-Dimensional Data is to present statistical methods focusing on a systems level that can be applied to a wide spectrum of genetics and genomics data from high-throughput experiments in cancer. Owing to the breathtaking progress in biology in recent years, many experimental approaches that originated in molecular and cell biology are now on the verge of entering medical research. For this reason, the major goal of the present book is to advocate and promote novel analysis methods that hold great promise for prognostic and diagnostic purposes in biomedical research.

Along the way toward this goal, we face several principal problems that need to be addressed systematically [1, 2, 3]. In this respect, three problems are of particular importance. First, in contrast to traditional clinical data, data from high-throughput experiments are very high dimensional, involving thousands or even tens of thousands of variables. Usually, this requires a dimension reduction or a variable selection to tame the associated computational complexity of such high-dimensional problems. Second, due to the molecular dependence of gene products on each other, there is a nonnegligible heterogeneity in these data, posing difficulties for parametric statistical models. For this reason, nonparametric methods, for example, bootstrap or resampling methods, are frequently used. Third, it is more and more common that high-throughput data from different technologies are simultaneously available, which requires their meaningful integration.

According to the World Health Organization (WHO), cancer is one of the leading causes of death in developed countries. For this reason, this book focuses entirely on this menace to health, and the chapters discuss a large variety of methods applied to different cancer types. For example, investigations of breast cancer, cervical cancer, colorectal cancer, lung cancer, leukemia, lymphoma, melanoma, ovarian cancer, and prostate cancer are presented in a way that highlights the genetic and molecular understanding obtained of these complex diseases but also provides a thorough explanation of the statistical methods.

This book is intended for researchers, graduate students, and advanced undergraduate students in the interdisciplinary fields of computational biology, biostatistics, bioinformatics, and systems biology who study problems in the biological and biomedical sciences. Each chapter is comprehensively presented and accessible not only to researchers from this field but also to interested students or scientists specialized in related areas. To enable this, each chapter presents not only technical results but also the background knowledge necessary to understand the statistical method or the biological problem under consideration. In addition, each chapter starts with a section called “Brief Summary” and finishes with a section called “Perspective.” These sections are nontechnical in nature, providing the reader with a brief overview of the presented topic. These features allow the book to be used as a textbook, for example, for an interdisciplinary seminar for advanced students.

In Figure 1, we show an overview of all chapters in this book. Due to the complexity of general approaches to cancer, it is not possible to categorize the chapters uniquely by just one keyword. For this reason, we provide in Figure 1 a three-dimensional categorization, which is based on (1) the data types used, (2) the statistical and computational methods, and (3) the cancer types studied. For each of these conceptual categories, we use a color code, as provided in Figure 1. In addition, the book is organized in four parts. In the first part, chapters present a general overview of generic methods and data used in the remainder of the book. The second part focuses on Bayesian methods and the third part on network-based approaches. Finally, part four contains chapters describing the influence of DNA copy number aberrations on the phenotype. This conceptual overview may help the reader to quickly find a specific chapter that deals with a particular subset of cancer types or statistical methods.

Figure 1 Brief overview of the book chapters with respect to the three major conceptual topics: high-throughput data types (a), statistical and computational methods (b), and cancer types (c).

Many colleagues, whether consciously or unconsciously, have provided us with input, help, and support before and during the preparation of the present book. In particular, we would like to thank Andreas Albrecht, Gökmen Altay, Subhash Basak, Jaine Blayney, Danail Bonchev, Frederick Campbell, Aedin Culhane, Maria Duca, Dean Fennell, Galina Glazko, Armin Graber, Beryl Graham, Benjamin Haibe-Kains, Peter Hamilton, Des Higgins, Maria Hughes, Patrick Johnston, Frank Kee, Declan Kieran, Chang Sik Kim, Terry Lappin, Kang Li, D. D. Lozovanu, Florian Markowetz, Darragh McArt, Dennis McCance, James McCann, Abbe Mowshowitz, Ken Mills, Paul Mullan, Arcady Mushegian, Katie Orr, Andrei Perjan, John Quackenbush, Andre Ribeiro, Bert Rima, Sudhakar Sahoo, Ricardo de Matos Simoes, Francesca Shearer, John Story, Simon Tavaré, Shailesh Tripathi, Peter Valent, Kurt Varmuza, Yinhai Wang, Kathleen Williamson, and Shu-Dong Zhang, and we apologize to all whom we have mistakenly not named. We would also like to thank our editors Andreas Sendtko and Gregor Cicchetti from Wiley-Blackwell, who have always been available and helpful.

Finally, we hope this book helps to spread the enthusiasm and joy we have for this field and inspires readers in their own practical or theoretical research problems.

Belfast and Hall/Tyrol, May 2012

Frank Emmert-Streib and Matthias Dehmer

References

1. Alon, U. (2006) An Introduction to Systems Biology: Design Principles of Biological Circuits, Chapman & Hall/CRC, Boca Raton, FL.

2. Barabasi, A. L. and Oltvai, Z. N. (2004) Network biology: Understanding the cell's functional organization. Nat. Rev. Genet., 5, 101–113.

3. von Bertalanffy, L. (1950) An outline of general systems theory. Brit. J. Philos. Sci., 1 (2), 134–165.

4. Emmert-Streib, F. and Glazko, G. (2011) Network biology: a direct approach to study biological function. WIREs Syst. Biol. Med., 3 (4), 379–391.

5. Palsson, B.O. (2006) Systems Biology: Properties of Reconstructed Networks, Cambridge University Press, Cambridge, UK.

List of Contributors

Yang Aijun

Nanjing Audit University

School of Finance

Jiangsu, 211815

China

Nidhal Bouaynaya

University of Arkansas at Little Rock

Department of Systems Engineering

ETAS 300 H., 2801 S. University Ave.

Little Rock, AR 72204

USA

Mark R. Chance

Case Western Reserve University

Center for Proteomics and Bioinformatics

10900 Euclid Ave., BRB 930

Cleveland, OH 44106-4988

USA

Sreenivas Chavali

MRC Laboratory of Molecular Biology

Hills Road

Cambridge CB2 0QH

UK

Li Chen

Johns Hopkins University

Department of Pathology

School of Medicine

1550 Orleans Street

Baltimore, MD 21231

USA

Matthias Dehmer

UMIT

Institut für Bioinformatik und Translationale Forschung

Eduard Wallnöfer Zentrum 1

6060 Hall/Tyrol

Austria

Ricardo de Matos Simoes

Max F. Perutz Laboratories

Center for Integrative Bioinformatics Vienna

Dr. Bohr Gasse 9

1030 Vienna

Austria

Frank Emmert-Streib

Queen's University Belfast

Computational Biology and Machine Learning Lab

Center for Cancer Research and Cell Biology

97 Lisburn Road

Belfast BT9 7BL

UK

Hassan M. Fathallah-Shaykh

University of Alabama at Birmingham

Department of Neurology School of Medicine

FOT 1020, 510 20th St South

Birmingham, AL 35294-3410

USA

Fei Gu

The Ohio State University

Department of Biomedical Informatics

Columbus, OH 43210

USA

D. Neil Hayes

University of North Carolina at Chapel Hill

UNC Lineberger Comprehensive Cancer Center

School of Medicine CB#7295

450 West Drive

Chapel Hill, NC 27599-7295

USA

Victor X. Jin

The Ohio State University

Department of Biomedical Informatics

Columbus, OH 43210

USA

Kartiek Kanduri

Turku Centre for Biotechnology

Turku

Finland

Devin C. Koestler

Dartmouth Medical School One Medical Center Drive

Section of Biostatistics & Epidemiology

7927 Rubin Building

Lebanon, NH 03756

USA

Song Liu

University at Buffalo

Department of Biostatistics

723 Kimball Tower

New York, NY 14214

USA

Shigeyuki Matsui

Kyoto University School of Public Health

Department of Biostatistics

Yoshida Konoe-cho, Sakyo-ku

Kyoto, 606-8501

Japan

Jeffrey Miecznikowski

University at Buffalo

Department of Biostatistics

723 Kimball Tower

New York, NY 14214

USA

David J. Miller

Johns Hopkins University

Department of Pathology

School of Medicine

1550 Orleans Street

Baltimore, MD 21231

USA

Andrew B. Nobel

University of North Carolina at Chapel Hill

Department of Statistics and Operations Research

Hanes Hall, CB#3260 Chapel Hill, NC 27599-3260

USA

Hisashi Noma

Kyoto University School of Public Health

Department of Biostatistics

Yoshida Konoe-cho, Sakyo-ku

Kyoto, 606-8501

Japan

Vishal N. Patel

Case Western Reserve University

Center for Proteomics and Bioinformatics

10900 Euclid Ave., BRB 930

Cleveland, OH 44106-4988

USA

Dan Schonfeld

UIC College of Engineering

Electrical and Computer Engineering

851 S. Morgan M/C 154

Chicago, IL 60607

USA

Ie-Ming Shih

Johns Hopkins University

Department of Pathology

School of Medicine

1550 Orleans Street

Baltimore, MD 21231

USA

Roman Shterenberg

University of Alabama at Birmingham

Department of Mathematics

452 Campbell Hall 1300 University Boulevard

Birmingham, AL 35294-1170

USA

Lincoln Stein

Stony Brook University

Department of Biomedical Engineering

Stony Brook, NY 11794

USA

Binhua Tang

The Ohio State University

Department of Biomedical Informatics

Columbus, OH 43210

USA

Ye Tian

Virginia Tech Research Center – Arlington

The Bradley Department of Electrical and Computer Engineering

900 N. Glebe Road

Arlington, VA 22201

USA

Shailesh Tripathi

Queen's University Belfast

Computational Biology and Machine Learning Lab

Center for Cancer Research and Cell Biology

97 Lisburn Road

Belfast BT9 7BL

UK

Aad W. van der Vaart

Vrije Universiteit

Department of Mathematics

Faculty of Sciences

De Boelelaan 1081a

1081 HV Amsterdam

The Netherlands

Wessel N. van Wieringen

Vrije Universiteit

Department of Mathematics

Faculty of Sciences

De Boelelaan 1081a

1081 HV Amsterdam

The Netherlands

Vonn Walter

University of North Carolina at Chapel Hill

UNC Lineberger Comprehensive Cancer Center

School of Medicine CB#7295

450 West Drive

Chapel Hill, NC 27599-7295

USA

Dan Wang

University at Buffalo

Department of Biostatistics

723 Kimball Tower

New York, NY 14214

USA

Yue Wang

Virginia Tech

Virginia Tech Research Center – Arlington

The Bradley Department of Electrical and Computer Engineering

900 N. Glebe Road

Arlington, VA 22201

USA

Fred A. Wright

University of North Carolina at Chapel Hill

Department of Biostatistics

4115B McGavran-Greenberg

135 Dauer Drive, Campus Box 7420

Chapel Hill, NC 27599-7420

USA

Guanming Wu

MaRS Centre

Ontario Institute for Cancer Research

South Tower

101 College Street, Suite 800

Toronto, ON M5G 0A3

Canada

Song Xinyuan

The Chinese University of Hong Kong

Department of Statistics

Hong Kong SAR

The People's Republic of China

Ka Yee Yeung

University of Washington

Department of Microbiology

Seattle, WA 98195-8070

USA

Guoqiang Yu

Johns Hopkins University

Department of Pathology

School of Medicine

1550 Orleans Street

Baltimore, MD 21231

USA

Li Yunxian

Yunnan University of Economics and Finance

School of Finance

Yunnan

China

Part One

General Overview

1

Control of Type I Error Rates for Oncology Biomarker Discovery with High-Throughput Platforms

Jeffrey Miecznikowski, Dan Wang, and Song Liu

1.1 Brief Summary

This chapter provides an overview of the genetic and proteomic high-throughput platforms and the statistical methods used to evaluate molecular biomarkers for cancer diagnosis. These experimental platforms are commonly used in cancer diagnosis, where the biomarkers can help determine cancer subtypes and thus potential treatments. Because of the large amount of data from these platforms, accurate testing methods are necessary. In this chapter, we highlight the statistical methods used to evaluate each potential biomarker and to limit the number of false positives at a specified error rate.
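To make the idea of limiting false positives across many biomarker tests concrete, the following minimal sketch applies two widely used multiple-testing corrections, Bonferroni (family-wise error rate) and Benjamini-Hochberg (false discovery rate), to a list of p-values. The choice of these two procedures, the function names, and the p-values are illustrative assumptions; this excerpt does not state which procedures the chapter itself covers.

```python
from typing import List


def bonferroni(pvalues: List[float], alpha: float = 0.05) -> List[bool]:
    """Reject hypothesis i when p_i <= alpha / m.

    Controls the family-wise error rate (probability of at least one
    false positive) at level alpha.
    """
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues: List[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up procedure.

    Finds the largest rank k with p_(k) <= k * alpha / m and rejects the
    k smallest p-values. Controls the false discovery rate at level alpha
    for independent tests.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k = rank  # keep the largest rank that passes its threshold
    rejected = [False] * m
    for idx in order[:k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for eight candidate biomarkers (made up for illustration).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]

fwer_hits = bonferroni(pvals)          # only the smallest p-value is rejected
fdr_hits = benjamini_hochberg(pvals)   # the two smallest p-values are rejected
print(sum(fwer_hits), sum(fdr_hits))
```

With these made-up inputs the Bonferroni threshold is 0.05 / 8 = 0.00625, so one hypothesis is rejected, while Benjamini-Hochberg rejects two; this reflects the usual trade-off between the stricter family-wise criterion and the more permissive false discovery rate.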

1.2 Introduction

Since the invention of microarray technology and related high-throughput technologies, researchers have been able to compile large amounts of information. This information enables researchers to uncover potentially new targets for therapies or to enhance our knowledge of biological systems. These high-throughput platforms have become commonly used experimental platforms in the biological realm [1]. A high-throughput platform is designed to measure large numbers (thousands or millions) of signatures in a biological organism at a given time point. These platforms are a product of the postgenomic era and are often used to determine how genomic expression is regulated or involved in biological processes. These platforms often use hybridization and sequence-based technologies such as gene expression microarrays and RNA-Seq platforms.

Specifically, these platforms and technologies have revolutionized the way researchers study cancer, especially with regard to diagnosis and prognosis. Current cancer classification consists of more than 200 subtypes of cancer [2]. For a patient to receive the most appropriate therapy, the clinician must identify as accurately as possible the cancer subtype, stage, and/or grade. Clinicians commonly use morphologic characteristics of biopsy specimens but “it gives very limited information and clearly misses much important tumor aspects such as rate of proliferation, capacity for invasion and metastases, and development of resistance mechanisms to certain treatment agents” [3]. Therefore, to improve these classification methods, new molecular diagnostic methods are needed. A major advantage of these high-throughput platforms is the huge amount of molecular information that can be extracted and integrated to find common patterns. These new technologies will allow researchers to enhance cancer diagnostics by (1) classifying tumor samples into known and new taxonomic categories, (2) discovering new diagnostic and therapeutic markers, and (3) identifying new subtypes that correlate with treatment outcome.
