139,99 €
So-called string problems are abundant in bioinformatics and computational biology. New optimization problems dealing with DNA or protein sequences are constantly arising and researchers are highly in need of efficient optimization techniques for solving them. One obstacle for optimization practitioners is the atypical nature of these problems which require an interdisciplinary approach in order to solve them efficiently and accurately.
Sie lesen das E-Book in den Legimi-Apps auf:
Seitenzahl: 308
Veröffentlichungsjahr: 2016
Cover
Dedication
Title
Copyright
Preface
Acknowledgments
List of Acronyms
1 Introduction
1.1. Complete methods for combinatorial optimization
1.2. Approximate methods: metaheuristics
1.3. Outline of the book
2 Minimum Common String Partition Problem
2.1. The MCSP problem
2.2. An ILP model for the UMCSP problem
2.3. Greedy approach
2.4. Construct, merge, solve and adapt
2.5. Experimental evaluation
2.6. Future work
3 Longest Common Subsequence Problems
3.1. Introduction
3.2. Algorithms for the LCS problem
3.3. Algorithms for the RFLCS problem
3.4. Future work
4 The Most Strings With Few Bad Columns Problem
4.1. The MSFBC problem
4.2. An ILP model for the MSFBC problem
4.3. Heuristic approaches
4.4. ILP-based large neighborhood search
4.5. Experimental evaluation
4.6. Future work
5 Consensus String Problems
5.1. Introduction
5.2. Organization of this chapter
5.3. The closest string problem and the close to most string problem
5.4. The farthest string problem and the far from most string problem
5.5. An ILP-based heuristic
5.6. Future work
6 Alignment Problems
6.1. Introduction
6.2. The pairwise alignment problem
6.3. The multiple alignment problem
6.4. Conclusion and future work
7 Conclusions
7.1. DNA sequencing
7.2. Founder sequence reconstruction
7.3. Final remarks
Bibliography
Index
End User License Agreement
Cover
Table of Contents
Begin Reading
C1
ii
iii
iv
v
ix
x
xi
xiii
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
175
176
177
178
179
180
181
182
183
184
185
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
G1
G2
G3
G4
G5
G6
G7
This book is dedicated to my parents Maria and Dieter, who currently pass through a difficult period of their lives.
(Christian Blum)
This book is dedicated to my daughters Iara, Mara, and Nara, the most beautiful gift that life could give me.
(Paola Festa)
Metaheuristics Set
coordinated byNicolas Monmarché and Patrick Siarry
Volume 6
Christian Blum
Paola Festa
First published 2016 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.
Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:
ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK
www.iste.co.uk
John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com
© ISTE Ltd 2016
The rights of Christian Blum and Paola Festa to be identified as the author of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.
Library of Congress Control Number: 2016945029
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-84821-812-3
DNA (deoxyribonucleic acid) acts as the information archive of most living beings. Due to the fact that a strand of DNA can be expressed as a set of four-letter character strings, so-called string problems have become abundant in bioinformatics and computational biology. Each year, new optimization problems dealing with DNA (or protein) sequences are being formulated that require efficient optimization techniques to arrive at solutions. From this perspective, bioinformatics is a burgeoning field for optimization experts and computer scientists in general. In this book, we will focus on a mixture of well-known and recent string optimization problems in the bioinformatics field. We will focus on problems that are combinatorial in nature.
One of the obstacles for optimization practitioners is the atypical nature of these problems, i.e. although combinatorial in nature, these problems are rather different to the classical traveling salesman problem or the quadratic assignment problem. This implies that a different type of expertise is required to efficiently solve many of these problems. Therefore, one of the main goals of this book is to pass on this kind of expertise and experience to newcomers to this field. The book provides several examples of very successful (hybrid) metaheuristics for solving specific string problems. One such example concerns the use of beam search (an incomplete branch and bound method) in solving longest common subsequence problems. The application of this algorithm in 2009 marked a breakthrough in the solution of this type of problem.
Finally, we would like to address a few words to the interested readers, especially biologists. We apologize for any imprecision in the description of biological processes, which we have tried to keep to a minimum. Keep in mind that, after all, we are only computer scientists and mathematicians.
Christian BLUMPaola FESTAJune 2016
This work was supported by grant TIN2012-37930-C02-02 from the Spanish Government. Support from CSIC (Spanish National Research Council) and IKERBASQUE (Basque Foundation for Science) is also acknowledged. We thank RDlab1, a BarcelonaTech facility, for allowing us to perform the experiments partly in the high-performance computing environment.
1
http://rdlab.lsi.upc.edu
.
ACO
Ant Colony Optimization
B&B
Branch & Bound
CMSA
Construct, Merge, Solve & Adapt
CO
Combinatorial Optimization
DNA
Deoxyribonucleic Acid
DP
Dynamic Programming
EA
Evolutionary Algorithm
GA
Genetic Algorithm
ILP
Integer Linear Programming
ILS
Iterated Local Search
IP
Integer Programming
LNS
Large Neighborhood Search
MCSP
Minimum Common String Partition
MSFBC
Most Strings With Few Bad Columns
RNA
Ribonucleic Acid
SA
Simulated Annealing
TS
Tabu Search
TSP
Traveling Salesman Problem
UMCSP
Unbalanced Minimum Common String Partition