Metaheuristics for String Problems in Bio-informatics - Christian Blum - E-Book

Metaheuristics for String Problems in Bio-informatics E-Book

Christian Blum

0,0
139,99 €

-100%
Sammeln Sie Punkte in unserem Gutscheinprogramm und kaufen Sie E-Books und Hörbücher mit bis zu 100% Rabatt.
Mehr erfahren.
Beschreibung

So-called string problems are abundant in bioinformatics and computational biology. New optimization problems dealing with DNA or protein sequences are constantly arising and researchers are highly in need of efficient optimization techniques for solving them. One obstacle for optimization practitioners is the atypical nature of these problems which require an interdisciplinary approach in order to solve them efficiently and accurately.

Sie lesen das E-Book in den Legimi-Apps auf:

Android
iOS
von Legimi
zertifizierten E-Readern

Seitenzahl: 308

Veröffentlichungsjahr: 2016

Bewertungen
0,0
0
0
0
0
0
Mehr Informationen
Mehr Informationen
Legimi prüft nicht, ob Rezensionen von Nutzern stammen, die den betreffenden Titel tatsächlich gekauft oder gelesen/gehört haben. Wir entfernen aber gefälschte Rezensionen.



Table of Contents

Cover

Dedication

Title

Copyright

Preface

Acknowledgments

List of Acronyms

1 Introduction

1.1. Complete methods for combinatorial optimization

1.2. Approximate methods: metaheuristics

1.3. Outline of the book

2 Minimum Common String Partition Problem

2.1. The MCSP problem

2.2. An ILP model for the UMCSP problem

2.3. Greedy approach

2.4. Construct, merge, solve and adapt

2.5. Experimental evaluation

2.6. Future work

3 Longest Common Subsequence Problems

3.1. Introduction

3.2. Algorithms for the LCS problem

3.3. Algorithms for the RFLCS problem

3.4. Future work

4 The Most Strings With Few Bad Columns Problem

4.1. The MSFBC problem

4.2. An ILP model for the MSFBC problem

4.3. Heuristic approaches

4.4. ILP-based large neighborhood search

4.5. Experimental evaluation

4.6. Future work

5 Consensus String Problems

5.1. Introduction

5.2. Organization of this chapter

5.3. The closest string problem and the close to most string problem

5.4. The farthest string problem and the far from most string problem

5.5. An ILP-based heuristic

5.6. Future work

6 Alignment Problems

6.1. Introduction

6.2. The pairwise alignment problem

6.3. The multiple alignment problem

6.4. Conclusion and future work

7 Conclusions

7.1. DNA sequencing

7.2. Founder sequence reconstruction

7.3. Final remarks

Bibliography

Index

End User License Agreement

Guide

Cover

Table of Contents

Begin Reading

Pages

C1

ii

iii

iv

v

ix

x

xi

xiii

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

175

176

177

178

179

180

181

182

183

184

185

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

G1

G2

G3

G4

G5

G6

G7

This book is dedicated to my parents Maria and Dieter, who currently pass through a difficult period of their lives.

(Christian Blum)

This book is dedicated to my daughters Iara, Mara, and Nara, the most beautiful gift that life could give me.

(Paola Festa)

Metaheuristics Set

coordinated byNicolas Monmarché and Patrick Siarry

Volume 6

Metaheuristics for String Problems in Bio-informatics

Christian Blum

Paola Festa

First published 2016 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd

27-37 St George’s Road

London SW19 4EU

UK

www.iste.co.uk

John Wiley & Sons, Inc.

111 River Street

Hoboken, NJ 07030

USA

www.wiley.com

© ISTE Ltd 2016

The rights of Christian Blum and Paola Festa to be identified as the author of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2016945029

British Library Cataloguing-in-Publication Data

A CIP record for this book is available from the British Library

ISBN 978-1-84821-812-3

Preface

DNA (deoxyribonucleic acid) acts as the information archive of most living beings. Due to the fact that a strand of DNA can be expressed as a set of four-letter character strings, so-called string problems have become abundant in bioinformatics and computational biology. Each year, new optimization problems dealing with DNA (or protein) sequences are being formulated that require efficient optimization techniques to arrive at solutions. From this perspective, bioinformatics is a burgeoning field for optimization experts and computer scientists in general. In this book, we will focus on a mixture of well-known and recent string optimization problems in the bioinformatics field. We will focus on problems that are combinatorial in nature.

One of the obstacles for optimization practitioners is the atypical nature of these problems, i.e. although combinatorial in nature, these problems are rather different to the classical traveling salesman problem or the quadratic assignment problem. This implies that a different type of expertise is required to efficiently solve many of these problems. Therefore, one of the main goals of this book is to pass on this kind of expertise and experience to newcomers to this field. The book provides several examples of very successful (hybrid) metaheuristics for solving specific string problems. One such example concerns the use of beam search (an incomplete branch and bound method) in solving longest common subsequence problems. The application of this algorithm in 2009 marked a breakthrough in the solution of this type of problem.

Finally, we would like to address a few words to the interested readers, especially biologists. We apologize for any imprecision in the description of biological processes, which we have tried to keep to a minimum. Keep in mind that, after all, we are only computer scientists and mathematicians.

Christian BLUMPaola FESTAJune 2016

Acknowledgments

This work was supported by grant TIN2012-37930-C02-02 from the Spanish Government. Support from CSIC (Spanish National Research Council) and IKERBASQUE (Basque Foundation for Science) is also acknowledged. We thank RDlab1, a BarcelonaTech facility, for allowing us to perform the experiments partly in the high-performance computing environment.

1

http://rdlab.lsi.upc.edu

.

List of Acronyms

ACO

Ant Colony Optimization

B&B

Branch & Bound

CMSA

Construct, Merge, Solve & Adapt

CO

Combinatorial Optimization

DNA

Deoxyribonucleic Acid

DP

Dynamic Programming

EA

Evolutionary Algorithm

GA

Genetic Algorithm

ILP

Integer Linear Programming

ILS

Iterated Local Search

IP

Integer Programming

LNS

Large Neighborhood Search

MCSP

Minimum Common String Partition

MSFBC

Most Strings With Few Bad Columns

RNA

Ribonucleic Acid

SA

Simulated Annealing

TS

Tabu Search

TSP

Traveling Salesman Problem

UMCSP

Unbalanced Minimum Common String Partition