Deep Learning Applications in Medical Image Segmentation (E-Book)

Description

Apply revolutionary deep learning technology to the fast-growing field of medical image segmentation

Precise medical image segmentation is rapidly becoming one of the most important tools in medical research, diagnosis, and treatment. The potential for deep learning, a technology which is already revolutionizing practice across hundreds of subfields, is immense. The prospect of using deep learning to address the traditional shortcomings of image segmentation demands close inspection and wide proliferation of relevant knowledge.

Deep Learning Applications in Medical Image Segmentation meets this demand with a comprehensive introduction to the field and its growing applications. Covering foundational concepts and advanced techniques, it offers a one-stop resource for researchers and other readers seeking a detailed understanding of the topic. It engages deeply with the main challenges and recent advances in deep-learning-based medical image segmentation.

Readers will also find:

  • Analysis of deep learning models, including FCN, U-Net, SegNet, DeepLab, and many more
  • Detailed discussion of medical image segmentation divided by area, incorporating all major organs and organ systems
  • Recent deep learning advancements in segmenting brain tumors, retinal vessels, and inner ear structures
  • Analysis of the effectiveness of deep learning models in segmenting lung fields for respiratory disease diagnosis
  • Exploration of the application and benefits of Generative Adversarial Networks (GANs) in enhancing medical image segmentation
  • Discussion of the key challenges faced in medical image segmentation using deep learning techniques
  • An overview of the latest advancements, applications, and future trends in deep learning for medical image analysis

Deep Learning Applications in Medical Image Segmentation is ideal for academics and researchers working with medical image segmentation, as well as professionals in medical imaging, data science, and biomedical engineering.

Page count: 502

Publication year: 2025




Table of Contents

Cover

Table of Contents

Title Page

Copyright

Acknowledgments

List of Contributors

Preface

Introduction

1 Introduction to Medical Image Segmentation: Overview of Modalities, Benchmark Datasets, Data Augmentation Techniques, and Evaluation Metrics

1.1 Introduction

1.2 Datasets for Segmentation of Medical Images

1.3 Augmentation Techniques Used in Medical Image Segmentation

1.4 Performance Metrics for Evaluating Segmentation Models

1.5 Conclusion

References

2 Fundamentals of Deep Learning Models for Medical Image Segmentation

2.1 Introduction

2.2 Deep Learning Models for Medical Image Segmentation

2.3 Applications of Medical Image Segmentation Models

2.4 Current Challenges in Segmentation of Medical Images

2.5 Conclusion

References

3 Revealing Historical Insights: A Comprehensive Exploration of Traditional Approaches in Medical Image Segmentation

3.1 Introduction

3.2 Literature Review

3.3 Methodology

3.4 Historical Context

3.5 Segmentation

3.6 Challenges and Opportunities

3.7 Case Studies

3.8 Modern Era and Contemporary Techniques

3.9 Conclusion

References

4 Segmentation and Quantitative Analysis of Myelinated White Matter Tissue in Pediatric Brain Magnetic Resonance Images

4.1 Introduction

4.2 Literature Review

4.3 Methodology

4.4 Results

4.5 Discussion

4.6 Conclusion

References

5 Deep Learning Transformations in Medical Imaging: Advancements in Brain Tumor, Retinal Vessel, and Inner Ear Segmentation

5.1 Introduction

5.2 Classical Image Segmentation Techniques

5.3 Deep Learning-Based Image Segmentation Methods for Medical Images

5.4 Deep Learning Algorithms Employed in the Segmentation of Brain Tumor Images

5.5 Deep Learning Models for Retinal Vessel Segmentation

5.6 Deep Learning Models for Inner Ear Segmentation

5.7 Conclusion

References

6 Deep Learning-Based Image Segmentation for Early Detection of Diabetic Retinopathy and Other Retinal Disorders

6.1 Introduction

6.2 Deep Learning and Image Segmentation

6.3 Applications and Benefits of Deep Learning-Based Image Segmentation

6.4 Challenges and Limitations

6.5 Conclusions and Future Directions

References

7 Analysis of Deep Learning Models for Lung Field Segmentation

7.1 Introduction

7.2 Medical Imaging Modalities

7.3 Overview of Classical Approaches for Lung Segmentation in Chest X-rays

7.4 Deep Learning Approaches

7.5 Data Sources and Datasets

7.6 Evaluation Metrics

7.7 Conclusion

References

8 Generative Adversarial Networks in the Field of Medical Image Segmentation

8.1 Introduction

8.2 Overview of Image Segmentation Techniques

8.3 Generative Adversarial Networks

8.4 Classification of GAN-Based Image Segmentation Techniques

8.5 Conclusion

References

9 A Collaborative Cell Image Segmentation Model Based on the Multilevel Improvement of Data

9.1 Introduction

9.2 Methodology

9.3 Result and Discussion

9.4 Conclusion and Future Scope

Acknowledgments

References

10 Challenges and Future Directions for Segmentation of Medical Images Using Deep Learning Models

10.1 Introduction

10.2 Types of Medical Datasets

10.3 Challenges Related to the Dataset

10.4 Challenges Concerning the DL Models

10.5 Conclusion

References

11 Advancements in Deep Learning for Medical Image Analysis: A Comprehensive Exploration of Techniques, Applications, and Future Prospects

11.1 Introduction

11.2 Significance of Medical Image Segmentation

11.3 Deep Learning Techniques for Medical Image Segmentation

11.4 Applications of Deep Learning in Medical Image Segmentation

11.5 Challenges and Future Prospects

11.6 Conclusion

References

Index

End User License Agreement

List of Tables

Chapter 1

Table 1.1 Datasets for segmentation of medical images.

Table 1.2 Some of the state-of-the-art deep generative models for data augme...

Chapter 2

Table 2.1 Summary of eye segmentation models based on deep learning.

Table 2.2 Summary of brain segmentation models based on deep learning.

Table 2.3 Summary of liver segmentation models based on deep learning.

Table 2.4 Summary of lung segmentation models based on deep learning.

Table 2.5 Summary of kidney segmentation models based on deep learning.

Table 2.6 Summary of heart segmentation models based on deep learning.

Table 2.7 Summary of multi-organ segmentation models based on deep learning....

Chapter 3

Table 3.1 Strengths and limitations of traditional approaches in medical ima...

Table 3.2 Comparison of deep learning architectures for medical image segmen...

Chapter 4

Table 4.1 Different growth models.

Table 4.2 AIC and BIC values.

Table 4.3 Akaike and Bayesian weights and ERs.

Table 4.4 Simple and complex models for hemispheric myelination.

Table 4.5 Comparison with other myelination models.

Chapter 6

Table 6.1 Comparison of DL models.

Chapter 7

Table 7.1 Comparative analysis of the lung segmentation models.

Chapter 8

Table 8.1 GAN-based segmentation method for brain.

Table 8.2 GAN-based segmentation method for eye.

Table 8.3 GAN-based segmentation method for cardiology.

Table 8.4 GAN-based segmentation method for chest.

Table 8.5 GAN-based segmentation method for breast.

Table 8.6 GAN-based segmentation method for spine.

Table 8.7 GAN-based segmentation methods for abdomen.

Table 8.8 GAN-based segmentation method for pelvis.

Table 8.9 GAN-based segmentation method using MRI as modality.

Table 8.10 Segmentation of CT images using GAN-based methods.

Table 8.11 Segmentation of other modalities using GAN-based methods.

Table 8.12 Segmentation using U-Net-based GAN models.

Table 8.13 Segmentation using conditional GAN-based models.

Table 8.14 Segmentation using CycleGAN-based models.

Table 8.15 Segmentation using other GAN-based models.

Chapter 9

Table 9.1 Quantitative comparison of data integration operation (image fusio...

Table 9.2 Quantitative comparison of post-fusion operation.

Chapter 10

Table 10.1 Challenges concerning datasets and their solutions.

Table 10.2 Challenges concerning deep learning models and their solutions.

Chapter 11

Table 11.1 Detailed description of CNN layers.

Table 11.2 Summary of advanced techniques.

Table 11.3 Detailed description, applications of lesion detection analysis, ...

Table 11.4 Applications of organ segmentation in preoperative planning.

Table 11.5 Explainable AI techniques for deep learning models.

List of Illustrations

Chapter 1

Figure 1.1 Different kinds of medical imaging modalities.

Chapter 2

Figure 2.1 Overview of convolutional neural network.

Figure 2.2 Fully convolutional neural network overview.

Figure 2.3 Overview of SegNet model.

Figure 2.4 SegNet decoder with feature map values a, b, c, and d. SegNet fir...

Figure 2.5 Basic U-Net architecture.

Figure 2.6 (a) Residual block. (b) Dense block.

Figure 2.7 UNet++ architecture.

Figure 2.8 Pyramid-based architecture.

Figure 2.9 An illustration of a basic recurrent neural network with the foll...

Figure 2.10 GAN architecture.

Chapter 3

Figure 3.1 Flowchart illustrating the progression from traditional approache...

Chapter 4

Figure 4.1 Flow chart of the proposed method.

Figure 4.2 Inter-hemispheric separation: (a) extracted brain image, (b) pseu...

Figure 4.3 Segmentation results: (a) input T1 image, (b) pre-processed and s...

Figure 4.4 3D visualization of MWM in the pediatric brain (babies are aged: ...

Figure 4.5 Growth models for myelination.

Chapter 5

Figure 5.1 A taxonomy of classical image segmentation techniques.

Figure 5.2 A CNN architecture having five layers.

Figure 5.3 A visual representation of a convolutional layer.

Figure 5.4 U-Net architecture.

Figure 5.5 GoogleNet with all its bells and whistles.

Chapter 6

Figure 6.1 (a) Original image of eye, (b) optic disc, (c) microaneurysm, (d)...

Figure 6.2 (a) Original image, (b) mask image, (c) ground truth, and (d) fin...

Chapter 7

Figure 7.1 Overview of diverse medical imaging techniques.

Figure 7.2 Diagrammatic representation of rule-based lung segmentation workf...

Figure 7.3 Deformable-based lung segmentation process.

Figure 7.4 Pixel-based lung segmentation process.

Figure 7.5 Basic U-Net architecture for medical image segmentation.

Figure 7.6 Architectural overview of Deeplabv3+ for semantic lung segmentati...

Figure 7.7 GAN architecture overview.

Chapter 8

Figure 8.1 GAN framework.

Figure 8.2 Classification of GAN-based medical image segmentation methods.

Chapter 9

Figure 9.1 Framework of a proposed medical image fusion model.

Figure 9.2 An example of a cell image that depicts four-level decomposition ...

Figure 9.3 Four-level data integration operation on cell imaging sample.

Figure 9.4 Max-pooling operation.

Figure 9.5 Detailed U-net architecture.

Figure 9.6 Performance of U-net on the different data enhancement levels.

Figure 9.7 Performance of data integration operation for different metrics, ...

Figure 9.8 An example of medical images selected for collaborative fusion st...

Figure 9.9 Qualitative comparison of different cell image segmentation metho...

Chapter 10

Figure 10.1 The distribution of modalities utilized in medical imaging expre...

Figure 10.2 Chest X-ray samples.

Figure 10.3 Detection of lesions using mammogram images.

Figure 10.4 Microscopic image of breast tumor tissue with four different mag...

Figure 10.5 The various challenges related to the datasets.

Chapter 11

Figure 11.1 Gradient descent with backpropagation.

Figure 11.2 Basic CNN architecture.

Figure 11.3 Data augmentation in medical images.

Figure 11.4 Flow diagram for handling data variability.

Figure 11.5 Anatomical information.

Figure 11.6 Flowchart of federated learning for medical image segmentation....



 

 

IEEE Press
445 Hoes Lane
Piscataway, NJ 08854

 

IEEE Press Editorial Board
Sarah Spurgeon, Editor-in-Chief

 

Moeness Amin

Jón Atli Benediktsson

Adam Drobot

James Duncan

Ekram Hossain

Brian Johnson

Hai Li

James Lyke

Joydeep Mitra

Desineni Subbaram Naidu

Tony Q. S. Quek

Behzad Razavi

Thomas Robertazzi

Diomidis Spinellis

Deep Learning Applications in Medical Image Segmentation

Overview, Approaches, and Challenges

 

Edited by

Sajid Yousuf Bhat, Department of Computer Science, University of Kashmir, Srinagar, India

Aasia Rehman, Department of Computer Science, University of Kashmir, Srinagar, India

Muhammad Abulaish, Department of Computer Science, South Asian University, New Delhi, India

 

 

 

Copyright © 2025 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc. is not associated with any product or vendor mentioned in this book.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data applied for

Hardback ISBN: 9781394245338

Cover Design: Wiley
Cover Images: © MR.Cole_Photographer/Getty Images, © Krongkaew/Getty Images

Acknowledgments

The journey to completing Deep Learning Applications in Medical Image Segmentation has been deeply collaborative, and we are immensely grateful to all those who have made significant contributions to this work.

First and foremost, we express our sincere appreciation to the contributing authors. Their dedication, expertise, and hard work have been the foundation of this book. Their insightful chapters have enriched this volume and made it a comprehensive resource for the field of medical image segmentation.

We are also thankful to the reviewers whose valuable feedback and constructive suggestions were instrumental in refining the content. Their critical insights ensured the highest standard of quality, and their contributions are greatly appreciated.

Lastly, we acknowledge our families and friends for their unwavering support and encouragement during the preparation of this book. Their understanding and patience have been a source of strength and motivation.

List of Contributors

 

Zachariah C. Alex

School of Electronics Engineering

VIT University

Vellore

India

 

Tairah Andrabi

Department of Computer Science

University of Kashmir

Srinagar

Jammu and Kashmir

India

 

Mudasir Ashraf

School of Computer Science & IT

University of the People

CA

USA

 

Rita Banik

Department of Electrical Engineering

ICFAI University Tripura

Kamalghat

West Tripura

India

 

Rejaul Karim Barbhuiya

Central Institute of Educational Technology (CIET)

National Council of Educational Research and Training (NCERT)

New Delhi

India

 

Sajid Yousuf Bhat

Department of Computer Science

University of Kashmir

Srinagar

Jammu and Kashmir

India

 

Roshan Birjais

Department of Electrical, Computer and Software Engineering

Faculty of Engineering

The University of Auckland

Auckland

New Zealand

 

Ankur Biswas

Department of Computer Science and Engineering

Tripura Institute of Technology

Narsingarh

West Tripura

India

 

Manzoor Chachoo

Department of Computer Science

University of Kashmir

Srinagar

Jammu and Kashmir

India

 

Anupama Chandrasekharan

Department of Radiology

Sri Ramachandra University

Chennai

India

 

Chelli N. Devi

Department of Biomedical Engineering

Kalasalingam Academy of Research and Education

Sriviliputhur

 

Vani Malagar

Model Institute of Engineering and Technology

Department of Computer Science and Engineering

Jammu

India

 

Aijaz Mir

Department of Computer Science

University of Kashmir

Srinagar

Jammu and Kashmir

India

 

Suhail Qadir Mir

Department of Informatics and Computer Systems

King Khalid University

Abha

Saudi Arabia

 

Chayan Paul

Department of Computer Science and Engineering

Swami Vivekananda University

Barakpore

West Bengal

India

 

Aasia Rehman

Department of Computer Science

University of Kashmir

Srinagar

Jammu and Kashmir

India

 

Lubna Riyaz

Department of Computer Science

University of Kashmir

Srinagar

Jammu and Kashmir

India

 

Mekhla Sharma

Model Institute of Engineering and Technology

Department of Computer Science and Engineering

Jammu

India

 

Ishfaq Sheikh

Department of Computer Science

University of Kashmir

Srinagar

Jammu and Kashmir

India

 

Bisma Sultan

Department of Computer Science

University of Kashmir

Srinagar

Jammu and Kashmir

India

 

V. K. Sundararaman

Independent Consultant

MR Imaging and Former Adjunct Professor

VIT University

Vellore

India

 

Navin Mani Upadhyay

Model Institute of Engineering and Technology

Department of Computer Science and Engineering

Jammu

India

 

Majid Zaman

Scientist D

University of Kashmir

Jammu and Kashmir

India

Preface

The field of medical image segmentation has seen tremendous advancements over the past decade, driven largely by the rapid development of deep learning technologies. These advancements have enabled the medical community to achieve levels of precision and efficiency in image analysis that were once thought impossible. Recognizing the profound impact of these technologies, we set out to create a comprehensive resource that would serve as both an introduction to the field and a guide to the latest research and techniques.

Deep Learning Applications in Medical Image Segmentation was conceived with the vision of providing readers with a thorough understanding of how deep learning is revolutionizing the way we approach medical image segmentation. This book brings together contributions from experts in the field, each offering their unique insights into various aspects of deep learning and its application to medical imaging.

The book is organized to guide readers from foundational concepts to advanced techniques. It begins with an introduction to the basics of medical image segmentation, including key datasets, data augmentation methods, and evaluation metrics. As readers progress, they will encounter discussions on traditional segmentation approaches, providing a historical context that highlights the evolution of methodologies leading to the current state-of-the-art deep learning models.

In subsequent chapters, the book delves into specific applications of deep learning in medical imaging, covering critical areas such as brain tumor segmentation, retinal vessel segmentation, and early detection of diabetic retinopathy. These chapters not only explore the techniques employed but also address the challenges and limitations that researchers face in these specialized domains.

The book also covers cutting-edge topics such as the use of generative adversarial networks (GANs) for image segmentation and collaborative models for cell image segmentation. By including these advanced topics, we aim to provide readers with a forward-looking perspective on the future of medical image segmentation.

We believe this book will serve as a valuable resource for a wide audience, including researchers, practitioners, and students. It is designed to equip readers with both the theoretical knowledge and practical insights needed to navigate the rapidly evolving landscape of medical image segmentation. Our hope is that this book will inspire further research and innovation in this vital field, ultimately contributing to improved healthcare outcomes.

As editors, we are deeply grateful to the contributing authors, reviewers, and all those who have supported this project. Their expertise, dedication, and collaborative spirit have been instrumental in bringing this book to fruition. We trust that readers will find this book to be an informative and enriching resource that will aid them in their professional development and research endeavors.

Introduction

Medical image segmentation is a critical component in the field of medical imaging, enabling precise identification and analysis of anatomical structures and abnormalities. With the exponential growth in imaging technologies and the availability of vast medical datasets, the need for accurate, efficient, and automated segmentation methods has never been more pressing. Deep learning, a subset of artificial intelligence, has emerged as a transformative approach, offering unprecedented capabilities in processing and interpreting complex medical images.

This book, titled Deep Learning Applications in Medical Image Segmentation, aims to provide a comprehensive and in-depth exploration of how deep learning models are revolutionizing this field. By consolidating the latest research, methodologies, and practical applications, it serves as a one-stop resource for readers seeking to understand the challenges and advancements in medical image segmentation using deep learning architectures.

The book begins with a foundational overview of medical image segmentation, discussing various imaging modalities, benchmark datasets, data augmentation techniques, and evaluation metrics essential for developing robust segmentation models. It then delves into the historical context and traditional approaches that laid the groundwork for modern techniques, offering valuable insights into the evolution of this field.

Subsequent chapters focus on the application of deep learning models to specific areas of medical imaging, including brain tumor segmentation, retinal vessel segmentation, inner ear segmentation, and early detection of diabetic retinopathy. Each chapter provides a thorough examination of the methodologies, challenges, and future directions in these specialized areas, highlighting the practical impact of deep learning in improving patient outcomes.

The book also explores advanced topics such as the use of generative adversarial networks (GANs) in medical image segmentation, collaborative models for cell image segmentation, and the ongoing challenges faced by researchers in this rapidly evolving domain. The final chapters provide a forward-looking perspective, discussing the future prospects of deep learning in medical imaging and the potential hurdles that must be overcome to fully realize its potential.

Through this comprehensive approach, Deep Learning Applications in Medical Image Segmentation equips readers with the knowledge and tools necessary to navigate and contribute to this dynamic field. Whether you are a researcher, practitioner, or student, this book offers valuable insights into the application of deep learning to one of the most challenging and impactful areas of medical science.

1 Introduction to Medical Image Segmentation: Overview of Modalities, Benchmark Datasets, Data Augmentation Techniques, and Evaluation Metrics

Aasia Rehman¹ and Suhail Qadir Mir²

¹ Department of Computer Science, University of Kashmir, Srinagar, Jammu and Kashmir, India

² Department of Informatics and Computer Systems, King Khalid University, Abha, Saudi Arabia

1.1 Introduction

Medical image analysis is an essential part of monitoring the clinical progression of a disease and how the patient is responding to a given treatment, and it thus helps in further treatment planning. Medical imaging has sped up the diagnosis and treatment of a number of illnesses. There are many kinds of medical imaging modalities, each using a different technique to produce images for various purposes. Figure 1.1 shows examples of different medical images. In today's clinical practice, imaging modalities such as X-ray, computed tomography (CT), magnetic resonance imaging (MRI), and ultrasound (US) are commonly used. These medical images either examine multiple organs (e.g. MRI and CT) or are specific to particular organs (e.g. retinal images, mammograms, dermoscopy, and colonoscopy). The volume of data produced by each modality also varies (Shen et al. 2017). Each imaging modality offers distinct advantages for various applications and diagnostic objectives, as well as limitations in revealing the internal structure and functions of the many organs of the body. X-ray, CT, MRI, US, dermoscopy, colonoscopy, microscopic imaging, and optical coherence tomography (OCT) are a few of the most widely used medical imaging modalities discussed in this chapter. In addition, we discuss their basic working principles along with their usage.

1.1.1 X-Rays

In the medical field, X-ray imaging is employed to generate detailed images of internal anatomical structures, such as bones, aiding in the diagnostic process. X-rays, classified as electromagnetic waves, can penetrate various solid objects, including the human body. Among the common imaging modalities, X-rays are frequently utilized to visualize internal bodily structures. The procedure involves directing a focused beam of X-rays toward the specific region of interest within the body. After traversing the body, the X-rays are captured by a sensitive detector, such as film or a digital sensor. Different tissues within the body interact with X-rays differently: softer tissues, like muscles and organs, absorb fewer X-rays, resulting in darker shades on the resultant image, while denser tissues, like bones, absorb more X-rays and appear as white areas on the image. For a clear image to be produced, a proper amount of radiation must be used and the body part must be positioned correctly relative to the X-ray beam. To obtain a more accurate depiction of the body's three-dimensional (3D) structure, multiple two-dimensional (2D) images of the same region are acquired from different angles. A radiologist, a specialized medical practitioner, analyzes medical images to diagnose conditions such as fractures, cancers, and other ailments by interpreting the visual information depicted in the images. An X-ray is electromagnetic radiation produced in an evacuated glass tube. A voltage applied between a cathode and an anode propels electrons across the vacuum of the tube toward the rotating tungsten anode. When electrons strike the anode, they generate both X-rays and heat. X-rays are produced only while a voltage is applied across the cathode and anode, rendering the X-ray tube inert until activated by the medical radiation technologist.
X-ray, CT, fluoroscopy, and angiography are all types of imaging that employ this physical on/off setup. X-rays are commonly employed to capture images of bones, primarily to assess for fractures or other abnormalities. Additionally, dentists and orthodontists utilize X-rays to obtain detailed views of teeth. Moreover, X-rays are instrumental in detecting tumors or abnormalities in bones, and they are also used to guide surgeons during surgery. X-ray images of the breast are obtained during mammography to find and assess anomalies such as tumors or microcalcifications. Because it can spot probable signs of cancer before they can be felt during a physical examination, mammography is a crucial tool in the early identification and prevention of breast cancer. X-rays are used not only for imaging the bones but also the chest and lungs, to diagnose diseases including pneumonia, lung cancer, and emphysema. In addition to detecting gallstones, kidney stones, and intestinal obstruction, X-ray imaging can be used to evaluate abdominal organs such as the liver, spleen, and kidneys. Figure 1.1l shows an example of a chest X-ray. Due to its broad accessibility, speed, and affordability, X-ray imaging stands as a widely favored diagnostic method. It is a noninvasive procedure, eliminating the need for incisions or injections, thus ensuring a reasonably safe and comfortable experience for the patient. Low tissue contrast is one of the drawbacks of X-ray imaging: distinguishing between malignancies and healthy tissue is more difficult than with other modalities. X-ray imaging also has the drawback of providing only a 2D picture of the body's 3D structure, which can make it impossible to see certain internal structures, especially in inaccessible locations. The lack of depth in the 2D image can obscure aspects of internal structures, making accurate diagnosis harder.
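The tissue-dependent absorption described above follows the Beer-Lambert attenuation law, which a short sketch can make concrete. This example is illustrative rather than taken from the chapter: the attenuation coefficients below are rough placeholder values, not clinical constants.

```python
import numpy as np

def transmitted_intensity(i0, mu, thickness_cm):
    """Beer-Lambert law: fraction of an incident X-ray beam of intensity i0
    that reaches the detector after passing through `thickness_cm` of tissue
    with linear attenuation coefficient `mu` (per cm)."""
    return i0 * np.exp(-mu * thickness_cm)

i0 = 1.0  # normalized incident intensity
# Assumed, illustrative attenuation coefficients (1/cm):
tissues = {"soft tissue": 0.2, "bone": 0.5}

for name, mu in tissues.items():
    i = transmitted_intensity(i0, mu, thickness_cm=5.0)
    print(f"{name}: {i:.3f} of the beam transmitted")
```

With these assumed values, bone transmits far less of the beam than soft tissue, which is why it shows up as the bright (white) region on the detector while soft tissue appears darker.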

Figure 1.1 Different kinds of medical imaging modalities.

1.1.2 Computed Tomography (CT)

A CT scan, or computerized tomography scan, generates intricate images of the body's internal structures by combining X-rays and computer technology. Unlike traditional X-rays, CT scans produce cross-sectional images, offering a different perspective of the body. These scans are noninvasive, ensuring a painless experience for the patient. By aiming X-rays at the body from various angles and measuring the intensity of the X-rays as they move through the body with detectors, CT scans can provide high-resolution images of internal anatomy. CT scans use a multi-slice detector, a specialized sort of X-ray detector that can gather many images simultaneously from different angles to provide detailed cross-sectional views of the body. Spatial filtering, which reduces noise and boosts contrast, and multi-energy imaging, which employs X-rays of varying energies to record distinct data and enhance contrast, are two further methods used in CT scans to improve image quality. Thanks to computer processing, CT scans produce images that can be viewed in a variety of formats, including cross-sectional slices, 3D renderings, and even virtual reality views. Radiologic technologists perform CT scans, while radiologists, medical specialists with training in image interpretation, interpret the results. During a CT scan, the patient lies on a table that is moved into the center of a doughnut-shaped machine. The scanner's internal X-ray tube spins to produce a sequence of X-ray beams at varying angles as it circles the patient. The X-ray beam creates a pattern of attenuation as it passes through the body because it is absorbed at different rates by different tissues. An array of detectors positioned opposite the X-ray tube picks up the pattern of attenuated X-rays and turns it into electrical impulses. The data from these electrical signals is then transferred to a computer, where complex algorithms reconstruct the information as a 3D image.
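The reconstructed attenuation values are conventionally reported on the Hounsfield scale, which by definition fixes water at 0 HU and air at −1000 HU. A minimal sketch of that mapping follows; the water coefficient used here is a typical approximate value at diagnostic energies, not a calibration constant:

```python
def hounsfield(mu: float, mu_water: float = 0.19, mu_air: float = 0.0) -> float:
    """Map a linear attenuation coefficient (cm^-1) to Hounsfield units (HU).
    By definition, water maps to 0 HU and air to -1000 HU."""
    return 1000.0 * (mu - mu_water) / (mu_water - mu_air)

print(hounsfield(0.19))  # water -> 0.0 HU
print(hounsfield(0.0))   # air   -> -1000.0 HU
```

On this scale, bone typically lands in the several-hundred to roughly 1000+ HU range, which is why thresholding HU values is a common first step in CT-based segmentation pipelines.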
CT scans are employed to enhance the examination of soft tissues and intricate areas of the image that may not be clearly defined by traditional X-rays. They are commonly employed to visualize and assess blood vessels, internal organs, bones, the brain, neck, spine, and chest. CT scans aid medical professionals in various diagnostic tasks, including tumor detection, fracture evaluation, and monitoring the impact of cancer treatment on patients.

Figure 1.1a shows an example of a CT scan of the lungs. CT imaging's high resolution is a major benefit, since it allows doctors to see anatomical details such as bone structure, organs, and blood vessels, making it useful for diagnosing a large variety of illnesses. In addition to its usefulness as a diagnostic tool, CT imaging technology is commonly available in settings such as hospitals, clinics, and doctors' offices. CT imaging is also quick and efficient, with results typically available in a short amount of time. Ionizing radiation exposure is a key concern with CT imaging, since it can increase the risk of cancer and other health issues, especially with repeated scans or in pregnant women. CT scanners also represent a large financial commitment, which can be out of reach for some hospitals; because of this, some patients may not have access to this form of imaging. The advantages and disadvantages of CT scans must be weighed carefully, and when possible, other imaging methods should be considered.

1.1.3 Magnetic Resonance Imaging (MRI)

MRI, or magnetic resonance imaging, shares similarities with CT scans but offers superior soft-tissue contrast, producing detailed cross-sectional images of body structures. Like CT scans, MRIs are painless and safe, as magnetic fields and radio waves do not pose any known adverse effects on patients. MRI utilizes a strong magnetic field to align the nuclei of hydrogen atoms within the body, generating a minute net magnetic moment that can be exploited to produce accurate representations of the body's internal anatomy. Radiofrequency (RF) pulses perturb the aligned hydrogen nuclei, which then emit a weak radio signal as they relax. A detector picks up these signals, and after computer analysis, clear pictures of the body's internal structures are obtained. In practice, when a patient enters an MRI machine, the powerful magnetic field aligns the hydrogen atoms in the body; radio waves delivered through the body then cause the aligned hydrogen atoms to produce a weak radio signal. An antenna picks up this signal, and the information is transmitted to a computer for processing, where sophisticated algorithms create a 3D model of the body's internal anatomy. Hydrogen atoms are present in every cell in the body, in concentrations that vary with tissue type. Axial, sagittal, and coronal images can be generated in an MRI, giving a full 3D picture of the patient's anatomy.
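The RF pulses must match the resonance (Larmor) frequency of the hydrogen nuclei, which scales linearly with the field strength B0; the gyromagnetic ratio of hydrogen (¹H) is approximately 42.58 MHz per tesla. A short sketch of that relation:

```python
GAMMA_H_MHZ_PER_T = 42.577  # gyromagnetic ratio of hydrogen (1H), MHz/T

def larmor_frequency_mhz(b0_tesla: float) -> float:
    """RF frequency at which hydrogen nuclei resonate in a B0-tesla field,
    per the Larmor relation f = gamma * B0."""
    return GAMMA_H_MHZ_PER_T * b0_tesla

print(larmor_frequency_mhz(1.5))  # ~63.9 MHz, a common clinical field strength
print(larmor_frequency_mhz(3.0))  # ~127.7 MHz
```

This is why 1.5 T and 3 T scanners transmit and receive at roughly 64 MHz and 128 MHz, respectively.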

MRIs are employed for observing the inner composition of the brain, spinal cord, bones, heart, blood vessels, and various other internal bodily structures. Figure 1.1b–d represents examples of T1-weighted, T2-weighted, and FLAIR MR images, respectively. MRI is favored over alternative imaging methods such as X-rays and CT scans because it avoids exposing patients to harmful ionizing radiation. Soft tissues, including muscles, tendons, and ligaments, can also be seen in exquisite detail. MRI can provide images in different planes, offering a more comprehensive view of internal structures, and its ability to deliver fine resolution without causing any damage or discomfort is a major advantage. Nonetheless, MRI scanners are sensitive to metal; thus, individuals with certain implants, such as pacemakers, are ineligible for MRI scans.

Scanners can be a considerable investment due to their high cost, putting them out of reach for some establishments. Examination times may also be longer than those of X-rays or CT scans.

1.1.4 Positron Emission Tomography (PET)

A PET scan produces a 3D image of the inner body parts. To see and quantify metabolic and functional activities taking place inside the body, doctors use PET scans. In the medical field, PET is frequently employed for the detection, classification, and follow-up of a wide range of conditions, including cancer and neurological diseases. A PET scan works by detecting radiotracers that release positrons. These radiotracers are substances that are injected into the body and emit positrons, which are positively charged particles. When a positron meets an electron within the body, they mutually annihilate, generating two gamma rays that propagate in opposing directions. These gamma rays are identified by the PET scanner and employed to construct 3D images depicting the distribution of the radiotracer within the body. PET scans are widely used in oncology to assess tumor growth, evaluate treatment response, and detect metastases. PET scans serve various purposes, notably in generating high-resolution brain images and forecasting cancer progression. They are commonly employed for patients already diagnosed with cancer, as PET scans can accurately illustrate the extent of cancer spread and evaluate the efficacy of chemotherapy. Additionally, PET scans aid in pre-surgical planning for procedures involving the brain or heart. Moreover, conditions such as Alzheimer's disease, epilepsy, and Parkinson's disease can be diagnosed using PET scans, as they provide clear insights into changes in brain functionality. Figure 1.1i–k represents examples of PET scans for a normal subject, mild cognitive impairment, and Alzheimer's disease, respectively. PET scans can detect functional and metabolic changes in tissues before structural abnormalities become apparent through other imaging techniques. PET scans can offer a thorough assessment of the entire body, allowing for the detection of metastatic disease or the evaluation of multiple organ systems simultaneously.
Unlike other imaging techniques, PET scans provide information about the physiological and biochemical processes occurring within the body, enabling a better understanding of disease mechanisms. PET scans involve the use of radioactive substances, which expose the patient to a small amount of ionizing radiation. While the radiation dose is generally considered safe, it is still a factor to consider, especially for pregnant women and children. It may be difficult to correctly localize tiny lesions on PET scans because PET's spatial resolution can be inferior to that of other imaging modalities such as CT or MRI. PET scans can be more expensive than other imaging techniques, mainly due to the production and administration of radiotracers, as well as the specialized equipment required for scanning and image analysis. PET scanners are not as widely available as other imaging modalities, which can limit access to this technology in certain areas. It is important to note that PET scans are typically used in conjunction with other imaging modalities to provide a more comprehensive evaluation of a patient's condition. The decision to use a PET scan is made by a physician based on the specific clinical scenario and whether the potential benefits outweigh the risks and costs associated with the procedure.
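The positron-electron annihilation described above is why PET detectors are tuned to 511 keV photons: each of the two gamma rays carries the rest energy of one electron, E = mₑc². A quick numerical check of that figure:

```python
# Physical constants (CODATA values).
M_E = 9.1093837e-31     # electron rest mass, kg
C = 2.99792458e8        # speed of light, m/s
EV = 1.602176634e-19    # joules per electron-volt

# Rest energy of the electron, converted from joules to keV.
photon_kev = M_E * C**2 / EV / 1000.0
print(f"annihilation photon energy: {photon_kev:.0f} keV")  # ~511 keV
```

Detecting the two back-to-back 511 keV photons in coincidence is what lets the scanner place the annihilation event on a line of response through the body.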

1.1.5 Ultrasound (US) Images

US, also known as a sonogram, utilizes high-frequency sound waves to visualize internal body structures. In this medical imaging technique, a transducer emits high-frequency sound waves and captures the returning echoes from internal structures to create images. Some of the sound waves transmitted into the body are reflected back to the transducer as they cross a boundary separating different types of tissue. The transducer receives these echoes, converts them into electrical signals, and transmits them to a computer for analysis, resulting in high-resolution images of the body's internal structures. Depth information is obtained by measuring the time delay between the emitted US wave and the reflected wave from each layer of the sample, enabling the creation of 2D or 3D images. US is frequently used to monitor the growth of unborn babies, providing real-time imaging. It is typically unsuitable for imaging bones or air-containing tissues such as the lungs but finds extensive application in abdominal, vascular, and thyroid examinations. In abdominal imaging, US is commonly used to examine the organs of the digestive system, including the gallbladder, pancreas, spleen, and kidneys. As a medical imaging technique, US has a number of benefits. The portability of the equipment makes it possible to perform imaging procedures anywhere, not just in a hospital. Furthermore, US offers the benefits of speed and radiation-free imaging; as it does not utilize potentially harmful ionizing radiation, it is favored over other imaging methods such as X-ray and CT scans. The capability to produce real-time images also enables the dynamic evaluation of internal structures and functions. US imaging, however, has a number of drawbacks. A major limitation is the inability to image structures behind acoustically opaque media such as bone or gas. Artifacts such as echoes from bone or air can further distort images, making it difficult to make out internal structures.
The quality of US images, and thus the precision of the diagnosis, may also be affected by the US technician's level of expertise.
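The depth calculation from the echo time delay follows the pulse-echo principle: depth = (speed of sound × round-trip time) / 2, conventionally using an average soft-tissue sound speed of 1540 m/s. A minimal sketch:

```python
SPEED_OF_SOUND_TISSUE = 1540.0  # m/s, conventional average for soft tissue

def echo_depth_cm(round_trip_s: float) -> float:
    """Depth of a reflector from the round-trip echo time.
    The factor of 2 accounts for the pulse travelling to the
    tissue interface and back to the transducer."""
    return SPEED_OF_SOUND_TISSUE * round_trip_s / 2.0 * 100.0

# A 65-microsecond round trip corresponds to a reflector about 5 cm deep.
print(f"{echo_depth_cm(65e-6):.2f} cm")
```

By timestamping every returning echo this way, the scanner converts a train of echoes into a depth profile, and sweeping the beam builds the familiar 2D image.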

1.1.6 Colonoscopy

Colonoscopy allows medical professionals to examine the large intestine for colon or rectal cancer, as well as growths called polyps, as demonstrated in a study (Nguyen and Lee 2019). The principle behind colonoscopy is the insertion of a colonoscope into the rectum and through the colon. The colonoscope is a long, flexible tube equipped with a light source and a camera at the tip. It allows the physician to visualize the colon lining in real time and detect any abnormalities such as polyps, ulcers, inflammation, or tumors. During the procedure, the physician can also perform therapeutic interventions, such as polyp removal or tissue biopsies, if necessary. Figure 1.1g represents an example of a colonoscopy image. Colonoscopy is an effective diagnostic tool for identifying colorectal cancer, inflammatory bowel disease (e.g., ulcerative colitis and Crohn's disease), and other conditions affecting the colon and rectum. Colonoscopy allows for the detection and removal of polyps, which are abnormal growths that may develop into cancer over time. Colonoscopy has a high diagnostic accuracy for colorectal cancer and other gastrointestinal (GI) conditions, helping to guide appropriate treatment plans. A colonoscopy is an invasive procedure that requires the insertion of a flexible tube into the rectum and colon, which can cause discomfort or pain for the patient. A noninvasive alternative, virtual colonoscopy (CT colonography), reconstructs views of the colon from a CT scan. It is important to discuss the risks, benefits, and alternatives of colonoscopy with a healthcare professional to make an informed decision about the procedure based on individual circumstances and medical history.

1.1.7 Dermoscopy

Dermoscopy is a noninvasive imaging technique used to evaluate skin lesions. It is also known as dermatoscopy or epiluminescence microscopy. The process entails inspecting the skin with a dermatoscope, a portable instrument fitted with magnifying lenses and illumination. Dermoscopy allows dermatologists to observe skin structures and patterns that are not visible to the naked eye, assisting in the identification and treatment of various skin conditions, notably melanoma and other forms of skin cancer. The principle behind dermoscopy is the use of polarized or non-polarized light to illuminate the skin and magnify the surface and subsurface features. The dermatoscope eliminates the surface reflection of the skin, allowing for a clearer view of the deeper layers. Dermoscopy also utilizes liquid immersion or contact dermoscopy, in which a gel or oil is applied to the skin to eliminate the air gap between the dermatoscope and the skin, further improving image quality. Dermoscopy plays a crucial role in the early detection of melanoma, the deadliest form of skin cancer. It helps dermatologists assess the morphological characteristics of pigmented skin lesions and differentiate between benign and malignant lesions, and it aids in distinguishing between different types of skin lesions and guiding appropriate treatment plans. Figure 1.1e represents an example of a skin lesion. Dermoscopy can be used to monitor changes in skin lesions over time, particularly in individuals with a history of skin cancer or atypical moles. It helps identify any signs of progression or regression, aiding in the decision-making process for further management. Dermoscopy improves the diagnostic accuracy of skin lesions, particularly in differentiating between benign and malignant lesions. Dermoscopy is a noninvasive procedure that does not require any incisions or tissue sampling. It can be performed quickly and easily in an outpatient setting.
Dermoscopy can reduce the need for unnecessary biopsy of benign lesions, leading to cost savings in healthcare resources. The quality and capabilities of the dermatoscope used can influence the accuracy and reliability of dermoscopic images. High-quality equipment with appropriate magnification and lighting is essential for optimal results.

1.1.8 Microscopic Images

Microscopic imaging refers to the use of microscopes to visualize and examine objects at a microscopic level. It involves the magnification of objects that are too small to be seen with the naked eye, allowing for a detailed examination of their structure and features. Science and medicine rely heavily on microscopic imaging for a wide range of purposes, including the study, diagnosis, and analysis of tiny materials, organisms, and other specimens. The principle behind microscopic imaging is based on the interaction of light or electrons with the sample being observed. There are different types of microscopes used for imaging, including optical microscopes and electron microscopes. Tissue samples are commonly obtained through a biopsy for examination, with small sections of the tissue subsequently stained and colored to reveal cellular features. Counterstains are employed to enhance the color, visibility, and contrast of the images. Such images are commonly utilized in cancer identification, with characteristics such as cell size, shape, and distribution typically scrutinized. Figure 1.1f represents an example of a microscopic image. Microscopic imaging is crucial in medical diagnostics and pathology. It allows for the examination of biopsy samples, the identification of cellular abnormalities, and the diagnosis of various diseases and conditions. Microscopic imaging techniques, such as optical microscopy, allow for nondestructive examination of samples. This means that samples can be observed without altering their structure or composition, enabling further analysis or experimentation. Advanced microscopic imaging equipment, such as electron microscopes, can be expensive to acquire and maintain. Each microscopic imaging technique has its specific limitations in terms of resolution, depth of field, and contrast. Choosing the appropriate imaging technique for a specific application is important to obtain accurate and reliable results.

1.1.9 Optical Coherence Tomography (OCT)

OCT is a noninvasive imaging method that utilizes light waves to produce detailed cross-sectional images of biological tissues. The principle behind OCT is based on the interference of light waves. The technique employs low-coherence interferometry, where a near-infrared light beam emitted by a light source is divided into two paths: a reference arm and a sample arm. The reference arm includes a mirror, while the sample arm directs the light toward the tissue under examination. The light reflected back from both arms is recombined, and interference occurs between the reference and sample beams. The interference pattern is detected and analyzed to reconstruct a depth-resolved image of the tissue. By measuring the time delay of reflected light, OCT can ascertain the depth of various tissue layers or structures within the sample, which is then utilized to produce a detailed cross-sectional image of the tissue. OCT finds widespread application in diagnosing and treating various eye conditions such as macular degeneration, diabetic retinopathy, glaucoma, and retinal detachment. It provides detailed images of the layers within the retina, allowing for early detection and monitoring of eye diseases. Figure 1.1h represents an example of OCT of an eye. In cardiology, OCT is used to visualize and assess coronary arteries. It helps in diagnosing and characterizing coronary artery disease, detecting plaque buildup, and guiding interventions such as stent placement. OCT provides detailed, high-resolution images of tissue structures, enabling the visualization of fine details and abnormalities. It provides real-time imaging, allowing for immediate evaluation and assessment. The penetration depth of OCT is limited to a few millimeters, depending on the tissue being imaged. This restricts its application in imaging deeper tissues or structures. Advanced OCT systems can be expensive, limiting their accessibility in certain healthcare settings.
However, the technology has become more widely available over time.
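The depth-from-time-delay relation works much as in ultrasound, except the probe is light, so the relevant propagation speed is c divided by the tissue's refractive index. A minimal sketch, assuming a typical tissue refractive index of about 1.38 (an illustrative value, not a universal constant):

```python
C = 2.99792458e8  # speed of light in vacuum, m/s

def oct_depth_um(delay_s: float, n_tissue: float = 1.38) -> float:
    """Depth of a reflecting layer from the optical time delay.
    Light travels at c/n inside tissue and covers the depth twice
    (down to the layer and back), hence the division by 2."""
    return (C / n_tissue) * delay_s / 2.0 * 1e6

# A round-trip delay of ~9.2 picoseconds corresponds to ~1 mm of depth.
print(f"{oct_depth_um(9.2e-12):.0f} um")
```

The femtosecond-to-picosecond scale of these delays is why OCT measures them indirectly through low-coherence interferometry rather than by direct timing, as is done with ultrasound echoes.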

1.2 Datasets for Segmentation of Medical Images

Segmentation using deep learning methods requires high-quality and diverse datasets for training and evaluation. In this section, we discuss the datasets commonly used in the field of segmentation and their characteristics. Researchers can build and test their deep learning models with the help of these datasets. The performance and generalizability of the created models can be better understood by learning more about the qualities and challenges of these datasets. Domain specialists oversee the creation and annotation of these datasets. The details of each of these datasets are described in the subsections below, and metadata is provided in Table 1.1.

Table 1.1 Datasets for segmentation of medical images.

Dataset                                            Modality      Organ    URL
LiTS (Bilic, P., 2023)                             CT            Liver    https://competitions.codalab.org/
KiTS (Heller et al. 2019)                          CT            Kidney   https://wiki.cancerimagingarchive.net/
BraTS (Menze et al. 2015)                          MR            Brain    https://www.kaggle.com/brats2020-training-data
LIDC-IDRI (Armato et al. 2011)                     CT            Lung     https://wiki.cancerimagingarchive.net/LIDCIDRI
ISIC (Codella et al. 2018; Tschandl et al. 2018)   Dermoscopy    Skin     https://challenge.isic-archive.com/
BUSI (Al-Dhabyani et al. 2020)                     Ultrasound    Breast   https://scholar.cu.edu.eg/?q=afahm
Kvasir-SEG (Jha et al. 2020)                       Colonoscopy   Polyps   https://datasets.simula.no/kvasir-seg/
CVC-ClinicDB (Bernal et al. 2015)                  Colonoscopy   Polyps   https://www.kaggle.com/cvcclinicdb

1.2.1 Multimodal Brain Tumor Segmentation Challenge (BraTS) Dataset

The BraTS dataset is a widely used dataset for brain tumor segmentation that is organized in conjunction with the Medical Image Computing and Computer-Assisted Intervention (MICCAI) conference. The BraTS dataset is a vast compilation of brain MRI scans from many different patients. Different versions of the dataset include different numbers of images; however, hundreds of cases are typically included for both training and testing. Multiple imaging modalities, including T1-weighted, T2-weighted, and contrast-enhanced T1-weighted images, are routinely included for each case in the BraTS collection. These images are typically 3D volumes with spatial dimensions of 240 × 240 × 155 voxels or larger. The enhancing tumor core, peritumoral edema, and necrotic regions are all annotated in this dataset with ground truth labels, enabling the evaluation and comparison of segmentation algorithms on specific tumor components. The dataset contains different tumor types and a wide range of tumor sizes and shapes, and it poses the challenge of precisely segmenting tumors in the presence of other brain structures. Several challenges on the Kaggle platform have centered on the BraTS dataset. The BraTS 2020 Kaggle Challenge is one noteworthy competition that aimed to advance the domain of brain lesion segmentation through deep learning techniques; a subset of the BraTS dataset was made available to challenge participants for practice and assessment.

1.2.2 LIDC-IDRI (Lung Image Database Consortium Image Collection) Dataset

The LIDC/IDRI database is a well-known dataset for lung lesion segmentation and diagnosis. It is a globally accessible online platform designed for developing, training, and assessing computer-assisted diagnostic (CAD) techniques aimed at the early detection and diagnosis of lung cancer. The 1018 cases in this dataset represent the combined efforts of eight medical imaging companies and seven educational institutions. For each individual, the dataset includes images from a clinical thoracic CT scan and an Extensible Markup Language (XML) file recording the results of a two-stage image annotation process performed by four experienced thoracic radiologists. During the initial round of blinded reading, radiologists evaluated all CT scans separately and categorized lesions as "nodule ≥ 3 mm," "nodule < 3 mm," or "non-nodule ≥ 3 mm." After the blinded-read phase, each radiologist evaluated his or her own mask and the masks of the other three radiologists to reach a final assessment. The LIDC/IDRI database contains a total of 244,527 CT scan images. Each CT scan typically consists of a series of axial slices, resulting in a 3D volume representation of the lungs. The size of the CT images may vary, but they are usually high-resolution axial slices with dimensions of around 512 × 512 pixels.

1.2.3 LiTS (Liver Tumor Segmentation) Dataset

The LiTS dataset is a widely used dataset for liver lesion segmentation, focusing on the segmentation of liver tumors. It provides CT scans that include both malignant and benign liver lesions, with voxel-level annotations enabling precise segmentation and analysis of liver tumors. The dataset offers a diverse collection of cases, capturing variations in tumor size, shape, location, and appearance within the liver. Automatic segmentation of tumor lesions is difficult due to their heterogeneous and diffusive structure. The LiTS dataset includes 200 liver CT scans: 130 in the training set and 70 in the test set. Each CT scan contains multiple axial slices, resulting in a large number of images for analysis. The CT images in the LiTS dataset vary in size but are usually high-resolution axial slices with dimensions around 512 × 512 pixels. To encourage the study and development of techniques for segmenting liver lesions, the LiTS Challenge was created as a competition. The competition utilizes data and CT scan slices contributed by various clinical sites worldwide.

1.2.4 KiTS (Kidney Tumor Segmentation) Dataset

The KiTS dataset is a collection of medical imaging data specifically focused on kidney tumor segmentation. More than 400,000 people are diagnosed with kidney cancer every year, and the most common therapy is surgery. There is presently growing interest in the correlation between tumor morphology and surgical outcomes, leading to the development of advanced surgical planning methods. This interest stems from the considerable diversity observed in the morphology of kidneys and renal tumors. The KiTS19 challenge, or the KiTS challenge 2019, was a contest centered on segmenting kidney tumors depicted in contrast-enhanced CT images.

The challenge aimed to encourage the development of accurate and robust algorithms for kidney tumor segmentation. The dataset comprised expertly labeled arterial-phase abdominal CT scans from 300 individuals: 70% (210 cases) were made available as a training set, while the remaining 30% (90 cases) were withheld as a test set. Automatic semantic segmentation of kidneys, renal tumors, and renal cysts is a hot topic in the medical imaging research community, and the KiTS23 challenge in 2023 pits teams against one another to see who can develop the most successful system in this area. KiTS23 is the third edition of the KiTS challenge, following previous competitions held in 2019 and 2021.

1.2.5 ISIC (International Skin Imaging Collaboration) Dataset

The ISIC dataset is commonly used for segmenting and classifying skin lesions. Segmentation and categorization of skin lesions, including malignant melanoma, benign lesions, and others, are the main topics of the ISIC dataset. It offers a diverse collection of images captured through dermoscopy, allowing for the exploration of different skin lesion characteristics, such as color, shape, texture, and size. The dataset provides ground truth annotations for lesion boundaries, enabling precise segmentation and analysis of skin lesions. The ISIC2018 dataset includes a subset specifically designed for lesion segmentation tasks. The ISIC2018 dataset for lesion segmentation consists of 2594 dermoscopic images. These images cover a variety of skin lesions, including melanoma, nevi, basal cell carcinoma, and other lesion types. The dataset enables researchers to develop and evaluate algorithms and models for automated lesion boundary delineation. The ISIC challenge is an integral part of the ISIC2018 dataset, providing a platform for researchers and data scientists to showcase their expertise in skin lesion analysis. The challenge invites participants to develop innovative solutions for various tasks, including lesion segmentation, classification, and detection. It provides an opportunity for researchers and data scientists to evaluate and compare their algorithms against a standardized benchmark, fostering collaboration and knowledge sharing in the field. By tackling the difficult problems of precise segmentation, classification, and detection of skin lesions, the ISIC Challenge hopes to improve the field of automated skin lesion analysis. Participants are encouraged to develop algorithms that can assist clinicians in diagnosing and managing skin conditions, ultimately improving patient care and outcomes.

1.2.6 BUSI (Breast Ultrasound) Dataset

The BUSI dataset is commonly employed for segmenting breast lesions in US images. Breast cancer stands as one of the primary causes of female mortality worldwide; however, death rates can be lowered with early detection. The BUSI dataset is categorized into three groups: images depicting normal, benign, and malignant conditions. Utilizing machine learning or deep learning in conjunction with BUSI images can yield promising outcomes in the classification, detection, and segmentation of breast cancer. The baseline data collection involved obtaining BUSI images from women aged 25 to 75 years. Six hundred female patients were included in the 2018 dataset. The collection contains 780 Portable Network Graphics (PNG) images with an average size of 500 × 500 pixels. Grayscale is the most common color scheme for US images. The images were gathered and saved in the Digital Imaging and Communications in Medicine (DICOM) format at Baheya Hospital. The scanning procedure employed the LOGIQ E9 US system and the LOGIQ E9 Agile US system. These state-of-the-art instruments are commonly utilized for high-quality imaging in radiology, cardiac, and vascular applications, and they offer an image resolution of 1280 × 1024 pixels. The ML6-15-D Matrix linear probe utilized transducers with a frequency range of 1–5 MHz. To enhance the usefulness of the dataset, several necessary tasks were undertaken. The dataset contained duplicated images, which were eliminated, and the annotations were thoroughly reviewed and corrected by radiologists from Baheya Hospital. To facilitate compatibility, the DICOM images were converted to PNG format using a specialized DICOM converter application. The dataset provides annotations or ground truth masks for segmenting breast lesions, enabling the evaluation of segmentation algorithms. The dataset includes US images acquired from different medical centers, introducing variability in imaging conditions, equipment, and patient populations.
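The DICOM-to-PNG conversion was performed with a dedicated converter application; its core step, rescaling arbitrary-range pixel intensities to 8-bit grayscale, can be sketched as follows. In practice the input array would come from a DICOM reader (e.g. pydicom's `dcmread(...).pixel_array`); the demo array here is synthetic:

```python
import numpy as np

def to_uint8(pixels: np.ndarray) -> np.ndarray:
    """Linearly rescale an arbitrary-range pixel array into [0, 255],
    the value range of an 8-bit grayscale PNG."""
    pixels = pixels.astype(np.float64)
    lo, hi = pixels.min(), pixels.max()
    if hi == lo:  # constant image: avoid division by zero
        return np.zeros_like(pixels, dtype=np.uint8)
    return np.round(255.0 * (pixels - lo) / (hi - lo)).astype(np.uint8)

# Synthetic 12-bit-style intensities standing in for DICOM pixel data.
demo = np.array([[0, 2048], [4095, 1024]])
print(to_uint8(demo))
```

Note that a simple min-max rescale discards the absolute DICOM intensity scale; this is acceptable for US images, where pixel values are not calibrated units.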

1.2.7 Colonoscopy Datasets

The human GI tract consists of various segments, including the large bowel, which can be affected by different anomalies and diseases, including colorectal cancer. Colorectal cancer is the third most frequent form of cancer overall, and both women and men are at risk of developing it. Polyps, which are precursors to colorectal cancer, are found in almost half of individuals aged 50 or above during screening colonoscopies, and their prevalence increases with age. Colonoscopy is considered the standard method for detecting and assessing these polyps, followed by biopsy and removal. Early detection significantly impacts the survival rate for colorectal cancer, making polyp detection crucial. Improving polyp detection can reduce the chance of developing colorectal cancer; therefore, early automated diagnosis of polyps can help prevent colorectal cancer and increase survival rates. There are multiple datasets available for colonoscopy-related research, such as the Computer Vision Center (CVC) CVC-ClinicDB, CVC-ColonDB, Kvasir-SEG, and ETIS-Larib datasets. We will briefly discuss some of them here:

1.2.7.1 Kvasir-SEG

The Kvasir-SEG dataset is tailored specifically for the segmentation of polyps in GI endoscopy images. It offers freely accessible GI polyp images accompanied by their respective segmentation masks. These annotations have been meticulously crafted and validated by an expert gastroenterologist. The dataset, with a size of 46.2 MB, comprises 1000 polyp images alongside their corresponding ground truth sourced from the Kvasir Dataset v2. The images within Kvasir-SEG exhibit varying resolutions, spanning from 332 × 487 to 1920 × 1072 pixels. Organized into two distinct folders, the dataset features image files and their associated masks sharing identical filenames. The image files utilize JPEG compression for efficient storage and seamless online browsing capabilities. Researchers and educators can easily access and download this open-access dataset for research and educational purposes. The Kvasir-SEG dataset serves as a valuable resource for researchers and developers aiming to advance the field of polyp segmentation, detection, localization, and classification. With the need for multiple datasets to compare computer vision algorithms, this dataset can be used effectively for both training and validation purposes. It enables the development of cutting-edge solutions specifically tailored to colonoscopy images captured by various manufacturers. Further exploration in this area holds the potential to reduce the polyp miss rate and enhance the quality of examinations. Moreover, the versatility of the Kvasir-SEG dataset extends beyond its primary focus, making it suitable for general segmentation and bounding box detection research. Its availability alongside numerous datasets from diverse fields, including medical and nonmedical domains, adds to its value and applicability.
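Because Kvasir-SEG stores images and masks in two folders under identical filenames, pairing them for training reduces to a filename join. A hedged sketch follows; the folder names and the tiny placeholder files created for the demo are illustrative, not the dataset's actual contents:

```python
from pathlib import Path
import tempfile

def pair_images_and_masks(image_dir: Path, mask_dir: Path):
    """Match each image to the mask with the same filename, skipping
    any image whose mask is missing."""
    pairs = []
    for image_path in sorted(image_dir.glob("*.jpg")):
        mask_path = mask_dir / image_path.name  # same name, mask folder
        if mask_path.exists():
            pairs.append((image_path, mask_path))
    return pairs

# Tiny self-contained demo with empty placeholder files.
root = Path(tempfile.mkdtemp())
(root / "images").mkdir()
(root / "masks").mkdir()
for name in ("a.jpg", "b.jpg"):
    (root / "images" / name).touch()
    (root / "masks" / name).touch()

pairs = pair_images_and_masks(root / "images", root / "masks")
print(len(pairs))  # 2
```

Sorting the image list keeps the (image, mask) pairs in a deterministic order, which makes train/validation splits reproducible across runs.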

1.2.7.2 CVC-ClinicDB and CVC-ColonDB