Tools, Languages, Methodologies for Representing Semantics on the Web of Things

Description

This book is a guide to the combination of the Internet of Things (IoT) and the Semantic Web, covering a variety of tools, technologies and applications that serve the needs of researchers in this field. It provides a multidimensional view of the concepts, tools, techniques and issues involved in the development of semantics for the Web of Things.

The aspects studied in this book include Semantic Hybrid Multi-Model Multi-Platform (SHM3P) databases for the IoT, clustering techniques for discovery services for the semantic IoT, dynamic security testing methods for the Semantic Web of Things, Semantic Web-enabled IoT integration for a smart city, IoT security issues, the role of the Semantic Web of Things in Industry 4.0, the integration of the Semantic Web and the IoT for e-health, smart healthcare systems to monitor patients, and Semantic Web-based ontologies for the water domain, science fiction and job searching.

Page count: 378

Year of publication: 2022


Table of Contents

Cover

Title Page

Copyright Page

Preface

1 The Role of Semantic Hybrid Multi-Model Multi-Platform (SHM3P) Databases for IoT

1.1. Introduction

1.2. Databases for multi-model data

1.3. Platforms

1.4. Variations of SHM3P DBMS

1.5. What are the benefits of SHM3P databases for IoT?

1.6. Summary and conclusions

1.7. References

2 A Systematic Review of Ontologies for the Water Domain

2.1. Introduction

2.2. Literature review

2.3. Applications of ontologies in the water domain

2.4. Discussion and conclusion

2.5. References

3 Semantic Web Approach for Smart Health to Enhance Patient Monitoring in Resuscitation

3.1. Introduction

3.2. Background

3.3. IoT Smart Health applications and semantics

3.4. Proposed approach and implementation

3.5. Conclusion

3.6. References

4 Role of Clustering in Discovery Services for the Semantic Internet of Things

4.1. Introduction

4.2. Discovery services in IoT

4.3. Semantic-based architectures

4.4. Discovery services and clustering

4.5. Clustering methods in IoT

4.6. Conclusion

4.7. References

5 Dynamic Security Testing Techniques for the Semantic Web of Things

5.1. Introduction

5.2. Related studies

5.3. Background of dynamic security testing techniques

5.4. DAST using static analysis

5.5. DAST using user session

5.6. DAST using Extended Tainted Mode Model

5.7. Current issues and research directions

5.8. Conclusion

5.9. References

6 SciFiOnto

6.1. Introduction

6.2. Literature survey

6.3. Modeling and evaluation of the ontology

6.4. Automatic Knowledge Acquisition model

6.5. Conclusion

6.6. References

7 Semantic Web-Enabled IoT Integration for a Smart City

7.1. Introduction: Semantic Web and sensors

7.2. Motivation and challenge

7.3. Literature review

7.4. Implementation of forest planting using SPARQL queries

7.5. Conclusion

7.6. References

8 Heart Rate Monitoring Using IoT and AI

8.1. Introduction

8.2. Literature survey

8.3. Heart rate monitoring system

8.4. Results and discussion

8.5. Conclusion and future works

8.6. References

9 IoT Security Issues and Its Defensive Methods

9.1. Introduction

9.2. IoT security architecture

9.3. Specific security challenges and approaches

9.4. Methodologies used for securing the systems

9.5. Conclusion

9.6. References

10 Elucidating the Semantic Web of Things for Making the Industry 4.0 Revolution a Success

10.1. Introduction

10.2. Correlation of the Semantic Web of Things with IR4.0

10.3. Smart manufacturing system and ontologies

10.4. Literature survey

10.5. Conclusion and future work

10.6. References

11 Semantic Web and Internet of Things in e-Health for Covid-19

11.1. Introduction

11.2. Dataset

11.3. Application of IoT for Covid-19

11.4. Semantic Web applications for Covid-19

11.5. Limitations and challenges of IoT and SW models

11.6. Discussion

11.7. Conclusion

11.8. References

12 Development of a Semantic Web Enabled Job_Search Ontology System

12.1. Introduction

12.2. Review of the related work done for online recruitment

12.3. Design of “SearchAJob” ontology for the IT domain

12.4. Implementing the proposed ontology

12.5. Benefits of Semantic Web enabled SearchAJob system

12.6. Conclusion and future scope

12.7. References

List of Authors

Index

Other titles from iSTE in Interdisciplinarity, Science and Humanities

End User License Agreement

List of Tables

Chapter 2

Table 2.1.

Sources to collect the related research

Table 2.2.

A comprehensive review of existing research

Table 2.3.

Selected ontologies for the water domain

Table 2.4.

Ontology applications in water domains

Table 2.5.

Reused concepts in existing water ontology

Chapter 5

Table 5.1.

The state-of-the-art comparison of existing works. P1: static ana

...

Chapter 6

Table 6.1.

Class hierarchy

Table 6.2.

RDF schema of the ontology

Table 6.3.

Quantitative data of the modeled Sci-Fi ontology

Table 6.4.

Qualitative evaluation

Table 6.5.

Hybrid evaluation of the manually conceived Sci-Fi domain ontolog

...

Table 6.6.

Algorithm for RDF mapping using the Binomial Deep Neural Network

...

Table 6.7.

Evaluation results for Automatic Knowledge Base Generation

Table 6.8.

Quantitative aspects of the obtained automatically generated onto

...

Chapter 8

Table 8.1.

Comparison of literature review

Chapter 9

Table 9.1.

Properties and requirements of centralized and distributed approa

...

Chapter 10

Table 10.1.

Semantic Web technologies in IR4.0

Table 10.2.

Future scope of SWoT

Chapter 12

Table 12.1.

Requirement of online recruitment system

List of Illustrations

Chapter 1

Figure 1.1.

Various data models used in today’s companies.

Figure 1.2.

SHM3P database spanning over multiple platforms. Here, an SHM3P

...

Figure 1.3.

Different types of DBMS running on different platforms and hardw

...

Figure 1.4.

Hardware architectures and properties. Figure is based on Groppe

...

Figure 1.5.

Taxonomy of different types of databases toward SHM3P databases.

Chapter 2

Figure 2.1.

Three-step methodologies to conduct systematic review

Figure 2.2.

Parameters covered by conducted studies

Chapter 3

Figure 3.1.

An overview of proposed approach

Figure 3.2.

Ontology class hierarchy.

Figure 3.3.

Class hierarchy, object/data properties and individuals of devel

...

Figure 3.4.

Binary relationships diagram.

Figure 3.5.

Query all patients who are older than 60 years and have hospital

...

Figure 3.6.

Rules editor

Figure 3.7.

SWRLTab and Rules execution

Chapter 4

Figure 4.1.

Usecase highlighting the importance of directory-based discovery

...

Figure 4.2.

Attributed-based distributed directory discovery service archite

...

Figure 4.3.

Requirements for designing clustering methods in the IoT and com

...

Chapter 5

Figure 5.1.

Black Box testing

Figure 5.2.

Pro and cons of Dynamic Application Security Testing

Figure 5.3.

How SAST works with DAST

Figure 5.4.

The workings of DAST with user session

Figure 5.5.

The process of generation of test cases

Figure 5.6.

Workings of the Tainted Model

Chapter 6

Figure 6.1.

Entity graph for a sample class.

Figure 6.2.

Ontology visualization of a snippet of the SciFiOnto.

Figure 6.3.

Qualitative evaluation datasheet.

Figure 6.4.

System architecture of RDF-driven automatic ontology generator t

...

Figure 6.5.

Entity graph for one of the classes.

Figure 6.6.

Proposed Binomial Neural Network Architecture

Figure 6.7.

Hybrid evaluation results chart.

Chapter 7

Figure 7.1.

Conceptual diagram of forest planting space

Figure 7.2.

Active ontology URI.

Figure 7.3.

Classes of PlantingSpace ontology.

Figure 7.4.

Object properties.

Figure 7.5.

Data properties.

Figure 7.6.

Forest planting using IoT system flow

Figure 7.7.

Apache Jena Fuseki with SPARQL query.

Figure 7.8.

Dashboard.

Figure 7.9.

SPARQL result of PARK.

Chapter 8

Figure 8.1.

The layout of the Arduino UNO development board

Figure 8.2.

System architecture

Figure 8.3.

System flow chart

Figure 8.4.

a) Proposed Prototype System with entire setup and electronic co

...

Figure 8.5.

Visualizations and storage of a) recorded heartbeat, b) temperat

...

Figure 8.6.

Case – 1: Recorded a) heartrate and b) temperature are normal.

...

Figure 8.7.

Case – 2: Recorded a) heartrate and b) temperature are high (not

...

Figure 8.8.

Visualizations (in mobile) of recorded a) heartbeat, b) temperat

...

Chapter 9

Figure 9.1.

Basic security issues (Shea 2021)

Figure 9.2.

Typical IoT security architecture (Zhang et al. 2019).

Figure 9.3.

Centralized IoT architecture (Aggarwal et al. 2021).

Figure 9.4.

Distributed IoT architecture (Aggarwal et al. 2021).

Figure 9.5.

IoT security architecture showing trust zones and boundaries (Ca

...

Figure 9.6.

Azure IoT security architecture using threat modeling (Shahan et

...

Chapter 10

Figure 10.1.

Versions of the Industrial Revolution.

Figure 10.2.

Industrial Revolution 4.0 and its related technologies.

Figure 10.3.

Key components of smart machines.

Figure 10.4.

Semantic Web of Things schematic showing the end users and appl

...

Figure 10.5.

Definition of ontology

Figure 10.6.

Semantic Web-based solution for end-to-end integration

Figure 10.7.

Nine pillars of IR4.0.

Chapter 12

Figure 12.1.

IT jobs hierarchy

Figure 12.2.

Data flow diagram of the job search process

Figure 12.3.

Class – instance description of ontology in Protégé.

...

Figure 12.4.

IT job ontology proposed structure

Figure 12.5.

Architecture of ontology

Figure 12.6.

Select query execution on the server.

Figure 12.7.

Insert query execution on the server.

Figure 12.8.

Delete SPARQL query on the SearchAJob dataset.

Figure 12.9.

Including EasyRdf library in PHP.

Figure 12.10.

Execution of insert query using EasyRdf library in PHP.

Figure 12.11.

SearchAJob system portal developed in PHP.

Figure 12.12.

SearchAJob system employer’s dashboard.

Figure 12.13.

SearchAJob system jobseeker’s dashboard.

Figure 12.14.

Job offers posting by employer.

Figure 12.15.

Job searching by jobseeker.


Series Editor: Patrick Siarry

Tools, Languages, Methodologies for Representing Semantics on the Web of Things

Edited by

Shikha Mehta
Sanju Tiwari
Patrick Siarry
M.A. Jabbar

First published 2022 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd, 27-37 St George’s Road, London SW19 4EU, UK

John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

www.iste.co.uk

www.wiley.com

© ISTE Ltd 2022
The rights of Shikha Mehta, Sanju Tiwari, Patrick Siarry and M.A. Jabbar to be identified as the authors of this work have been asserted by them in accordance with the Copyright, Designs and Patents Act 1988.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s), contributor(s) or editor(s) and do not necessarily reflect the views of ISTE Group.

Library of Congress Control Number: 2022940900

British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISBN 978-1-78630-764-4

Preface

The digital revolution has led to the exponential proliferation of data across all domains. It is expected to mushroom further with the mounting utility of Internet of Things (IoT) devices. IoT is a technological revolution that provides the basic structure for next generation applications and services for routine activities, by integrating a huge number of distributed and heterogeneous devices. However, interoperability is the key challenge faced by contemporary IoT applications, due to proprietary formats and a lack of standardization. Interoperability among heterogeneous devices can be attained by developing formal semantic representations and technologies with appropriate abstraction levels. Semantic Web technologies such as ontologies, semantic web services, annotations, etc., are seen as potential solutions to solve the data interoperability issue for real-world problems.

The amalgamation of IoT with the Semantic Web has given rise to a new dimension: the Semantic Web of Things (SWoT). The efficacy of Semantic Web technologies has already been established in diverse application domains. Developments in the field of IoT interoperability, however, are still in their infancy. Thus, there is immense scope for deploying the concepts, tools and technologies of the Semantic Web to handle the issues prevailing in IoT applications. The SWoT would help in the development of knowledge-based systems with enormous autonomic competence for the storage and management of information, discovery of devices, etc., for diverse applications. Semantic analytics in IoT applications is an emerging trend for discovering useful patterns to develop new business strategies.

This book focuses on the design and development of tools, technologies, frameworks, architectures and applications of the SWoT to handle the challenges posed in various applications. The objective is to accentuate the usability and performance of these techniques in dealing with problems in emerging areas. The book brings together contributions that present innovative techniques, cutting-edge systems and novel applications supported by experimental results that reflect the current advances in these domains.

This book will serve as a research guide for graduate students and researchers in the field. The concepts, tools and technologies discussed in this book will assist practitioners in understanding the latest developments in the field. The theme of the book is in sync with the advanced technologies, systems and applications, which can be of immense benefit when solving research problems in diverse domains.

The work presented in this book is divided into 12 chapters. Together, these chapters cover a wide range of concepts and applications from the semantic IoT perspective, such as the role of Semantic Hybrid Multi-Model Multi-Platform (SHM3P) databases for the IoT, the role of clustering in device discovery for the semantic IoT, Industry 4.0, patient monitoring, job search, ontologies in the water domain, smart cities, heart rate monitoring and security.

Chapter 1, The Role of Semantic Hybrid Multi-Model Multi-Platform (SHM3P) Databases for IoT, provides a model for integrating Database Management Systems (DBMSs) across a variety of platforms and applications. A semantic layer acts as glue between the platforms and subsystems to design multi-platform models. This chapter also elaborates on the need for additional computing and storage capabilities, robustness and integration with mobile applications.

Chapter 2, A Systematic Review of Ontologies for the Water Domain, focuses on Semantic Web technologies for water resources management. This chapter provides a systematic review of ontologies for water resources along with their features and applications.

Chapter 3, Semantic Web Approach for Smart Health to Enhance Patient Monitoring in Resuscitation, applies Semantic Web technologies to monitor resuscitation patients. This work integrates Semantic Web tools with the IoT to monitor patients automatically, with the aim of improving healthcare, reducing errors and improving the patient experience. This chapter proposes a knowledge representation and reasoning framework that semantically annotates data in order to analyze the semantics of vital signs monitors and the data that come from them.

Chapter 4, Role of Clustering in Discovery Services for the Semantic Internet of Things, presents a detailed study of various discovery service architectures in the IoT. The discovery services are mainly categorized into three types: directory-based, directory-less and semantic-based. Semantics play an important role in building intelligent IoT systems. This chapter highlights the importance of clustering services and devices when designing discovery services, in order to reduce search space and lookup time.

Chapter 5, Dynamic Security Testing Techniques for the Semantic Web of Things: Market and Industry Perspective, presents the algorithms and approaches designed by researchers for dynamic security testing, including taint analysis and static analysis, to improve the quality of test cases.

Chapter 6, SciFiOnto: Modeling, Visualization and Evaluation of Science Fiction Ontologies Based on Indian Contextualization with Automatic Knowledge Acquisition, models ontologies for an interesting domain: science fiction. This model incorporates the extensive vocabulary developed by referring to science fiction literature based on Generation Z or millennials. This chapter presents an integrated framework based on a Binomial Deep Neural Network to densely populate the entities in the existing Science Fiction Ontology.

Chapter 7, Semantic Web-Enabled IoT Integration for a Smart City, focuses on assessing the Forest Planting Ontology (FPO) using the IoT in smart cities such as New York. The performance of these ontologies has been evaluated using multiple test cases with tools such as Protégé and Apache Jena Fuseki.

Chapter 8, Heart Rate Monitoring Using IoT and AI, applies IoT and Artificial Intelligence (AI) techniques to monitor the heart rate of patients. The chapter presents a small, chip-like, portable IoT device for heart patients. The device continuously monitors the temperature and heartbeat of patients and sends these signals to the cloud for timely interventions.

Chapter 9, IoT Security Issues and Its Defensive Methods, discusses IoT security architectures for various layers. It also highlights the blockchain-based IoT security system and emphasizes its importance for IoT device management. IoT devices connected to the blockchain database would be protected by a device identification-based key mechanism. In the future, a multilayer architecture for distributed and centralized IoT with blockchain technology may be developed.

Chapter 10, Elucidating the Semantic Web of Things for Making the Industry 4.0 Revolution a Success, collates the latest developments in the SWoT that have contributed significantly to the accomplishment of the Industrial Revolution 4.0. It is observed that the majority of the advancements have been made in the manufacturing engineering and product engineering domains.

Chapter 11, Semantic Web and Internet of Things in e-Health for Covid-19, highlights the utility of the Semantic Web and the IoT in developing e-health applications for the prediction of Covid-19. This chapter presents an in-depth analysis of the various frameworks and architectures designed by integrating the Semantic Web with the IoT.

Chapter 12, Development of a Semantic Web Enabled Job_Search Ontology System, presents a “SearchAJob” system using Semantic Web technologies. The job search ontology provides the benefit of executing a single query across multiple domains, in order to extract information regarding job recruitment opportunities from multiple ontologies with a single click.

Shikha MEHTA
Sanju TIWARI
Patrick SIARRY
M.A. JABBAR
May 2022

1 The Role of Semantic Hybrid Multi-Model Multi-Platform (SHM3P) Databases for IoT

To overcome the difficulties of today’s zoo of data models stored and processed in companies, multi-model databases offer a simple way to access and query the data stored using different models. In contrast to other data models, the semantic model introduces an additional abstraction layer for reasoning purposes, offering superior possibilities for data integration. Hence, the semantic model is best suited to act as glue between different data models. Today’s companies use various platforms to run their applications and databases, such as mobile devices, web, desktops, servers (hardware-accelerated by GPUs (Graphics Processing Units), FPGAs (Field-Programmable Gate Arrays) and, in the future, QPUs (Quantum Processing Units)), clouds and post-clouds (e.g. fog and edge computing). In this chapter, we discuss the possibilities that so-called semantic hybrid multi-model multi-platform (SHM3P) databases offer for the Internet of Things (IoT). These databases use semantic technologies as glue to integrate different data models and run on various platforms, offering the best features of the various data models and platforms.

1.1. Introduction

Today, companies use data in various formats (see Figure 1.1). Web shops are connected to relational databases containing customers and their orders. To exchange data, the product catalogs of companies are serialized and transmitted in the XML, JSON or RDF data formats. Graph data is frequently processed due to the importance of social networks today. Unstructured data dominates in social media, such as wikis. Key-value stores are widely used because they retrieve data simply by key. Schema-free or schema-less databases are the preferred way to store unstructured data, because they do not force the data into the inflexible corset of a schema. Document stores even support complex data formats in the absence of schemas. The data are hence stored according to, and processed using, different models (multi-model data; see Lu and Holubová (2019)). The big challenge for today’s companies is the synchronization and integration of their multi-model data into a single view of and for the customer (see Kotorov (2003)). Multi-Model Database Management Systems (MM-DBMSs) manage different data models in one single database (see Lu and Holubová (2019)). The alternative architecture principle is polyglot persistence, where applications use several databases at the same time to handle multi-model data (Leberknight 2008). The big disadvantage of polyglot persistence is that it inherits the limitations of the individual databases: for example, queries and rules are only optimized within one database, but not across connected Database Management Systems (DBMSs) (see Groppe and Groppe (2020) and Groppe (2021)). In Groppe (2021), we propose using the semantic data model to unify the other data models: the semantic data model supports ontologies as an additional abstraction layer, which best suits the purpose of integrating the other data models.

Figure 1.1. Various data models used in today’s companies.

Traditionally, DBMSs ran mainly on parallel servers. Today, they operate on a variety of platforms that offer execution environments for DBMSs, such as mobile devices, web, desktops, servers (possibly hardware-accelerated by GPUs, FPGAs and emerging technologies such as quantum computing), clouds and post-clouds (e.g. fog and edge computing)¹.

Recent trends in programming languages like Kotlin (see JetBrains s.r.o. (2020)) include multi-platform development support to share common code between different platforms such as desktop, server, web, mobile and IoT. In this way the development costs are drastically reduced for a DBMS running on multiple platforms. For example, the Semantic Web DBMS luposdate3000 developed in Kotlin is designed to run fast on parallel servers utilizing the Java virtual machine (JVM) (see Warnke et al. (2021)), and also offers a web app that runs completely in the web browser (see Groppe et al. (2021a,b)). Furthermore, another target platform is the IoT, where luposdate3000 is running on the edge (see Warnke (2022)) with efficient indexing schemes, as proven by experiments with the simulator SIMORA (see Warnke et al. (2022)).

Connecting all of these pieces, M3P, HM3P and SHM3P DBMSs are defined as follows (see Groppe (2021)):

DEFINITION 1.1 (M3P/HM3P/SHM3P DBMS).–

A multi-model multi-platform database management system (M3P DBMS) is an MM-DBMS that can be executed on different platforms. A hybrid M3P (HM3P) DBMS spans over different platforms in operation. A Semantic HM3P (SHM3P) DBMS supports a (global) semantic layer (for querying and reasoning purposes) over all platforms of an HM3P DBMS.
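The three levels of Definition 1.1 build on each other, which can be made explicit with a small classification routine. The following Python sketch is only an illustration of the definition: the `DbmsProfile` record, its field names and the `classify` function are hypothetical and do not come from Groppe (2021) or from any existing DBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DbmsProfile:
    """Hypothetical description of a DBMS (all field names are illustrative)."""
    data_models: frozenset         # natively supported data models
    runnable_on: frozenset         # platforms the DBMS can be executed on
    spans_in_operation: frozenset  # platforms one running installation spans
    global_semantic_layer: bool    # semantic layer over all spanned platforms


def classify(p: DbmsProfile) -> str:
    """Return the most specific class of Definition 1.1 that the profile meets."""
    if len(p.data_models) < 2:
        return "single-model DBMS"
    if len(p.runnable_on) < 2:
        return "MM-DBMS"       # multi-model, but bound to one platform
    if len(p.spans_in_operation) < 2:
        return "M3P DBMS"      # executable on several platforms, one at a time
    if not p.global_semantic_layer:
        return "HM3P DBMS"     # one installation spans several platforms
    return "SHM3P DBMS"        # plus a global semantic layer


models = frozenset({"relational", "graph", "rdf"})
platforms = frozenset({"server", "iot-edge"})

m3p = DbmsProfile(models, platforms, frozenset({"server"}), False)
shm3p = DbmsProfile(models, platforms, platforms, True)
print(classify(m3p), "/", classify(shm3p))
```

Each check corresponds to one sentence of the definition, so a profile that fails an earlier check is never reported as belonging to a later, more specific class.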

Today’s M3P DBMSs are usually developed for platforms of the same type (such as servers running Windows or Linux; see Groppe and Groppe (2020) and Groppe (2021)). Only a few of them support hybrid clouds, which integrate a (locally installed) private cloud with a public cloud². In contrast, we envision SHM3P DBMSs operating over platforms of different types (such as IoT and hardware-accelerated parallel servers). In this way, the features of different types of databases developed for different platforms can be supported (such as energy saving on IoT devices and high throughput on servers). For semantic DBMSs, advanced global reasoning capabilities spanning all platforms need to be developed. Hence, SHM3P databases support any data model on any platform by tightly integrating them with a semantic layer (see Groppe (2021)). For an example installation, see Figure 1.2.

Figure 1.2. SHM3P database spanning over multiple platforms. Here, an SHM3P database replaces an IoT database in an Industry 4.0 scenario (using edge-computing), a GPU-accelerated parallel database (on a parallel server) for archiving and generating long-term statistics of the IoT data, which is further supported by a quantum computer for query and reasoning optimization, a database in the cloud for natural language processing tasks and a mobile database (on mobile devices and infrastructure) for monitoring and controlling the production line in the company. Platforms are marked using italic font. Green text marks discussion about reasoning in these scenarios

(source: Groppe (2021)).

1.2. Databases for multi-model data

Polyglot persistence uses different databases supporting different data models (and possibly running on different platforms) within one application (see Leberknight (2008)). Federated query languages are needed to formulate a single query over heterogeneous data stores. Examples of federated query languages include the following:

– CloudMdsQL (see Kolev et al. (2016)), which can be used to formulate queries over SQL and NoSQL databases integrated in a prototype with the support of global optimization, and push operations down to the integrated SQL and NoSQL databases as much as possible.

– Zhu and Risch (2011) propose a system to query cloud-based NoSQL stores such as Google’s Bigtable, together with relational databases, using the Google Bigtable query language GQL.

– Apache Drill³ supports interactive ad hoc analysis of large-scale datasets with low latency, handling up to petabytes of data spread across thousands of servers. Drill’s optimization techniques include leveraging the datastore’s internal processing capabilities in query plans and considering data locality for best query performance.
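The idea shared by these systems, pushing as much work as possible down to the integrated stores and combining only the partial results in the federation layer, can be sketched in a few lines of Python. This is a toy illustration and not the API of CloudMdsQL, GQL or Apache Drill: an in-memory SQLite database stands in for a relational source, a dictionary stands in for a key-value store, and all table, field and function names are assumptions made for the example.

```python
import sqlite3

# Source 1: a relational store (in-memory SQLite stands in for any SQL database).
sql_db = sqlite3.connect(":memory:")
sql_db.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
sql_db.executemany("INSERT INTO orders VALUES (?, ?)",
                   [(1, 120.0), (2, 15.5), (1, 40.0), (3, 300.0)])

# Source 2: a key-value store (a dict stands in for any NoSQL store).
kv_store = {1: {"name": "Ada", "country": "DE"},
            2: {"name": "Bob", "country": "FR"},
            3: {"name": "Cy", "country": "DE"}}


def sql_source(min_total):
    """Push aggregation and selection down to the SQL engine."""
    rows = sql_db.execute(
        "SELECT customer_id, SUM(total) FROM orders "
        "GROUP BY customer_id HAVING SUM(total) >= ? ORDER BY customer_id",
        (min_total,))
    return rows.fetchall()


def kv_source(country):
    """Push the country filter down to the key-value side."""
    return {key: value["name"] for key, value in kv_store.items()
            if value["country"] == country}


def federated_query(country, min_total):
    """Join the two pre-filtered partial results in the federation layer."""
    names = kv_source(country)
    return [(names[cid], total)
            for cid, total in sql_source(min_total) if cid in names]


result = federated_query("DE", 100.0)
print(result)
```

Only the join is executed in the federation layer; both filters and the aggregation run inside the sources, so less data has to be moved between the stores and the federation layer.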

The integration of diverse data sources by using database connectors (like JDBC drivers) is widely used in commercial multi-store products such as IBM BigInsights, Microsoft HDInsight and Oracle Big Data Appliance, as well as in open source projects like PrestoDB⁴. Tatooine (see Bonaque et al. (2016)) performs semantic integration using a semantic layer as glue between databases for different data models. However, data processing is limited in all of these polystores, because they do not fully support the optimization of queries across the integrated, but independent, data sources.

There is a long history of federated databases (see Hammer and McLeod (1979)) and multi-databases (see Smith et al. (1981)). Their architectures contain a mediator between different autonomous databases. This mediator integrates different databases and data sources by reformulating queries according to a global schema. The reformulated queries follow the native schemas of the integrated databases, which evaluate them. Today, some research efforts on federating databases follow the polyglot persistence approach: for example, DBMS+ (see Lim et al. (2013)) provides unified declarative processing for the integration of several processing and database platforms. Location transparency is offered by BigDAWG (see Elmore et al. (2015)), which runs queries against its different integrated systems PostgreSQL, SciDB and Accumulo.
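The role of the mediator can be illustrated with a minimal Python sketch. It assumes two autonomous sources with different native attribute names and a hand-written mapping from a global schema; the source names, attribute names and functions are hypothetical and are not taken from any of the cited systems.

```python
# Hypothetical mappings from a global schema to the native schema of each source.
MAPPINGS = {
    "crm":  {"customer": "cust_name", "city": "town"},
    "shop": {"customer": "buyer", "city": "ship_city"},
}

# Autonomous sources, each storing records under its own native attribute names.
SOURCES = {
    "crm":  [{"cust_name": "Ada", "town": "Kiel"},
             {"cust_name": "Bob", "town": "Bonn"}],
    "shop": [{"buyer": "Cy", "ship_city": "Kiel"},
             {"buyer": "Dee", "ship_city": "Ulm"}],
}


def reformulate(global_query, mapping):
    """Rewrite a query on global attribute names into a source's native names."""
    return {mapping[attr]: value for attr, value in global_query.items()}


def evaluate(records, native_query):
    """Stand-in for the source's own query engine: conjunctive equality match."""
    return [r for r in records
            if all(r.get(k) == v for k, v in native_query.items())]


def mediate(global_query):
    """Query every source in its native schema and answer in global terms."""
    answers = []
    for name, mapping in MAPPINGS.items():
        inverse = {native: glob for glob, native in mapping.items()}
        for record in evaluate(SOURCES[name], reformulate(global_query, mapping)):
            answers.append({inverse[k]: v for k, v in record.items()})
    return answers


answers = mediate({"city": "Kiel"})
print(answers)
```

The sources never see the global schema: each one evaluates a query phrased in its own attribute names, and the mediator translates the results back before returning them.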

Multi-Model Databases: A multi-model database is one single database for multiple data models, with a fully integrated backend that offers advanced performance, scalability and fault tolerance (see Lu et al. (2018)). Object-Relational DataBase Management Systems (ORDBMSs) were among the first of this type, supporting various data models such as relational, text, XML, spatial and object. ORDBMSs are based on relational databases, and other data models are pressed into the relational data model for the purpose of integration: the relational model is the first-class citizen. In comparison, in multi-model databases in general, the different models can all be first-class citizens and are supported natively (utilizing, e.g., specialized indices for them). Holubova and Scherzinger (2020) propose the use of a semantic layer as glue between the different data models, in order to support global querying and reasoning over all data. We extend this idea to multi-platform databases integrating the technologies and features of different types of databases.

An overview of current state-of-the-art multi-model databases is provided in Groppe and Groppe (2020). Groppe (2021) discusses the importance of Semantic Web data in multi-model databases: while support for graph data is quite popular (12 of 21 MM-DBMSs), only five support RDF as a data model, and these either do not support reasoning at all or support it only in a rudimentary way. Users and applications with reasoning demands hence rely on native semantic DBMSs. W3C (2001) contains a selection of 18 widely used native Semantic Web tools, including triple stores and Semantic Web databases. For over half of these tools, Java is the dominant programming language (six of these tools run on any platform with Java support and four support Java language bindings). Semantic Web tools with native binaries usually run on any desktop and server computer, some only on Linux operating systems. Most multi-model databases run SQL, SQL-like or extended SQL queries. Binaries of these databases are offered in machine code (often compiled from C/C++) or for the JVM. They usually run on all or a large subset of the major server operating systems: Linux, Windows, macOS, Unix and their variants. A few multi-model databases, such as IBM DB2, still operate on mainframes running, for example, z/OS. While all of them can run in the cloud, some are also enabled for the hybrid cloud.

An HM3P DBMS extends the idea of multi-model databases and supports multiple types of platforms like main-memory, cloud, IoT (with, for example, edge computing) and hardware-accelerated databases using their different advantages at runtime for database tasks such as data distribution, transaction handling and query processing. An SHM3P DBMS supports semantic layers as glue between the different data models, and supports global semantic querying and reasoning by tightly integrating local query engines and reasoners.
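As a minimal sketch of the semantic layer idea, the following Python fragment lifts records from two hypothetical local stores (relational-style accounting rows and key/value sensor readings) into subject-predicate-object triples and answers one global query over both. The store contents, the predicate names and all helper functions are illustrative assumptions and do not correspond to any system discussed in this chapter.

```python
# Hypothetical local stores with different data models.
accounting_rows = [  # relational-style rows
    {"asset_id": "pump-1", "remaining_months": 14},
    {"asset_id": "pump-2", "remaining_months": 3},
]
sensor_kv = {  # key/value readings from IoT devices
    "pump-1/temperature": 71.5,
    "pump-2/temperature": 93.0,
}

def rows_to_triples(rows):
    """Lift relational rows into (subject, predicate, object) triples."""
    for row in rows:
        yield (row["asset_id"], "remainingMonths", row["remaining_months"])

def kv_to_triples(kv):
    """Lift key/value pairs of the form 'device/property' into triples."""
    for key, value in kv.items():
        device, prop = key.split("/", 1)
        yield (device, prop, value)

def global_view():
    """The semantic layer: one triple set over all local stores."""
    return set(rows_to_triples(accounting_rows)) | set(kv_to_triples(sensor_kv))

def assets_to_inspect(triples, max_months, min_temp):
    """Global query joining accounting and sensor data on the subject."""
    months = {s: o for s, p, o in triples if p == "remainingMonths"}
    temps = {s: o for s, p, o in triples if p == "temperature"}
    return sorted(s for s in months.keys() & temps.keys()
                  if months[s] <= max_months and temps[s] >= min_temp)

result = assets_to_inspect(global_view(), max_months=6, min_temp=90.0)
```

In a real system the local engines would keep their native storage and indices, and the semantic layer would translate global queries rather than copy all data into one triple set as this sketch does.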

There is a need to integrate data from different types of databases running on different platforms. For example, we may want to combine the data of IoT devices (stored in an IoT database running on the edge of the network) with accounting data containing the remaining time for charging off (stored in a main memory database). For advanced processing of different types of data stored in different databases, and for other database tasks, it is indispensable to break the boundaries of the single installations of these DBMSs and run one single DBMS. Furthermore, a semantic layer between the different databases supports advanced processing and reasoning capabilities and a tight integration of the different data models. This would also allow us to offer the best features of the different types of databases to applications and users “under one roof”, either transparently or through an intelligent integration into one query language and Application Programming Interface (API). According to Groppe (2021), this single SHM3P DBMS installation runs over all platforms at the same time and offers the advantages of all the different types of DBMSs (for the data that was previously processed by the single installations), tightly integrated in a semantic layer. In addition, it enables, for example, a global optimization of data distribution, transaction handling, global queries and reasoning tasks that reaches its full potential because processing is free to go down to the physical layer (e.g. index accesses)5. One obvious effect of developing one single SHM3P DBMS is a reduction in the costs of applications and in the training time of developers, since it offers one API and query language with an additional semantic layer for all the different platforms. For SHM3P DBMSs, it is very challenging, but also very promising, to provide a global distributed reasoner integrating different types of reasoners on different platforms. The advantage of a global reasoner is global optimization for a heterogeneous environment based on a global cost model that considers different costs, for example, communication, processing and the lifetime of IoT devices.
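To make the notion of a global cost model more concrete, the following sketch scores candidate placements of one operator by a weighted sum of communication, processing and device-lifetime costs and picks the cheapest one. The `Placement` type, the weights and all numbers are hypothetical; an actual cost model would be calibrated against the platforms involved.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Placement:
    """Hypothetical estimate for running one operator on one platform."""
    platform: str
    communication_ms: float   # time to ship inputs and outputs
    processing_ms: float      # time to compute on that platform
    battery_drain_pct: float  # impact on the lifetime of IoT devices

def global_cost(p, w_comm=1.0, w_proc=1.0, w_life=50.0):
    """Weighted sum of the three cost components (weights are assumptions)."""
    return (w_comm * p.communication_ms
            + w_proc * p.processing_ms
            + w_life * p.battery_drain_pct)

def cheapest(placements, **weights):
    """Pick the placement with the lowest global cost."""
    return min(placements, key=lambda p: global_cost(p, **weights))

candidates = [
    Placement("iot-device", communication_ms=0.0, processing_ms=400.0, battery_drain_pct=2.0),
    Placement("edge-server", communication_ms=20.0, processing_ms=60.0, battery_drain_pct=0.1),
    Placement("cloud", communication_ms=180.0, processing_ms=10.0, battery_drain_pct=0.1),
]
best = cheapest(candidates)
```

With these made-up numbers the edge server wins; setting `w_comm` and `w_life` to zero would favor the cloud, which illustrates how the weighting of the cost components drives the global decision.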

1.3. Platforms

Databases have been developed for many different platforms. Please see Figure 1.3 for a taxonomy of different types of DBMS running on different types of platforms. In this section, we briefly introduce the different platforms running execution environments for different types of DBMSs.

Small- to medium-sized enterprises (SMEs) usually buy, deploy and run their own server platforms for their database servers. These DBMSs are typically centralized parallel databases utilizing multi-core and sometimes many-core systems, often in virtual machines. Server platforms are the dominating platforms not only for relational DBMSs, but also for most Semantic Web DBMSs and reasoners. All other types of DBMSs, including the distributed ones, usually offer a local mode to run on a single server.

Figure 1.3. Different types of DBMS running on different platforms and hardware technologies.

The massive parallelism of special hardware beyond today’s multi-core CPUs, such as many-core systems including GPUs, FPGAs and, in the future, quantum computers, speeds up databases in hardware-accelerated servers. Figure 1.4 provides an overview of the different types of hardware accelerators and their properties6. The features of multi-core CPUs are as follows:

– shared memory for all cores;

– caches in each core for faster accesses to the main memory, offering cache-coherency over all cores;

– a high single-core performance;

– threads running different code (according to the multiple-instruction multiple-data (MIMD) paradigm).

Figure 1.4. Hardware architectures and properties. Figure is based on Groppe and Groppe (2020) and extended by the “Reasoning” and “Machine Learning” rows.

Many-core CPUs offer many more cores for a higher throughput, with the drawbacks of increased latency and lower single-thread performance. Apart from these differences, their architecture is similar to that of multi-core CPUs.

Modern GPUs are a special form of many-core systems with several thousand computing cores following the single-instruction multiple-data (SIMD) paradigm, that is, the same instruction is executed on different data on different cores at the same time. Hence, only certain parallel algorithms benefit from GPUs, such as those that enumerate all possibilities to find the best one (as in query optimization and multi-version concurrency control (MVCC)). GPUs also offer a high memory bandwidth, so they are best suited for parallel data-intensive algorithms that process different (disjoint) subsets of data in parallel, like join algorithms especially designed for GPUs (see, for example, Zhang et al. (2019), who deal with joins for SPARQL processing on GPUs).
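The enumeration pattern mentioned above can be sketched as follows: every candidate join order is costed by the same function, independently of all other candidates, which is the property that lets such a search map onto many cores executing the same instruction on different data. The sketch runs sequentially in plain Python; the cardinalities, the uniform selectivity and the cost function are simplifying assumptions.

```python
from itertools import permutations

# Hypothetical table cardinalities and a single uniform join selectivity.
CARDINALITY = {"A": 1000, "B": 50, "C": 200}
SELECTIVITY = 0.01

def plan_cost(order):
    """Sum of intermediate result sizes of a left-deep join order."""
    size = CARDINALITY[order[0]]
    cost = 0.0
    for table in order[1:]:
        size = size * CARDINALITY[table] * SELECTIVITY
        cost += size
    return cost

plans = list(permutations(CARDINALITY))
costs = list(map(plan_cost, plans))   # same function, different data
best_plan = plans[costs.index(min(costs))]
```

On a GPU, the `map` step would be the part executed in parallel, one candidate plan per thread, followed by a parallel minimum reduction.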

FPGAs offer the special feature of reconfigurable interconnects, which connect the programmable logic blocks on the FPGA with each other. As a direct consequence, FPGAs are ideally suited for data-flow-driven algorithms, such as processing an execution plan for evaluating queries in a streaming way without the block-wise materialization of intermediate results that is required on many-core CPUs and GPUs. FPGAs are so flexible that they can offer arbitrary types of parallelism. Werner et al. (2016) discuss the acceleration of SPARQL query processing via FPGAs, where scalable speedups are achieved even with increasingly large datasets. They also discuss the use of dynamic partial reconfiguration to exchange the FPGAs’ configurations at runtime in order to process different queries without stopping the system for a static reconfiguration, which is the approach taken in other contributions.
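As a software analogy for such streaming evaluation, the following sketch wires three operators of an execution plan into a pipeline using Python generators, so that each triple flows through all operators without any intermediate result being materialized. This is only an analogy under stated assumptions: the operators and the data are hypothetical, and generators are pull-based, whereas an FPGA pipeline pushes data through dedicated hardware stages.

```python
def scan(triples):
    """Source operator: emits triples one at a time."""
    for t in triples:
        yield t

def select(stream, predicate):
    """Filter operator: forwards only triples with the given predicate."""
    for s, p, o in stream:
        if p == predicate:
            yield (s, p, o)

def project_subject(stream):
    """Projection operator: keeps only the subject of each triple."""
    for s, _, _ in stream:
        yield s

data = [
    ("sensor1", "type", "Thermometer"),
    ("sensor1", "locatedIn", "room1"),
    ("sensor2", "type", "Thermometer"),
]
# Operators are wired into a pipeline; nothing runs until results are pulled.
pipeline = project_subject(select(scan(data), "type"))
first = next(pipeline)   # only the first input triple has been consumed so far
rest = list(pipeline)
```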

Universal quantum computing tries to combine the full power of classical computers with quantum computers that manipulate qubits in superposition by applying quantum logic gates. In comparison, quantum annealers – operating on up to several thousand qubits – only run special types of quantum algorithms to solve adiabatic (as a special form of combinatorial) optimization problems. Examples are traffic control7, selecting the execution plan with the best estimated costs (from a set of enumerated plans) (see Trummer and Koch (2016)) and concurrency control between transactions (see Roy et al. (2013)). Some algorithms for optimizing databases have been studied for both types of quantum computers, such as optimizing transaction schedules (see Bittner and Groppe (2020a, 2020b) for a quantum annealer solution and see Groppe and Groppe (2021) for a solution utilizing universal quantum computers).
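To illustrate how selecting the cheapest of a set of enumerated plans can be phrased as the kind of optimization problem an annealer accepts, the sketch below uses a generic one-hot penalty formulation over binary variables (a textbook-style construction, not the formulation of the cited works). The plan costs and the penalty weight are hypothetical, and a brute-force search stands in for the annealer.

```python
from itertools import product

plan_costs = [7.0, 3.0, 5.0]   # hypothetical estimated costs of enumerated plans
PENALTY = 10.0                 # chosen larger than any plan cost

def energy(x):
    """Quadratic objective: plan cost plus a penalty unless exactly one plan is chosen."""
    chosen = sum(x)
    return sum(c * xi for c, xi in zip(plan_costs, x)) + PENALTY * (chosen - 1) ** 2

# Brute force stands in for the annealer, which would sample low-energy states.
best_assignment = min(product((0, 1), repeat=len(plan_costs)), key=energy)
best_index = best_assignment.index(1)
```

Because the variables are binary, the squared penalty term expands into linear and pairwise terms only, which is what makes the objective quadratic and unconstrained.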

Clouds are designed for the dynamic allocation of resources like storage and computing according to users’ demands. Cloud databases are developed especially for the cloud environment. Because of the dynamic allocation of resources (for storing and computing), nodes frequently join and leave, which cloud databases must deal with, including the redistribution of data (when storage nodes join or leave) and ways to manage processing jobs on leaving nodes. In contrast to servers, which typically run on high-end hardware with redundant components, the (up to several thousand) nodes in clouds usually consist of commodity hardware, with the drawback of more frequent hardware and communication failures. The design of cloud computing architectures takes these failures into account by applying simple fault-tolerance mechanisms that repeat crashed jobs. Typical queries in databases are one-time queries, which are also supported by cloud systems such as Apache Spark and Apache Flink. In addition, these systems support data streams and the continuous queries of stream databases. Many Semantic Web databases are built on top of the different cloud technologies (see Groppe et al. (2014b) (HBase, Pig), Graux et al. (2016) (Spark) and Azzam et al. (2018) (Flink)). A few other contributions forgo the well-known technologies (see Groppe et al. (2014a)) in order to support more local joins without a huge redistribution of data for each new join.
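The simple fault-tolerance mechanism of repeating crashed jobs can be sketched in a few lines. The function names, the retry limit and the use of `RuntimeError` to stand in for a node or network failure are assumptions made for illustration; cloud frameworks implement this at the level of tasks and partitions with considerably more bookkeeping.

```python
def run_with_retries(job, max_attempts=3):
    """Re-run a job after a failure, up to max_attempts times."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return job(attempt)
        except RuntimeError as err:   # stands in for a node or network failure
            last_error = err
    raise RuntimeError(f"job failed after {max_attempts} attempts") from last_error

def flaky_job(attempt):
    """Hypothetical job that crashes on its first two attempts."""
    if attempt < 3:
        raise RuntimeError("node lost")
    return "done"

outcome = run_with_retries(flaky_job)
```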

New forms of clouds include the web cloud (see Groppe and Reimer (2019)), which supports easier and more ad hoc deployments of nodes: a user can just use their web browser to visit a certain web page in order to connect their computer to the web cloud. This promises a much larger number of potential nodes, as any computer running a browser may be connected to and integrated in the web cloud by any user worldwide. On the other hand, the nodes may be disconnected more often, which poses new challenges to the technologies for processing jobs in web clouds. As a consequence, data are processed within the browser and browser technologies must be used for data management purposes in so-called web cloud databases. New browser technologies can be utilized for data management purposes, such as WebAssembly (see Rossberg (2019)), which introduces a virtual machine for the browser and promises speedups for tasks processed in the browser compared to running JavaScript. Grall et al. (2017) deal with the first approaches to distributing SPARQL queries in a kind of web cloud.

Mobile databases (see Kumar (2006)) are a special form of databases that involve the technical infrastructure of mobile providers, such as base stations (which are near their connected mobile devices), in processing their tasks. Approaches tailored to the technical infrastructure of mobile providers try to overcome limitations of the mobile devices and promise to lower communication (and hence also energy) costs and to increase availability and durability (by logging at the base stations instead of on the mobile devices). Although some RDF stores, like in Le-Phuoc et al. (2010), are especially designed to run on mobile devices, they do not so far utilize the backend of mobile providers for advanced mobile processing.

Graffi et al. (2010) and Mietz et al. (2013) propose P2P databases, which utilize the features of peer-to-peer (P2P) networks to cope with the frequent joining and leaving of nodes for data storing and processing. P2P networks are designed for very frequent changes in their topology and do not differentiate between master and slave nodes, so that functionality is distributed equally. In comparison to cloud databases, and because of the frequent changes in the underlying topology due to the frequent disconnections of nodes, P2P databases apply a high redundancy in data storing and processing. Furthermore, the connected nodes are highly heterogeneous, which P2P databases must specifically deal with. Many approaches, like in Mietz et al. (2013), already propose semantic data processing in P2P networks. However, these approaches address ontology inference on a very rudimentary basis at most, and not at all for trigger and continuous queries (see Groppe (2020)).
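One way to picture redundant data storing under frequent disconnections is replica placement by rendezvous (highest-random-weight) hashing, sketched below. This technique is our choice for illustration and is not taken from the cited P2P databases; the peer names, the key and the replication factor are hypothetical.

```python
import hashlib

def score(peer, key):
    """Deterministic pseudo-random score for a (peer, key) pair."""
    return hashlib.sha256(f"{peer}:{key}".encode()).hexdigest()

def replica_peers(peers, key, k=3):
    """Choose the k peers with the highest scores (rendezvous hashing)."""
    return sorted(peers, key=lambda peer: score(peer, key), reverse=True)[:k]

peers = ["p1", "p2", "p3", "p4", "p5"]
holders = replica_peers(peers, "sensor42/readings")
# If one holder disconnects, the remaining holders keep their rank,
# so the data stays reachable and only one new replica is needed.
survivors = [p for p in peers if p != holders[0]]
new_holders = replica_peers(survivors, "sensor42/readings")
```

Since every peer can compute the same ranking locally, no master node is needed to decide where replicas live, which fits the equal distribution of functionality in P2P networks.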

The IoT often comes along with a large-scale installation of IoT devices together with a sufficient backend infrastructure to handle this large-scale installation. IoT databases (see, for example, ObjectBox Limited (2019)) are especially designed to offer data management functionalities in IoT environments. IoT databases often run in the cloud, and require a high bandwidth from the IoT devices to the cloud resources. Otherwise, there is a communication bottleneck hindering a good scalability of large-scale IoT environments with high velocity of data.

Fog computing (see Abdelshkour (2015)) aims to save communication by avoiding the route over the Internet backbone. For this purpose, fog computing utilizes near-things edge devices with higher capabilities for storing data and processing application logic. One possible drawback is that the near-things edge devices do not increase in number and capabilities in the same way as the IoT devices, because one near-things edge device typically handles a large number of IoT devices. Therefore, fog computing is not really scalable in the number of connected things.

Edge computing (see Garcia Lopez et al. (2015)) tackles the scalability issue by additionally utilizing all IoT devices for the storage of data and processing of application logic: with a larger deployment of IoT devices, more data needs to be stored and processed, which can be compensated with the larger number of available IoT devices.
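The scalability argument for fog versus edge computing can be checked with a small back-of-the-envelope calculation. The model below is deliberately crude and rests on assumptions that are ours: every device produces data at the same rate, load is spread evenly, and an IoT device counts as a full processing node, although in practice its capacity is far lower than that of a near-things edge device.

```python
def fog_load(num_devices, rate, num_fog_nodes):
    """Fog: a fixed set of near-things nodes absorbs the load of all devices."""
    return num_devices * rate / num_fog_nodes

def edge_load(num_devices, rate, num_fog_nodes):
    """Edge: the IoT devices additionally store and process, so capacity grows with load."""
    return num_devices * rate / (num_fog_nodes + num_devices)

# Load per processing node for a small and a large deployment.
loads = {n: (fog_load(n, 2.0, 4), edge_load(n, 2.0, 4)) for n in (100, 10_000)}
```

Under these assumptions the load per fog node grows linearly with the number of devices, whereas the load per node in the edge setting stays below the per-device rate regardless of the deployment size.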

Dew computing (see Skala et al. (2015) and Wang (2016)) addresses availability problems due to disconnections between the cloud and IoT devices. In the proposed architecture, an additional local server is placed near the IoT devices; it is responsible for tasks during downtimes and for synchronization with the cloud during uptimes.
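The role of the local server can be sketched as a buffer that always accepts readings and forwards them to the cloud only while the connection is up. The class, its attributes and the in-memory list standing in for the cloud store are hypothetical; conflict handling and persistence are omitted.

```python
class DewServer:
    """Hypothetical local server between IoT devices and the cloud."""

    def __init__(self):
        self.pending = []      # readings not yet synchronized
        self.cloud = []        # stands in for the remote cloud store
        self.cloud_up = True

    def store(self, reading):
        """Accept a reading at any time; forward it only if the cloud is reachable."""
        self.pending.append(reading)
        if self.cloud_up:
            self.sync()

    def sync(self):
        """Push all buffered readings to the cloud and clear the buffer."""
        self.cloud.extend(self.pending)
        self.pending.clear()

dew = DewServer()
dew.store({"t": 1, "temp": 20.5})
dew.cloud_up = False                 # downtime: readings stay local
dew.store({"t": 2, "temp": 20.7})
dew.store({"t": 3, "temp": 21.0})
buffered = len(dew.pending)
dew.cloud_up = True                  # uptime: synchronize
dew.sync()
```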

When studying the relevant literature, one may recognize that there are many contributions (see, for example, Mishra and Jain (2020)) introducing corresponding ontologies and targeting interoperability issues (see Cimmino et al. (2020)), but there are still only a few contributions to semantic IoT databases. It seems natural that IoT databases (especially those running on the fog or edge) are often organized as P2P databases. Another obvious choice is to apply dew computing principles. Hence, contributions to P2P networks processing Semantic Web data, like in Mietz et al. (2013), can be run as a kind of semantic IoT database. The distribution of data and processing tasks between cloud and IoT infrastructures, including the IoT devices, becomes one of the major challenges in this area. New challenges also arise when considering the requirements of processing data streams in an efficient way, especially when offering reasoning capabilities, which demand a high processing capacity. Furthermore, especially in high-velocity environments, it may be difficult to find the right trade-off: not storing all data, but storing enough for future analysis in case of failures.

1.4. Variations of SHM3P DBMS

Figure 1.5 introduces a taxonomy of different types of DBMS, considering whether they are multi-platform, multi-model, hybrid in operation (spanning multiple platforms) and whether they support semantic layers.

Figure 1.5. Taxonomy of different types of databases toward SHM3P databases.

We can classify various DBMSs according to this taxonomy: for example, according to Groppe and Groppe (2020), MySQL is an M3P database running on different platforms (server and cloud, but not on completely different platforms like IoT) and supporting the relational, key/value and object data models. In contrast, luposdate3000 (see Warnke et al. (2021) and Groppe et al. (2021a,b)) is an SMP database running on parallel servers, in the browser and soon on IoT platforms, supporting only the semantic data model. We plan to develop luposdate3000 further into a full-fledged SHM3P DBMS in our future work.
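The classification along the four dimensions of the taxonomy can be written down as a small function. The acronym scheme used here is an assumption inferred from the labels that appear in the text (M3P, SMP, HM3P, SHM3P); the labels "MP" and "MM" for the remaining combinations are our own placeholders and may differ from those in Figure 1.5.

```python
def classify(multi_platform, multi_model, hybrid, semantic):
    """Build an acronym from the four taxonomy dimensions (naming scheme assumed)."""
    if multi_model and multi_platform:
        core = "M3P"
    elif multi_platform:
        core = "MP"
    elif multi_model:
        core = "MM"
    else:
        core = ""
    prefix = ("S" if semantic else "") + ("H" if hybrid else "")
    return prefix + core

# The two examples from the text, encoded with the flags described there.
mysql = classify(multi_platform=True, multi_model=True, hybrid=False, semantic=False)
luposdate3000 = classify(multi_platform=True, multi_model=False, hybrid=False, semantic=True)
target = classify(multi_platform=True, multi_model=True, hybrid=True, semantic=True)
```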

Transforming a database in any of these directions (from supporting a single platform through multiple platforms to hybrid platform support, from a single to multiple data models, and from no semantic layer to a semantic layer) is challenging and requires huge development efforts. However, it will offer many more possibilities for installations (after supporting more platforms) and for the user (especially after supporting more data models, and also more platforms when considering additional platform features).

1.5. What are the benefits of SHM3P databases for IoT?

We identify benefits of future SHM3P databases for IoT in the areas of data storage and placement, data processing and applications.

1.5.1. Data storage and placement