A guide to using a modern, cloud-native, and distributed SQL database for your data-intensive apps
Kishen Das Kondabagilu Rajanna
BIRMINGHAM—MUMBAI
Copyright © 2022 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author(s), nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Sunith Shetty
Senior Editors: Roshan Kumar, Nazia Shaikh
Content Development Editor: Tazeen Shaikh
Technical Editor: Rahul Limbachiya
Copy Editor: Safis Editing
Project Coordinator: Aparna Ravikumar Nair
Proofreader: Safis Editing
Indexer: Pratik Shirodkar
Production Designer: Jyoti Chauhan
Marketing Coordinator: Priyanka Mhatre
First published: March 2022
Production reference: 1090322
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80056-065-9
www.packt.com
During my childhood, I spent more time watching insects than interacting with human beings. So, I dedicate this book to cockroaches and all the other insects that have fascinated me throughout my life.
Kishen Das Kondabagilu Rajanna is currently working as a distributed query engineer at Adobe. His previous experience includes leading the data warehouse team at Cloudera, managing the SaaS platform at Rubrik, writing core services for the Oracle public cloud, and managing data infrastructure at Apple.
Nadir Doctor is a database and data warehousing architect and DBA who has worked in various industries with multiple OLTP and OLAP technologies, as well as major data platforms, including CockroachDB, Snowflake, Databricks, DataStax, Cassandra, ScyllaDB, Redis, MS SQL Server, Oracle, Db2 Cloud, AWS, Azure, and GCP. A major focus of his is health check scripting for security, high availability, performance optimization, cost reduction, and operational excellence. He has presented at several technical conferences, is active in user groups, and can be reached on LinkedIn.
Thank you to Kishen and all the staff at Packt. I'm grateful for the immense support of my loving wife, children, and family during the technical review of this book. I hope that you all find the content enjoyable, inspiring, and useful.
Scott Ling is a technology specialist with over 30 years' experience working at the forefront of technology in various roles, in companies ranging from start-ups to listed companies worth more than 10 billion, with a focus on distributed technologies, software as a service, and product/project management. He is currently working on a free product designed to make it easy for anyone to create and manage a SaaS product, service, or business, as his way of giving back to the community. Scott is also an established technical author with a bestselling book on .NET published back in 2001, and he has worked with authors and publishers on various books and technologies over the years.
This book will introduce you to the inner workings of CockroachDB and help you understand how it provides faster access to distributed data through a SQL interface. You'll learn how you can use the database to provide solutions that require data to be highly available.
Starting with CockroachDB's installation, setup, and configuration, this book will familiarize you with the database architecture and database design principles. You'll then discover several options that CockroachDB provides to store multiple copies of your data to ensure fast data access. The book covers the internals of CockroachDB, how to deploy and manage it on the cloud, performance tuning to get the best out of CockroachDB, and how to scale data across continents and serve it locally. In addition to this, you'll get to grips with fault tolerance and auto-rebalancing, how indexes work, and the CockroachDB Admin UI. The book will guide you in building scalable cloud services on top of CockroachDB, covering administrative and security aspects and tips for troubleshooting, performance enhancements, and a brief guideline on migrating from traditional databases.
By the end of this book, you'll have enough knowledge to manage your data on CockroachDB and interact with it from your application layer.
Software engineers, database developers, database administrators, and anyone who wishes to learn about the features of CockroachDB and how to build fast, highly available database solutions for business-critical applications will find this book useful. Although no prior exposure to CockroachDB is required, familiarity with database concepts will help you to get the most out of this book.
Chapter 1, CockroachDB – A Brief Introduction, talks about databases and how they have evolved over time. You will also get to know about the high-level architecture of CockroachDB.
Chapter 2, How Does CockroachDB Work Internally?, explores various layers of CockroachDB and some of its inner workings.
Chapter 3, Atomicity, Consistency, Isolation, and Durability (ACID), introduces you to ACID properties and how they are implemented in CockroachDB.
Chapter 4, Geo-Partitioning, explains the concept of geo-partitioning, why we need it, and the various options for geographically distributing data using CockroachDB.
Chapter 5, Fault Tolerance and Auto-Rebalancing, explores the concept of fault tolerance and auto-recovery strategies. It also covers a few experiments to understand these concepts better.
Chapter 6, How Indexes Work in CockroachDB, is all about database indexes, how they are useful in improving query performance, the different types of indexes that are supported in CockroachDB, and some of the best practices that you can follow while using indexes.
Chapter 7, Schema Creation and Management, introduces you to SQL syntax for DDL, DML, and DQL with examples, the different data types available in CockroachDB, sequences, and how to manage schema changes.
Chapter 8, Exploring the Admin User Interface, explores the admin user interface that comes by default when you deploy a CockroachDB cluster. We examine all the metrics and other information that are available in the user interface and how they are useful in troubleshooting issues.
Chapter 9, An Overview of Security Aspects, touches upon the key security aspects that you have to pay attention to when using CockroachDB. We also learn about authentication, authorization, how to protect data with encryption while at rest and in flight, and how to achieve desired data protection by defining the correct strategies for RTO and RPO. Network security and some security best practices are also covered.
Chapter 10, Troubleshooting Issues, familiarizes you with troubleshooting by collecting logs, looking at key metrics, understanding and tracking slow queries, and integrating logs with external sinks. This chapter also covers advanced debugging options at the end.
Chapter 11, Performance Benchmarking and Migration, discusses performance, what the key indicators of performance are, and how to measure them. You will also learn about migrating from other traditional databases to CockroachDB.
Appendix: Bibliography and Additional Resources, provides additional resource material that you can go through to become more familiar with CockroachDB.
You should have access to the internet so that you can download CockroachDB and try it on your laptop.
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book's GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Getting-Started-with-CockroachDB. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781800560659_ColorImages.pdf.
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in the text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Here, in the SELECT query, you should use AS OF SYSTEM TIME follower_read_timestamp()."
A block of code is set as follows:
SHOW TABLES
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
DROP DATABASE <DATABASE_NAME>
DROP ROLE <ROLE_NAME>
DROP TABLE <TABLE_NAME>
Any command-line input or output is written as follows:
$ cockroach cert create-client <user_name> \
--certs-dir=<certs_directory> \
--ca-key=<CA_key_directory>
Bold: Indicates a new term, an important word, or words that you see on screen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "The Sessions dashboard gives information about all the active client sessions within the CockroachDB cluster."
Tips or Important Notes
Appear like this.
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Once you've read Getting Started with CockroachDB, we'd love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.
In this section, we will provide a brief introduction to CockroachDB and the motivation behind creating this new database, as well as go into its overall architecture and design concepts.
This section comprises the following chapters:
Chapter 1, CockroachDB – A Brief Introduction

Chapter 2, How Does CockroachDB Work Internally?

In this chapter, we will go over the history of databases, where we will learn about the evolution of SQL, NoSQL, and NewSQL databases, various relational models, different categories for classifying databases, and timelines. Later, we will discuss the CAP theorem. Finally, we will briefly discuss the motivation for creating a new database and learn about the basic architecture of CockroachDB.
The following topics will be covered in this chapter:
The history and evolution of databases

Database concepts

CAP theorem

CockroachDB

A database is a collection of data that can be organized, managed, modified, and retrieved using a computer. The system that helps with managing data in a database is called a database management system (DBMS).
In the 1950s and 1960s, several advancements were made in terms of processors, storage, memory, and networks. We also had our first programming languages, COBOL and FORTRAN. The development of hard disk drives for data storage further spurred the development of databases. Around the same time, the first notion of a modern-day computer with a mouse and graphical user interface came into existence, making it easy for the general public to consume it. In this section, we will discuss how various types of databases evolved.
The first database was designed by Charles William Bachman III, an American computer scientist. In 1963, he developed the Integrated Data Store (IDS), which gave rise to the concept of the navigational database. In navigational databases, we can find records by chasing references from other objects. For example, let's say that in a school database, you want to find all the students from a specific grade in a specific school. In a navigational database, first, you have to go to the group of students that belong to a particular school and then to the group that belongs to a particular grade. So, records are accessed by hierarchical navigation. Based on IDS, Bachman later contributed to the CODASYL database model in 1969. CODASYL stands for Conference/Committee on Data Systems Languages, a consortium formed to guide the development of programming languages. Around the same time, IBM developed the Information Management System (IMS), which was based on the hierarchical database model. A hierarchical database model is a data model in which the data is organized in a tree-like structure. In 1970, Edgar F. Codd, an IBM employee, published the relational model of data. Building on Codd's work, Donald D. Chamberlin and Raymond F. Boyce developed Structured Query Language (SQL) in the early 1970s. They initially called it Structured English Query Language (SEQUEL), and it was used in System R, an experimental relational database developed by a group at the IBM San Jose Research Laboratory. In 1976, QUEL, a relational database query language designed by Michael Stonebraker, was developed as part of the Interactive Graphics Retrieval System (INGRES) database management system at the University of California, Berkeley.
Based on QUEL and SQL, several databases were implemented. Some of the most prominent ones include Post Ingres (Postgres), Sybase, Microsoft SQL, IBM DB2, Oracle, MariaDB, and MySQL.
In the 1980s, object-oriented database management systems (OODBMSes) grew in popularity. In an OODBMS, information is represented as objects, rather than as rows in the tables of a relational database. Some of the important ones include GemStone/S, Objectivity/DB, InterSystems Caché, Perst, ZODB, Wakanda, ObjectDB, ODABA, and Realm.
The concept of non-SQL or non-relational databases has existed since the 1960s, but the term NoSQL has become much more popular in the last decade. NoSQL databases focus on performance and scaling and mostly rely on a non-relational data model, such as a document, key-value, wide-column, or graph model, to organize the data. Some of the most popular ones in this category include Cassandra, MongoDB, Couchbase, Dynamo, FoundationDB, Neo4j, and HBase.
With the introduction of on-demand compute, storage, and network resources and the pay-as-you-go model, collectively known as cloud computing, the amount of data that we collect, process, manage, and analyze has been growing exponentially. Although it was relatively easy for some NoSQL databases to adapt to the cloud, it is much harder for traditional SQL databases to do so. Many of them are better suited to vertical scaling and did not consider geographically distributed data, the shared-nothing architecture, or enormous scale as part of their initial design. This created a void: we needed SQL databases that are cloud-native, scale well with data growth, and are easy to manage. Many companies developed in-house solutions on top of existing SQL databases:
Facebook developed TAO, a NoSQL graph API built on top of sharded MySQL.

YouTube developed Vitess to easily scale and manage MySQL clusters.

Dropbox developed Edgestore, a metadata store that powers its services and products, which again was built on top of MySQL.

Greenplum developed a massively parallel data platform of the same name for analytics, machine learning, and AI on top of Postgres.

However, it was still relatively hard and painful to manage the data, as the underlying databases were not built to scale.
In 2012, Google published a seminal paper on Google Spanner, a globally distributed database service and storage solution. Spanner essentially combined the important features of SQL databases, such as ACID transactions, strongly consistent reads, and the SQL interface, with features that were previously only available in NoSQL databases, such as scaling across geographical locations, multi-site replication, and failover. It created a new category of databases called NewSQL, which is meant to indicate a combination of SQL features at NoSQL scale. YugabyteDB and CockroachDB were developed later, both of them inspired by Google Spanner.
In this section, we will learn about some of the core database concepts, including cardinality, database models, and various processing models.
Before we discuss database models, it is important to know about cardinality. Cardinality refers to the relationship between two entities or tables. The most common kinds are one-to-one, one-to-many, and many-to-many.
In the case of a one-to-one relationship, a row or entry in one entity or table can be related to only one row in another entity or table. For example, in a Department of Motor Vehicles database, let's say there are two tables called License Info and Driver Info, as shown in the following diagram:
Figure 1.1 – An example of a one-to-one relationship
Here, Driver ID can only be assigned to one driver as it has to uniquely identify a driver. Also, a driver can only be assigned one Driver ID. So, here, any row in the License Info table will be associated with a specific row in the Driver Info table.
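One way to sketch this one-to-one relationship in SQL is with a UNIQUE foreign key; the following is an illustrative example (the table and column names are assumptions, not taken from the book's sample code):

```sql
-- Each license row points at exactly one driver, and the
-- UNIQUE constraint stops a driver from having two licenses.
CREATE TABLE driver_info (
    driver_id INT PRIMARY KEY,
    name      STRING
);

CREATE TABLE license_info (
    license_id INT PRIMARY KEY,
    driver_id  INT UNIQUE NOT NULL REFERENCES driver_info (driver_id)
);
```

With this schema, inserting a second license row for the same driver_id violates the UNIQUE constraint, which is what enforces the one-to-one shape.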
In a one-to-many relationship, a single row from one entity or table can be associated with multiple rows in another entity or table.
For example, let's consider the Driver Info and City Info tables shown in the following diagram:
Figure 1.2 – An example of a one-to-many relationship
Here, for every row in City Info, there will be multiple rows in Driver Info, as there can be many drivers that live in a particular city.
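In SQL, a one-to-many relationship is usually modeled with a plain (non-unique) foreign key on the "many" side; here is a hedged sketch with illustrative names, not the book's sample code:

```sql
-- Many driver rows may share the same city_id,
-- but each driver row references exactly one city.
CREATE TABLE city_info (
    city_id   INT PRIMARY KEY,
    city_name STRING
);

CREATE TABLE driver_residence (
    driver_id INT PRIMARY KEY,
    name      STRING,
    city_id   INT NOT NULL REFERENCES city_info (city_id)
);
```

Because city_id in driver_residence is not unique, any number of drivers can point at the same city row.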
In a many-to-many relationship, a single row in one entity or table can be associated with multiple rows in another entity or table and vice versa.
For example, let's consider two tables: Vehicle Ownership History, where we are maintaining the history of ownership of a given vehicle, and Driver Ownership History, where we are maintaining the history of vehicles owned by a given driver:
Figure 1.3 – An example of a many-to-many relationship
Here, a driver can own multiple vehicles and a vehicle can have multiple owners over time. So, a given row in the Vehicle Ownership History table can be associated with multiple rows in the Driver Ownership History table. Similarly, a given row in the Driver Ownership History table can be associated with multiple rows in the Vehicle Ownership History table.
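A common way to express a many-to-many relationship in SQL is a junction table whose rows pair up the two sides; the following sketch uses illustrative names rather than the exact tables from the figure:

```sql
-- One driver_id can appear with many vehicle_ids, and one
-- vehicle_id can appear with many driver_ids over time.
CREATE TABLE ownership_history (
    driver_id  INT,
    vehicle_id INT,
    owned_from DATE,
    owned_to   DATE,
    PRIMARY KEY (driver_id, vehicle_id, owned_from)
);
```

Querying this one table in either direction (by driver_id or by vehicle_id) answers both "which vehicles has this driver owned?" and "who has owned this vehicle?".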
Now, let's take a look at some of the most important database models.
A database model determines how the data is stored, organized, and modified. Databases are typically implemented based on a specific data model. It is also possible to borrow concepts from multiple database models when you are designing a new database. The relational database model happens to be the most widely known and has been popularized by databases such as Oracle, IBM DB2, and MySQL.
In the hierarchical database model, the data is organized in the form of a tree. There is a root at the first level and multiple children at the subsequent levels. Since a single parent can have multiple children, one-to-many relationships can easily be represented here. However, a child cannot have multiple parents, so many-to-many relationships cannot be modeled in a hierarchical database.
IBM's Information Management System (IMS) was the first database that implemented this data model.
The following diagram shows an example of a hierarchical database model:
Figure 1.4 – An example of a hierarchical database model
Typically, the tree starts with a single root and the data is organized into this tree. Any node except the leaves can have multiple children, but a child can have only one parent.
The network model was developed as an enhancement of the hierarchical database model to accommodate many-to-many relationships. The network model relies on a graph structure to organize its data. So, there is no concept of a single root; a child can have multiple parents, and a parent can have multiple children. Integrated Data Store (IDS), the Integrated Database Management System (IDMS), and Raima Database Manager (RDM) are some of the popular databases that use the network model.
As shown in the following diagram, there is no single root, and a given child (for example, Object 2) can have multiple parents (that is, Object 1 and Object 3):
Figure 1.5 – An example of a network model
Although the network model was an improvement over the hierarchical model, it was still a little restrictive when it came to representing data. In the relational model, any record can have a relationship with any other with the help of a common field. This drastically reduced the design's complexity and made it easier to independently add, update, and access records, without having to walk down the tree or traverse the graph. SQL was combined with the relational database model to provide a simple query interface to add and retrieve data.
All the popular traditional databases such as Oracle database, IBM DB2, MySQL, MariaDB, and Microsoft SQL Server implement relational data models.
Let's look at two tables called Employee and Employee Info:
Figure 1.6 – Employee tables showing the column names
Here, Employee ID is the common field or column between the Employee and Employee Info tables. The Employee table is responsible for ensuring that a given Employee ID is unique, while Employee Info is responsible for more detailed information about a given employee.
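In practice, the common Employee ID column is what lets a query combine the two tables with a join; a minimal sketch, assuming lowercase table names and illustrative detail columns:

```sql
-- Pull basic and detailed employee data together
-- via the shared employee_id column.
SELECT e.employee_id, e.name, i.department, i.salary
FROM employee AS e
JOIN employee_info AS i
  ON e.employee_id = i.employee_id;
```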
The object-relational model, as the name suggests, combines the best of the relational and object data models. The concepts of objects, classes, and inheritance are directly supported as first-class citizens in the database and in its queries. SQL:1999, the fourth revision of the SQL standard, introduced several features for embedding object concepts into the relational database. One of the main features was the ability to create structured user-defined types with CREATE TYPE to define an object's structure.
Over time, relational databases have added more support for objects. There is a varying degree of support for object concepts in Oracle database, IBM DB2, PostgreSQL, and Microsoft SQL Server.
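As a sketch of the SQL:1999 idea, the PostgreSQL flavor of CREATE TYPE looks like the following; this is illustrative, and the exact degree of support (including in CockroachDB) varies by database and version:

```sql
-- Define a structured type, then use it as a column type.
CREATE TYPE address AS (
    street TEXT,
    city   TEXT,
    zip    TEXT
);

CREATE TABLE customer (
    customer_id  INT PRIMARY KEY,
    home_address address
);
```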
Given the scope of this book, we will not discuss the entity-relational model, object model, document model, star schema, snowflake schema, and many other less well-known models.
Now, let's look at how databases can be classified based on what kinds of workload they can be used for.
Based on how you want to consume and process data, databases can be categorized into four different processing systems. Let's take a look.
Online transaction processing (OLTP) systems support the concept of transactions. A transaction refers to the ability to atomically apply a set of changes (inserts, updates, deletes, and reads) to a given system. One popular example is a bank, where withdrawing money from or depositing money into a given bank account must be done atomically to ensure data is not lost or made incorrect. So, the main purpose here is to maintain data integrity and consistency. These systems are generally suited for fast-running queries.
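The bank example can be sketched as a single transaction; the account table, column names, and amounts here are illustrative:

```sql
-- Transfer 100 between two accounts; either both UPDATE
-- statements take effect, or neither does.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
COMMIT;
```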
Online analytical processing (OLAP) focuses mostly on running queries to analyze multi-dimensional data and to extract intelligence or patterns from it. Typically, such systems support generating reports that can be used for marketing, sales, finance, budgeting, management, and more. Data mining and data analytics applications typically rely on an OLAP system in some form. OLAP doesn't deal with transactions; the emphasis is more on analyzing large amounts of data from different sources to extract business intelligence. Some databases also provide built-in support for MapReduce to run queries across large datasets.
A data warehouse is a system used for reporting and data analysis. Warehouses are typically developed for OLAP workloads. It is also very common to retrieve data from OLTP systems in batches or bulk, run it through an Extract, Load, and Transform (ELT) or Extract, Transform, and Load (ETL) data transformation pipeline, and store it in an OLAP system.
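A batch ELT hand-off from an OLTP store into a warehouse can be sketched as follows; the schemas, table names, and date filter are assumptions for illustration only:

```sql
-- Extract and load raw rows in bulk, then transform
-- inside the warehouse with an aggregate query.
INSERT INTO warehouse.raw_orders
SELECT * FROM oltp.orders WHERE created_at >= '2022-01-01';

INSERT INTO warehouse.daily_sales (order_date, revenue)
SELECT order_date, SUM(total)
FROM warehouse.raw_orders
GROUP BY order_date;
```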
Online event processing (OLEP) guarantees strong consistency without the traditional atomic commit protocols or distributed locking. OLEP also focuses on high performance, larger scale, and fault tolerance.
As the name suggests, this