Description

Over 100 effective recipes to help you design, implement, manage, and troubleshoot the software-defined and massively scalable Ceph storage system.

About This Book

  • Implement a Ceph cluster successfully and learn to manage it
  • Take a recipe-based approach to learning the most efficient software-defined storage system
  • Implement best practices for improving the efficiency and security of your storage cluster
  • Learn to troubleshoot common issues experienced in a Ceph cluster

Who This Book Is For

This book is targeted at storage and cloud engineers, system administrators, and anyone interested in building software-defined storage to power their cloud or virtual infrastructure.

If you have basic knowledge of GNU/Linux and storage systems, with no experience of software-defined storage solutions or Ceph, but are eager to learn, then this book is for you.

What You Will Learn

  • Understand, install, configure, and manage the Ceph storage system
  • Get to grips with performance tuning and benchmarking, and learn practical tips to help run Ceph in production
  • Integrate Ceph with OpenStack Cinder, Glance, and Nova components
  • Deep dive into Ceph object storage, including S3, Swift, and Keystone integration
  • Configure a disaster recovery solution with a Ceph Multi-Site V2 gateway setup and RADOS Block Device mirroring
  • Gain hands-on experience with Ceph Metrics and VSM for cluster monitoring
  • Familiarize yourself with Ceph operations such as maintenance, monitoring, and troubleshooting
  • Understand advanced topics including erasure-coding, CRUSH map, cache pool, and general Ceph cluster maintenance

In Detail

Ceph is a unified, distributed storage system designed for reliability and scalability. This technology has been transforming the software-defined storage industry and is evolving rapidly as a leader, with wide-ranging support for popular cloud platforms such as OpenStack and CloudStack, as well as for virtualized platforms. Ceph is backed by Red Hat, has been developed by a community of developers, and has gained immense traction in recent years.

This book will guide you right from the basics of Ceph, such as creating block, object, and filesystem access, to advanced concepts such as cloud integration solutions. The book also covers practical, easy-to-implement recipes on CephFS, RGW, and RBD with respect to the major stable release of Ceph, Jewel. Towards the end of the book, recipes on troubleshooting and best practices will help you get to grips with managing Ceph storage in a production environment.

By the end of this book, you will have practical, hands-on experience of using Ceph efficiently for your storage requirements.

Style and approach

This step-by-step guide is filled with practical tutorials, making complex scenarios easy to understand.




Ceph Cookbook

Second Edition


Practical recipes to design, implement, operate, and manage Ceph storage systems


Vikhyat Umrao
Michael Hackett
Karan Singh


BIRMINGHAM - MUMBAI

 

Ceph Cookbook

Second Edition

 

Copyright © 2017 Packt Publishing

 

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

 

First published: February 2016

Second edition: November 2017

Production reference: 1221117

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.

ISBN 978-1-78839-106-1

 

www.packtpub.com

Credits

Authors

Vikhyat Umrao

Michael Hackett

Karan Singh

Copy Editors

Safis Editing

Juliana Nair

 

Reviewer

Álvaro Soto

Project Coordinator

Judie Jose

Commissioning Editor

Gebin George

Proofreader

Safis Editing

Acquisition Editor

Shrilekha Inani

Indexer

Pratik Shirodkar

Content Development Editor

Nikita Pawar

Graphics

Tania Dutta

Technical Editor

Mohd Riyan Khan

Production Coordinator

Deepika Naik

Disclaimer

The views expressed in this book are those of the authors and not of Red Hat.

Foreword

Since Ceph's inception and posting to GitHub in 2008, Sage Weil's creation has grown from an individual idea to a successful open source project with over 87,000 commits from almost 900 different contributors from dozens of companies. Ceph was originally incubated as a skunkworks project inside DreamHost, an earlier Sage startup focused on web hosting, and in 2012 it was spun off into a new company, Inktank, dedicated exclusively to the development and support of Ceph. Inktank's reputation was built upon the stellar customer support its employees provided, from on-site installation and configuration, to highly complex bug troubleshooting, through patch development and ultimate problem resolution. The DNA of the company was a dedication to customer success, even when it required a senior developer to join a customer support teleconference at short notice, or Sage himself to remotely log in to a server and assist in diagnosing the root cause of an issue.

This focus on the customer was elevated even further after Inktank’s acquisition by Red Hat in 2014. If Red Hat is known for anything, it’s for making CIOs comfortable with open source software. Members of Red Hat’s customer experience and engagement team are some of the most talented individuals I’ve had the pleasure to work with. They possess the unique ability to blend the technical troubleshooting skills necessary to support a complex distributed storage system with the soft relational skills required to be on the front lines engaging with a customer, who in many cases is in an extremely stressful situation where production clusters are out of operation.

The authors of this book are two of the finest exemplars of this unique crossing of streams. Inside this work, Vikhyat and Michael share some of their hard-earned best practices in successfully installing, configuring, and supporting a Ceph cluster. Vikhyat has 9 years of experience providing sustaining engineering support for distributed storage products with a focus on Ceph for over 3 years. Michael has been working in the storage industry for over 12 years and has been focused on Ceph for close to 3 years. They both have the uncommon ability to calmly work through complex customer escalations, providing a first class experience with Ceph under even the most stressful of situations. Between the two of them, you’re in good hands—the ones that have seen some of the hairiest, most difficult-to-diagnose problems and have come out the other side to share their hard-earned wisdom with you.

If there’s been one frequent critique of Ceph over the years, it’s been that it’s too complex for a typical administrator to work with. Our hope is that the ones who might be intimidated by the thought of setting up their first Ceph cluster will find comfort and gain confidence from reading this book. After all, there’s no time like the present to start playing with The Future of Storage™. :-)

Ian R. Colle

Global Director of Software Engineering, Red Hat Ceph Storage

About the Authors

Vikhyat Umrao has 9 years of experience with distributed storage products as a sustaining engineer, and for the last couple of years he has been working on software-defined storage technology, with specific expertise in Ceph unified storage. He has been working on Ceph for over 3 years now, and in his current position at Red Hat he focuses on the support and development of Ceph, solving Red Hat Ceph Storage customer issues and upstream reported issues.

He is based in the Greater Boston area, where he is a principal software maintenance engineer for Red Hat Ceph Storage. Vikhyat lives with his wife, Pratima, and he likes to explore new places.

I'd like to thank my wife, Pratima, for keeping me motivated so that I could write this book, and for giving up the time with her that went into it. I would also like to thank my family (mom and dad) and both my sisters (Khyati and Pragati) for their love, support, and belief that I would do better in my life.
 
I would also like to thank Red Hat for giving me an opportunity to work on such a wonderful storage product, and my colleagues who make a great team. I would like to thank the Ceph community and developers for constantly developing, improving, and supporting Ceph.
 
I would like to thank Michael—my friend, colleague, and co-author of this book—for making a great team for this book as we make a great team at Red Hat. I would also like to thank Karan, who wrote the first version of this book, for giving us the base for version two.
 
Finally, I would like to thank the entire team at Packt and a team of technical reviewers for giving us this opportunity, and the hard work they put in beside us while we wrote this book.

Michael Hackett is a storage and SAN expert in customer support. He has been working on Ceph and storage-related products for over 12 years. Apart from this, he holds several storage and SAN-based certifications, and prides himself on his ability to troubleshoot and adapt to new complex issues.

Michael is currently working at Red Hat, based in Massachusetts, where he is a principal software maintenance engineer for Red Hat Ceph and the technical product lead for the global Ceph team.

Michael lives in Massachusetts with his wife, Nicole, his two sons, and their dog. He is an avid sports fan and enjoys time with his family.

I'd like to thank my wife, Nicole, for putting up with some long nights while I was working on this book and for understanding the hard work and dedication that a role in the software industry requires. I would also like to thank my two young sons for teaching me more about patience and myself than I ever thought was possible (love you more than you will ever know, L and C). Also, thank you to my family (mom and dad) and in-laws (Gary and Debbie) for the constant and continued support in everything I do. I would also like to thank Red Hat and my colleagues, who truly make each day of work a pleasure. Working at Red Hat is a passion, and the drive to make software better is within every Red Hatter. I would like to thank the Ceph community and the developers for constantly developing, improving, and supporting Ceph. Finally, I would like to thank the entire team at Packt and the team of technical reviewers for giving us this opportunity, and for the hard work they put in beside us while we wrote this book.

Karan Singh devotes part of his time to learning emerging technologies and enjoys the challenges that come with them. He loves tech writing and is an avid blogger. He also authored the first editions of Learning Ceph and Ceph Cookbook, both published by Packt Publishing. You can reach him on Twitter at @karansingh010.

I’d like to thank my wife, Monika, for giving me wings to fly.  I’d also like to thank my employer, Red Hat, for giving me an opportunity to work on cutting-edge technologies and to be a part of the world class team I work with. Finally, special thanks to Vikhyat and Michael for putting great effort into the continued success of Ceph Cookbook.

About the Reviewer

Álvaro Soto is a cloud and open source enthusiast. He was born in Chile, but he has been living in Mexico for more than 10 years now. He is an active member of OpenStack and Ceph communities in Mexico. He holds an engineering degree in computer science from Instituto Politécnico Nacional (IPN México) and he is on his way to getting a master's degree in computer science at Instituto Tecnológico Autónomo de México (ITAM).

Álvaro currently works as a Ceph consultant at Sentinel.la, architecting, implementing, and performance-tuning Ceph clusters, and working on data migration and new ways to adopt Ceph solutions.

He enjoys reading papers and books about distributed systems, automation, Linux, and security. You can always contact him by email at [email protected], on IRC using the nickname khyr0n, or on Twitter @alsotoes.

I would like to thank Haidee, my partner in life, and Nico for his affection and pleasant company while reviewing this book, as well as the guys at Sentinel.la for their advice and work for the OpenStack and Ceph communities.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

Customer Feedback

Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1788391063.

If you'd like to join our team of regular reviewers, you can email us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!

Table of Contents

Preface

What this book covers

What you need for this book

Who this book is for

Sections

Getting ready

How to do it…

How it works…

There's more…

See also

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

Ceph – Introduction and Beyond

Introduction

Ceph – the beginning of a new era

Software-defined storage – SDS

Cloud storage

Unified next-generation storage architecture

RAID – the end of an era

RAID rebuilds are painful

RAID spare disks increases TCO

RAID can be expensive and hardware dependent

The growing RAID group is a challenge

The RAID reliability model is no longer promising

Ceph – the architectural overview

Planning a Ceph deployment

Setting up a virtual infrastructure

Getting ready

How to do it...

Installing and configuring Ceph

Creating the Ceph cluster on ceph-node1

How to do it...

Scaling up your Ceph cluster

How to do it…

Using the Ceph cluster with a hands-on approach

How to do it...

Working with Ceph Block Device

Introduction

Configuring Ceph client

 How to do it...

Creating Ceph Block Device

How to do it...

Mapping Ceph Block Device

How to do it...

Resizing Ceph RBD

How to do it...

Working with RBD snapshots

How to do it...

Working with RBD clones

How to do it...

Disaster recovery replication using RBD mirroring

How to do it...

Configuring pools for RBD mirroring with one way replication

How to do it...

Configuring image mirroring

How to do it...

Configuring two-way mirroring

How to do it...

See also

Recovering from a disaster!

How to do it...

Working with Ceph and OpenStack

Introduction

Ceph – the best match for OpenStack

Setting up OpenStack

How to do it...

Configuring OpenStack as Ceph clients

How to do it...

Configuring Glance for Ceph backend

How to do it…

Configuring Cinder for Ceph backend

How to do it...

Configuring Nova to boot instances from Ceph RBD

How to do it…

Configuring Nova to attach Ceph RBD

How to do it...

Working with Ceph Object Storage

Introduction

Understanding Ceph object storage

RADOS Gateway standard setup, installation, and configuration

Setting up the RADOS Gateway node

How to do it…

Installing and configuring the RADOS Gateway

How to do it…

Creating the radosgw user

How to do it…

See also…

Accessing the Ceph object storage using S3 API

How to do it…

Configuring DNS

Configuring the s3cmd client

Configure the S3 client (s3cmd) on client-node1

Accessing the Ceph object storage using the Swift API

How to do it...

Integrating RADOS Gateway with OpenStack Keystone

How to do it...

Integrating RADOS Gateway with Hadoop S3A plugin 

How to do it...

Working with Ceph Object Storage Multi-Site v2

Introduction

Functional changes from Hammer federated configuration

RGW multi-site v2 requirement

Installing the Ceph RGW multi-site v2 environment 

How to do it...

Configuring Ceph RGW multi-site v2

How to do it...

Configuring a master zone

Configuring a secondary zone

Checking the synchronization status 

Testing user, bucket, and object sync between master and secondary sites

How to do it...

Working with the Ceph Filesystem

Introduction

Understanding the Ceph Filesystem and MDS

 Deploying Ceph MDS

How to do it...

Accessing Ceph FS through kernel driver

How to do it...

Accessing Ceph FS through FUSE client

How to do it...

Exporting the Ceph Filesystem as NFS

How to do it...

Ceph FS – a drop-in replacement for HDFS

Monitoring Ceph Clusters

Introduction

Monitoring Ceph clusters – the classic way

How to do it...

Checking the cluster's health

Monitoring cluster events

The cluster utilization statistics

Checking the cluster's status

The cluster authentication entries

Monitoring Ceph MON

How to do it...

Checking the MON status

Checking the MON quorum status

Monitoring Ceph OSDs

How to do it...

OSD tree view

OSD statistics

Checking the CRUSH map

Monitoring PGs

Monitoring Ceph MDS

How to do it...

Introducing Ceph Metrics and Grafana

collectd

Grafana

Installing and configuring Ceph Metrics with the Grafana dashboard

How to do it...

Monitoring Ceph clusters with Ceph Metrics with the Grafana dashboard

How to do it ...

Operating and Managing a Ceph Cluster

Introduction

Understanding Ceph service management

Managing the cluster configuration file

How to do it...

Adding monitor nodes to the Ceph configuration file

Adding an MDS node to the Ceph configuration file

Adding OSD nodes to the Ceph configuration file

Running Ceph with systemd

How to do it...

Starting and stopping all daemons

Querying systemd units on a node

Starting and stopping all daemons by type

Starting and stopping a specific daemon

Scale-up versus scale-out

Scaling out your Ceph cluster

How to do it...

Adding the Ceph OSD

Adding the Ceph MON

There's more...

Scaling down your Ceph cluster

How to do it...

Removing the Ceph OSD

Removing the Ceph MON

Replacing a failed disk in the Ceph cluster

How to do it...

Upgrading your Ceph cluster

How to do it...

Maintaining a Ceph cluster

How to do it...

How it works...

Throttle the backfill and recovery:

Ceph under the Hood

Introduction

Ceph scalability and high availability

Understanding the CRUSH mechanism

CRUSH map internals

How to do it...

How it works...

CRUSH tunables

The evolution of CRUSH tunables

Argonaut – legacy

Bobtail – CRUSH_TUNABLES2

Firefly – CRUSH_TUNABLES3

Hammer – CRUSH_V4

 Jewel – CRUSH_TUNABLES5

Ceph and kernel versions that support given tunables

Warning when tunables are non-optimal

A few important points

Ceph cluster map

High availability monitors

Ceph authentication and authorization

Ceph authentication

Ceph authorization

How to do it…

I/O path from a Ceph client to a Ceph cluster

Ceph Placement Group

How to do it…

Placement Group states

Creating Ceph pools on specific OSDs

How to do it...

Production Planning and Performance Tuning for Ceph

Introduction

The dynamics of capacity, performance, and cost

Choosing hardware and software components for Ceph

Processor

Memory

Network

Disk

Partitioning the Ceph OSD journal

Partitioning Ceph OSD data

Operating system

OSD filesystem

Ceph recommendations and performance tuning

Tuning global clusters

Tuning Monitor

OSD tuning

OSD general settings

OSD journal settings

OSD filestore settings

OSD recovery settings

OSD backfilling settings

OSD scrubbing settings

Tuning the client

Tuning the operating system

Tuning the network

Sample tuning profile for OSD nodes

How to do it...

Ceph erasure-coding

Erasure code plugin

Creating an erasure-coded pool

How to do it...

Ceph cache tiering

Writeback mode

Read-only mode

Creating a pool for cache tiering

How to do it...

See also

Creating a cache tier

How to do it...

Configuring a cache tier

How to do it...

Testing a cache tier

How to do it...

Cache tiering – possible dangers in production environments

Known good workloads

Known bad workloads

The Virtual Storage Manager for Ceph

Introduction

Understanding the VSM architecture

The VSM controller

The VSM agent

Setting up the VSM environment

How to do it...

Getting ready for VSM

How to do it...

Installing VSM

How to do it...

Creating a Ceph cluster using VSM

How to do it...

Exploring the VSM dashboard

Upgrading the Ceph cluster using VSM

VSM roadmap

VSM resources

More on Ceph

Introduction

Disk performance baseline

Single disk write performance

How to do it...

Multiple disk write performance

How to do it...

Single disk read performance

How to do it...

Multiple disk read performance

How to do it...

Results

Baseline network performance

How to do it...

See also

Ceph rados bench

How to do it...

How it works...

RADOS load-gen

How to do it...

How it works...

There's more...

Benchmarking the Ceph Block Device

How to do it...

How it works...

See also

Benchmarking Ceph RBD using FIO

How to do it...

See Also

Ceph admin socket

How to do it...

Using the ceph tell command

How to do it...

Ceph REST API

How to do it...

Profiling Ceph memory

How to do it...

The ceph-objectstore-tool

How to do it...

How it works...

Using ceph-medic

How to do it...

How it works...

See also

Deploying the experimental Ceph BlueStore

How to do it...

See Also

An Introduction to Troubleshooting Ceph

Introduction

Initial troubleshooting and logging

How to do it...

Troubleshooting network issues

How to do it...

Troubleshooting monitors

How to do it...

Troubleshooting OSDs

How to do it...

Troubleshooting placement groups

How to do it...

There's more…

Upgrading Your Ceph Cluster from Hammer to Jewel

Introduction

Upgrading your Ceph cluster from Hammer to Jewel

How to do it...

Upgrading the Ceph monitor nodes

Upgrading the Ceph OSD nodes

Upgrading the Ceph Metadata Server

See also

Preface

Long gone are the days of massively expensive black boxes and their large data center footprints. The current data-driven world we live in demands the ability to handle large-scale data growth at a more economical cost. Day by day, data continues to grow exponentially, and the need to store this data grows with it. This is where software-defined storage enters the picture.

The idea behind a software-defined storage solution is to utilize the intelligence of software combined with the use of commodity hardware to solve our future computing problems, including where to store all this data the human race is compiling, from music to insurance documents. The software-defined approach should be the answer to the future's computing problems and Ceph is the future of storage.

Ceph is a true open source, software-defined storage solution, purposely built to handle unprecedented data growth with linear performance improvement. It provides a unified storage experience for file, object, and block storage interfaces from the same system. The beauty of Ceph is its distributed, scalable nature and performance; reliability and robustness come along with these attributes. Furthermore, it is pocket-friendly, that is, economical, providing you greater value for each dollar you spend.

Ceph is capable of providing block, object, and file access from a single storage solution, and its enterprise-class features, such as scalability, reliability, erasure coding, and cache tiering, have led organizations such as CERN, Yahoo!, and DreamHost to deploy and run Ceph highly successfully for years. It is also being deployed in all-flash storage scenarios for low-latency, high-performance workloads, database workloads, storage for containers, and hyperconverged infrastructure. With Ceph BlueStore on the very near horizon, the best is truly yet to come for Ceph.

In this book, we will take a deep dive into Ceph, covering its components and architecture and how it works. The Ceph Cookbook focuses on hands-on knowledge, providing step-by-step guidance in the form of recipes. Right from the first chapter, you will gain practical experience of Ceph by following the recipes. With each chapter, you will learn and play around with interesting concepts in Ceph. By the end of this book, you will feel competent with Ceph, both conceptually and practically, and you will be able to operate your Ceph storage infrastructure with confidence and success.

Best of luck in your future endeavors with Ceph!

What this book covers

Chapter 1, Ceph - Introduction and Beyond, covers an introduction to Ceph, gradually moving toward RAID and its challenges, and a Ceph architectural overview. Finally, we will go through Ceph installation and configuration. 

Chapter 2, Working with Ceph Block Device, covers an introduction to the Ceph Block Device and provisioning of the Ceph block device. We will also go through RBD snapshots and clones, as well as implementing a disaster-recovery solution with RBD mirroring.

Chapter 3, Working with Ceph and OpenStack, covers configuring OpenStack clients for use with Ceph, as well as storage options for OpenStack using Cinder, Glance, and Nova.

Chapter 4, Working with Ceph Object Storage, covers a deep dive into Ceph object storage, including RGW setup and configuration, S3, and OpenStack Swift access. Finally, we will set up RGW with the Hadoop S3A plugin. 

Chapter 5, Working with Ceph Object Storage Multi-Site V2, helps you to deep dive into the new Multi-site V2, while configuring two Ceph clusters to mirror objects between them in an object disaster recovery solution. 

Chapter 6, Working with the Ceph Filesystem, covers an introduction to CephFS, deploying MDS, and accessing CephFS via the kernel driver, FUSE, and NFS-Ganesha.

Chapter 7, Monitoring Ceph Clusters, covers classic ways of monitoring Ceph via the Ceph command-line tools. You will also be introduced to Ceph Metrics and Grafana, and learn how to configure Ceph Metrics to monitor a Ceph cluster. 

Chapter 8, Operating and Managing a Ceph Cluster, covers Ceph service management with systemd, and scaling up and scaling down a Ceph cluster. This chapter also includes failed disk replacement and upgrading Ceph infrastructures.

Chapter 9, Ceph under the Hood, explores the Ceph CRUSH map, understanding the internals of the CRUSH map and CRUSH tunables, followed by Ceph authentication and authorization. This chapter also covers dynamic cluster management and understanding Ceph PGs. Finally, it covers creating Ceph pools on specific OSDs.

Chapter 10, Production Planning and Performance Tuning for Ceph, covers planning a production cluster deployment, including hardware and software planning for Ceph. This chapter also includes Ceph recommendations and performance tuning. Finally, it covers erasure coding and cache tiering.

Chapter 11, The Virtual Storage Manager for Ceph, speaks about Virtual Storage Manager (VSM), covering its introduction and architecture. We will also go through the deployment of VSM and then the creation of a Ceph cluster, using VSM to manage it.

Chapter 12, More on Ceph, covers Ceph benchmarking and Ceph troubleshooting using the admin socket, the REST API, and the ceph-objectstore-tool. This chapter also covers the deployment of Ceph using Ansible and Ceph memory profiling. Furthermore, it covers health-checking your Ceph cluster using ceph-medic and the new experimental backend, Ceph BlueStore.

Chapter 13, An Introduction to Troubleshooting Ceph, covers troubleshooting common issues seen in Ceph clusters, detailing methods to troubleshoot each component. This chapter also covers what to look for to determine where an issue lies in the cluster and what its possible cause could be.

Chapter 14, Upgrading Your Ceph Cluster from Hammer to Jewel, covers upgrading the core components in your Ceph cluster from the Hammer release to the Jewel release.

What you need for this book

The various software components required to follow the instructions in the chapters are as follows:

VirtualBox 4.0 or higher (https://www.virtualbox.org/wiki/Downloads)

GIT (http://www.git-scm.com/downloads)

Vagrant 1.5.0 or higher (https://www.vagrantup.com/downloads.html)

CentOS operating system 7.0 or higher (http://wiki.centos.org/Download)

Ceph software Jewel packages Version 10.2.0 or higher (http://ceph.com/resources/downloads/)

S3 Client, typically S3cmd (http://s3tools.org/download)

Python-swift client

NFS Ganesha

Ceph Fuse

CephMetrics (https://github.com/ceph/cephmetrics)

Ceph-Medic (https://github.com/ceph/ceph-medic)

Virtual Storage Manager 2.0 or higher (https://github.com/01org/virtual-storage-manager/releases/tag/v2.1.0)

Ceph-Ansible (https://github.com/ceph/ceph-ansible)

OpenStack RDO (http://rdo.fedorapeople.org/rdo-release.rpm)
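Before following the recipes, it is worth confirming that the main tools are installed at suitable versions. The following is a minimal sketch, assuming the tools are already installed and on your PATH; package names and installation steps vary by distribution:

# Quick sanity check of the main prerequisites, run on the host machine
VBoxManage --version     # expect 4.0 or higher
vagrant --version        # expect 1.5.0 or higher
git --version
ansible --version        # needed only for the Ceph-Ansible recipes
ceph --version           # expect 10.2.x (Jewel) or higher once Ceph is installed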

 

Who this book is for

This book is aimed at storage and cloud system engineers, system administrators, and technical architects and consultants who are interested in building software-defined storage solutions around Ceph to power their cloud and virtual infrastructure. If you have a basic knowledge of GNU/Linux and storage systems, with no experience of software-defined storage solutions and Ceph, but are eager to learn, this book is for you.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it…, How it works…, There's more…, and See also). To give clear instructions on how to complete a recipe, we use these sections as follows:

Getting ready

This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.

How to do it…

This section contains the steps required to follow the recipe.

How it works…

This section usually consists of a detailed explanation of what happened in the previous section.

There's more…

This section consists of additional information about the recipe in order to make the reader more knowledgeable about the recipe.

See also

This section provides helpful links to other useful information for the recipe.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning. Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Verify the installrc file". A block of code is set as follows:

AGENT_ADDRESS_LIST="192.168.123.101 192.168.123.102 192.168.123.103"

CONTROLLER_ADDRESS="192.168.123.100"

Any command-line input or output is written as follows:

# VBoxManage --version

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Select System info from the Administration panel."

Warnings or important notes appear like this.
Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:

1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you're looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows

Zipeg / iZip / UnRarX for Mac

7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Ceph-Cookbook-Second-Edition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/CephCookbookSecondEdition_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books (maybe a mistake in the text or the code), we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title. To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at [email protected] with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.

Ceph – Introduction and Beyond

In this chapter, we will cover the following recipes:

Ceph – the beginning of a new era

RAID – the end of an era

Ceph – the architectural overview

Planning a Ceph deployment

Setting up a virtual infrastructure

Installing and configuring Ceph

Scaling up your Ceph cluster

Using Ceph clusters with a hands-on approach

Introduction

Ceph is currently the hottest software-defined storage (SDS) technology and is shaking up the entire storage industry. It is an open source project that provides unified software-defined solutions for block, file, and object storage. The core idea of Ceph is to provide a distributed storage system that is massively scalable and high performing with no single point of failure. From the roots, it has been designed to be highly scalable (up to the exabyte level and beyond) while running on general-purpose commodity hardware.

Ceph is acquiring most of the traction in the storage industry due to its open, scalable, and reliable nature. This is the era of cloud computing and software-defined infrastructure, where we need a storage backend that is purely software-defined and, more importantly, cloud-ready. Ceph fits in here very well, regardless of whether you are running a public, private, or hybrid cloud.

Today's software systems are very smart and make the best use of commodity hardware to run gigantic-scale infrastructures. Ceph is one of them; it intelligently uses commodity hardware to provide enterprise-grade robust and highly reliable storage systems.

Ceph has been raised and nourished with the help of the Ceph upstream community with an architectural philosophy that includes the following:

Every component must scale linearly

There should not be any single point of failure

The solution must be software-based, open source, and adaptable

The Ceph software should run on readily available commodity hardware

Every component must be self-managing and self-healing wherever possible

The foundation of Ceph lies in objects, which are its building blocks. Object storage such as Ceph is the perfect provision for current and future needs of unstructured data storage. Object storage has its advantages over traditional storage solutions; we can achieve platform and hardware independence using object storage. Ceph works meticulously with objects and replicates them across the cluster to provide reliability; in Ceph, objects are not tied to a physical path, which makes them location-independent. This flexibility enables Ceph to scale linearly from the petabyte to the exabyte level.
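As a small, hands-on illustration of this location independence (cluster installation is covered later in this chapter), the following sketch assumes a healthy cluster with admin credentials and uses an illustrative pool and object name; it stores an object and then asks the cluster where that object lives, an answer that is computed rather than looked up:

# Create a replicated pool with 128 placement groups (names are illustrative)
ceph osd pool create object-demo 128 128
ceph osd pool set object-demo size 3     # keep three replicas of every object

# Store a local file as an object; Ceph decides where the replicas go
echo "hello ceph" > /tmp/hello.txt
rados -p object-demo put hello-object /tmp/hello.txt

# Ask which placement group and OSDs hold the object; the mapping is
# computed by CRUSH on demand, not read from a central lookup table
ceph osd map object-demo hello-object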

Ceph provides great performance, enormous scalability, power, and flexibility to organizations. It helps them get rid of expensive proprietary storage silos. Ceph is indeed an enterprise-class storage solution that runs on commodity hardware; it is a low-cost yet feature-rich storage system. Ceph's universal storage system provides block, file, and object storage under one hood, enabling customers to use storage as they want.

In the following section we will learn about Ceph releases.

Ceph is being developed and improved at a rapid pace. On July 3, 2012, Sage announced the first LTS release of Ceph, with the code name Argonaut. Since then, we have seen 12 new releases. Ceph releases are categorized as Long Term Support (LTS) or stable releases, and every alternate Ceph release is an LTS release. For more information, visit https://Ceph.com/category/releases/.

Ceph release name    Ceph release version    Released on
Argonaut             V0.48 (LTS)             July 3, 2012
Bobtail              V0.56 (LTS)             January 1, 2013
Cuttlefish           V0.61                   May 7, 2013
Dumpling             V0.67 (LTS)             August 14, 2013
Emperor              V0.72                   November 9, 2013
Firefly              V0.80 (LTS)             May 7, 2014
Giant                V0.87.1                 February 26, 2015
Hammer               V0.94 (LTS)             April 7, 2015
Infernalis           V9.0.0                  May 5, 2015
Jewel                V10.0.0 (LTS)           November 2015
Kraken               V11.0.0                 June 2016
Luminous             V12.0.0 (LTS)           February 2017
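To check which release an installed client or a running daemon is on, something like the following can be used (a sketch; mon.ceph-node1 is an example daemon name, and the admin socket command must be run on the node hosting that daemon):

# Version of the locally installed Ceph packages
ceph --version

# Version reported by a running daemon over its admin socket
ceph daemon mon.ceph-node1 version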

Here is a fact: Ceph release names follow alphabetical order; the next one will be an M release. The term Ceph is a common nickname given to pet octopuses and is considered a short form of cephalopod, a class of marine animals belonging to the mollusk phylum. Ceph has an octopus as its mascot, which represents Ceph's highly parallel behavior, similar to that of an octopus.

Ceph – the beginning of a new era

Data storage requirements have grown explosively over the last few years. Research shows that data in large organizations is growing at a rate of 40 to 60 percent annually, and many companies are doubling their data footprint each year. IDC analysts estimated that, worldwide, there were 54.4 exabytes of total digital data in the year 2000. By 2007, this had reached 295 exabytes, and by 2020 it is expected to reach 44 zettabytes worldwide. Such data growth cannot be managed by traditional storage systems; we need a system such as Ceph, which is distributed, scalable, and, most importantly, economically viable. Ceph has been designed especially to handle today's as well as the future's data storage needs.

Software-defined storage – SDS

SDS is what is needed to reduce TCO for your storage infrastructure. In addition to reduced storage cost, SDS can offer flexibility, scalability, and reliability. Ceph is a true SDS solution; it runs on commodity hardware with no vendor lock-in and provides low cost per GB. Unlike traditional storage systems, where hardware gets married to software, in SDS, you are free to choose commodity hardware from any manufacturer and are free to design a heterogeneous hardware solution for your own needs. Ceph's software-defined storage on top of this hardware provides all the intelligence you need and will take care of everything, providing all the enterprise storage features right from the software layer.

Cloud storage

One of the drawbacks of a cloud infrastructure is its storage. Every cloud infrastructure needs a storage system that is reliable, low-cost, and scalable, with a tighter integration than its other cloud components. There are many traditional storage solutions on the market that claim to be cloud-ready, but today we not only need cloud readiness, but a lot more beyond that. We need a storage system that is fully integrated with cloud systems and can provide a lower TCO without any compromise on reliability and scalability. Cloud systems are software-defined and are built on top of commodity hardware; similarly, they need a storage system that follows the same methodology, that is, being software-defined on top of commodity hardware, and Ceph is the best choice available for cloud use cases.

Ceph has been rapidly evolving and bridging the gap to become a true cloud storage backend. It is grabbing center stage with every major open source cloud platform, namely OpenStack, CloudStack, and OpenNebula. Moreover, Ceph has succeeded in building beneficial partnerships with cloud vendors such as Red Hat, Canonical, Mirantis, SUSE, and many more. These companies are favoring Ceph in a big way and including it as an official storage backend for their OpenStack cloud distributions, thus making Ceph a red-hot technology in the cloud storage space.

The OpenStack project is one of the finest examples of open source software powering public and private clouds. It has proven itself as an end-to-end open source cloud solution. OpenStack includes programs such as Cinder, Glance, and Swift, which provide storage capabilities to OpenStack. These OpenStack components require a reliable, scalable, and all-in-one storage backend such as Ceph. For this reason, the OpenStack and Ceph communities have been working together for many years to develop a fully compatible Ceph storage backend for OpenStack.
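To give a flavor of what that integration looks like on the Ceph side (Chapter 3 walks through it step by step), the sketch below creates the pools and a cephx identity that Cinder and Nova commonly use; the pool and client names follow common conventions from the upstream documentation rather than being requirements:

# Pools commonly used as OpenStack storage backends
ceph osd pool create volumes 128
ceph osd pool create images 128
ceph osd pool create vms 128

# A cephx identity that Cinder and Nova can use to reach those pools
ceph auth get-or-create client.cinder \
  mon 'allow r' \
  osd 'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rx pool=images'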

Cloud infrastructure based on Ceph provides much-needed flexibility to service providers to build Storage-as-a-Service and Infrastructure-as-a-Service solutions, which they cannot achieve from other traditional enterprise storage solutions as they are not designed to fulfill cloud needs. Using Ceph, service providers can offer low-cost, reliable cloud storage to their customers.

Unified next-generation storage architecture

The definition of unified storage has changed lately. A few years ago, the term unified storage referred to providing file and block storage from a single system. Now, because of recent technological advancements such as cloud computing, big data, and the Internet of Things, a new kind of storage has been evolving: object storage. Thus, all storage systems that do not support object storage are not really unified storage solutions. A truly unified storage system is like Ceph; it supports block, file, and object storage from a single system.

In Ceph, the term unified storage is more meaningful than what existing storage vendors claim to provide. It has been designed from the ground up to be future-ready, and it's constructed such that it can handle enormous amounts of data. When we call Ceph future ready, we mean to focus on its object storage capabilities, which is a better fit for today's mix of unstructured data rather than blocks or files. Everything in Ceph relies on intelligent objects, whether it's block storage or file storage. Rather than managing blocks and files underneath, Ceph manages objects and supports block-and-file-based storage on top of it. Objects provide enormous scaling with increased performance by eliminating metadata operations. Ceph uses an algorithm to dynamically compute where the object should be stored and retrieved from.
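This unified nature shows up directly on the command line: a single cluster serves objects, block devices, and a filesystem. The following is a minimal sketch, assuming a healthy cluster and, for the CephFS part, that an MDS and a filesystem have already been created (covered in Chapter 6); the pool and image names are illustrative:

# One pool can back both object and block access
ceph osd pool create unified-demo 128

# Object interface: store and list objects with rados
rados -p unified-demo put demo-object /etc/hosts
rados -p unified-demo ls

# Block interface: create a 1 GB RADOS Block Device image in the same cluster
rbd create demo-image --size 1024 --pool unified-demo

# File interface: list the CephFS filesystems served by the same cluster
ceph fs ls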