Over 100 effective recipes to help you design, implement, troubleshoot, and manage the software-defined and massively scalable Ceph storage system.
This book is targeted at storage and cloud engineers, system administrators, and anyone interested in building software-defined storage to power their cloud or virtual infrastructure.
If you have basic knowledge of GNU/Linux and storage systems, with no experience of software-defined storage solutions and Ceph, but are eager to learn, then this book is for you.
Ceph is a unified, distributed storage system designed for reliability and scalability. This technology has been transforming the software-defined storage industry and is evolving rapidly as a leader, with wide-ranging support for popular cloud platforms such as OpenStack and CloudStack, as well as for virtualized platforms. Ceph is backed by Red Hat and developed by a community of contributors that has gained immense traction in recent years.
This book will guide you right from the basics of Ceph, such as creating block, object, and filesystem access, to advanced concepts such as cloud integration solutions. The book also covers practical, easy-to-implement recipes on CephFS, RGW, and RBD with respect to the major stable Ceph release, Jewel. Toward the end of the book, recipes on troubleshooting and best practices will help you get to grips with managing Ceph storage in a production environment.
By the end of this book, you will have practical, hands-on experience of using Ceph efficiently for your storage requirements.
This step-by-step guide is filled with practical tutorials, making complex scenarios easy to understand.
BIRMINGHAM - MUMBAI
Copyright © 2017 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: February 2016
Second edition: November 2017
Production reference: 1221117
ISBN 978-1-78839-106-1
www.packtpub.com
Authors
Vikhyat Umrao
Michael Hackett
Karan Singh
Copy Editors
Safis Editing
Juliana Nair
Reviewer
Álvaro Soto
Project Coordinator
Judie Jose
Commissioning Editor
Gebin George
Proofreader
Safis Editing
Acquisition Editor
Shrilekha Inani
Indexer
Pratik Shirodkar
Content Development Editor
Nikita Pawar
Graphics
Tania Dutta
Technical Editor
Mohd Riyan Khan
Production Coordinator
Deepika Naik
The views expressed in this book are those of the authors and not of Red Hat.
Since Ceph's inception and posting to GitHub in 2008, Sage Weil's creation has grown from an individual idea to a successful open source project with over 87,000 commits from almost 900 different contributors across dozens of companies. Ceph was originally incubated as a skunkworks project inside DreamHost, an earlier startup of Sage's focused on web hosting. In 2012, it was spun off into a new company, Inktank, dedicated exclusively to the development and support of Ceph. Inktank's reputation was built upon the stellar customer support its employees provided, from on-site installation and configuration, to highly complex bug troubleshooting, patch development, and ultimate problem resolution. The DNA of the company was a dedication to customer success, even when it required a senior developer to join a customer support teleconference at short notice, or Sage himself to remotely log in to a server and assist in diagnosing the root cause of an issue.
This focus on the customer was elevated even further after Inktank’s acquisition by Red Hat in 2014. If Red Hat is known for anything, it’s for making CIOs comfortable with open source software. Members of Red Hat’s customer experience and engagement team are some of the most talented individuals I’ve had the pleasure to work with. They possess the unique ability to blend the technical troubleshooting skills necessary to support a complex distributed storage system with the soft relational skills required to be on the front lines engaging with a customer, who in many cases is in an extremely stressful situation where production clusters are out of operation.
The authors of this book are two of the finest exemplars of this unique crossing of streams. Inside this work, Vikhyat and Michael share some of their hard-earned best practices in successfully installing, configuring, and supporting a Ceph cluster. Vikhyat has 9 years of experience providing sustaining engineering support for distributed storage products with a focus on Ceph for over 3 years. Michael has been working in the storage industry for over 12 years and has been focused on Ceph for close to 3 years. They both have the uncommon ability to calmly work through complex customer escalations, providing a first class experience with Ceph under even the most stressful of situations. Between the two of them, you’re in good hands—the ones that have seen some of the hairiest, most difficult-to-diagnose problems and have come out the other side to share their hard-earned wisdom with you.
If there's been one frequent critique of Ceph over the years, it's that it's too complex for a typical administrator to work with. Our hope is that those who might be intimidated by the thought of setting up their first Ceph cluster will find comfort and gain confidence from reading this book. After all, there's no time like the present to start playing with The Future of Storage™. :-)
Ian R. Colle
Global Director of Software Engineering, Red Hat Ceph Storage
Vikhyat Umrao has 9 years of experience with distributed storage products as a sustaining engineer, and in the last couple of years he has been working on software-defined storage technology, with specific expertise in Ceph unified storage. He has been working on Ceph for over 3 years now, and in his current position at Red Hat he focuses on the support and development of Ceph, solving issues reported by Red Hat Ceph Storage customers and the upstream community.
He is based in the Greater Boston area, where he is a principal software maintenance engineer for Red Hat Ceph Storage. Vikhyat lives with his wife, Pratima, and he likes to explore new places.
Michael Hackett is a storage and SAN expert in customer support. He has been working on Ceph and storage-related products for over 12 years. Apart from this, he holds several storage and SAN-based certifications, and prides himself on his ability to troubleshoot and adapt to new complex issues.
Michael is currently working at Red Hat, based in Massachusetts, where he is a principal software maintenance engineer for Red Hat Ceph and the technical product lead for the global Ceph team.
Michael lives in Massachusetts with his wife, Nicole, his two sons, and their dog. He is an avid sports fan and enjoys time with his family.
Karan Singh devotes part of his time to learning emerging technologies and enjoys the challenges that come with them. He loves tech writing and is an avid blogger. He also authored the first editions of Learning Ceph and Ceph Cookbook, Packt Publishing. You can reach him on Twitter at @karansingh010.
Álvaro Soto is a cloud and open source enthusiast. He was born in Chile, but he has been living in Mexico for more than 10 years now. He is an active member of the OpenStack and Ceph communities in Mexico. He holds an engineering degree in computer science from Instituto Politécnico Nacional (IPN México) and is working toward a master's degree in computer science at Instituto Tecnológico Autónomo de México (ITAM).
Álvaro currently works as a Ceph consultant at Sentinel.la, architecting, implementing, and performance-tuning Ceph clusters, handling data migration, and exploring new ways to adopt Ceph solutions.
He enjoys spending his time reading papers and books about distributed systems, automation, Linux, and security. You can always contact him by email at [email protected], on IRC using the nickname khyr0n, or on Twitter @alsotoes.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www.packtpub.com/mapt
Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Thanks for purchasing this Packt book. At Packt, quality is at the heart of our editorial process. To help us improve, please leave us an honest review on this book's Amazon page at https://www.amazon.com/dp/1788391063.
If you'd like to join our team of regular reviewers, you can email us at [email protected]. We award our regular reviewers with free eBooks and videos in exchange for their valuable feedback. Help us be relentless in improving our products!
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
Ceph – Introduction and Beyond
Introduction
Ceph – the beginning of a new era
Software-defined storage – SDS
Cloud storage
Unified next-generation storage architecture
RAID – the end of an era
RAID rebuilds are painful
RAID spare disks increases TCO
RAID can be expensive and hardware dependent
The growing RAID group is a challenge
The RAID reliability model is no longer promising
Ceph – the architectural overview
Planning a Ceph deployment
Setting up a virtual infrastructure
Getting ready
How to do it...
Installing and configuring Ceph
Creating the Ceph cluster on ceph-node1
How to do it...
Scaling up your Ceph cluster
How to do it…
Using the Ceph cluster with a hands-on approach
How to do it...
Working with Ceph Block Device
Introduction
Configuring Ceph client
How to do it...
Creating Ceph Block Device
How to do it...
Mapping Ceph Block Device
How to do it...
Resizing Ceph RBD
How to do it...
Working with RBD snapshots
How to do it...
Working with RBD clones
How to do it...
Disaster recovery replication using RBD mirroring
How to do it...
Configuring pools for RBD mirroring with one way replication
How to do it...
Configuring image mirroring
How to do it...
Configuring two-way mirroring
How to do it...
See also
Recovering from a disaster!
How to do it...
Working with Ceph and OpenStack
Introduction
Ceph – the best match for OpenStack
Setting up OpenStack
How to do it...
Configuring OpenStack as Ceph clients
How to do it...
Configuring Glance for Ceph backend
How to do it…
Configuring Cinder for Ceph backend
How to do it...
Configuring Nova to boot instances from Ceph RBD
How to do it…
Configuring Nova to attach Ceph RBD
How to do it...
Working with Ceph Object Storage
Introduction
Understanding Ceph object storage
RADOS Gateway standard setup, installation, and configuration
Setting up the RADOS Gateway node
How to do it…
Installing and configuring the RADOS Gateway
How to do it…
Creating the radosgw user
How to do it…
See also…
Accessing the Ceph object storage using S3 API
How to do it…
Configuring DNS
Configuring the s3cmd client
Configure the S3 client (s3cmd) on client-node1
Accessing the Ceph object storage using the Swift API
How to do it...
Integrating RADOS Gateway with OpenStack Keystone
How to do it...
Integrating RADOS Gateway with Hadoop S3A plugin
How to do it...
Working with Ceph Object Storage Multi-Site v2
Introduction
Functional changes from Hammer federated configuration
RGW multi-site v2 requirement
Installing the Ceph RGW multi-site v2 environment
How to do it...
Configuring Ceph RGW multi-site v2
How to do it...
Configuring a master zone
Configuring a secondary zone
Checking the synchronization status
Testing user, bucket, and object sync between master and secondary sites
How to do it...
Working with the Ceph Filesystem
Introduction
Understanding the Ceph Filesystem and MDS
Deploying Ceph MDS
How to do it...
Accessing Ceph FS through kernel driver
How to do it...
Accessing Ceph FS through FUSE client
How to do it...
Exporting the Ceph Filesystem as NFS
How to do it...
Ceph FS – a drop-in replacement for HDFS
Monitoring Ceph Clusters
Introduction
Monitoring Ceph clusters – the classic way
How to do it...
Checking the cluster's health
Monitoring cluster events
The cluster utilization statistics
Checking the cluster's status
The cluster authentication entries
Monitoring Ceph MON
How to do it...
Checking the MON status
Checking the MON quorum status
Monitoring Ceph OSDs
How to do it...
OSD tree view
OSD statistics
Checking the CRUSH map
Monitoring PGs
Monitoring Ceph MDS
How to do it...
Introducing Ceph Metrics and Grafana
collectd
Grafana
Installing and configuring Ceph Metrics with the Grafana dashboard
How to do it...
Monitoring Ceph clusters with Ceph Metrics with the Grafana dashboard
How to do it...
Operating and Managing a Ceph Cluster
Introduction
Understanding Ceph service management
Managing the cluster configuration file
How to do it...
Adding monitor nodes to the Ceph configuration file
Adding an MDS node to the Ceph configuration file
Adding OSD nodes to the Ceph configuration file
Running Ceph with systemd
How to do it...
Starting and stopping all daemons
Querying systemd units on a node
Starting and stopping all daemons by type
Starting and stopping a specific daemon
Scale-up versus scale-out
Scaling out your Ceph cluster
How to do it...
Adding the Ceph OSD
Adding the Ceph MON
There's more...
Scaling down your Ceph cluster
How to do it...
Removing the Ceph OSD
Removing the Ceph MON
Replacing a failed disk in the Ceph cluster
How to do it...
Upgrading your Ceph cluster
How to do it...
Maintaining a Ceph cluster
How to do it...
How it works...
Throttle the backfill and recovery:
Ceph under the Hood
Introduction
Ceph scalability and high availability
Understanding the CRUSH mechanism
CRUSH map internals
How to do it...
How it works...
CRUSH tunables
The evolution of CRUSH tunables
Argonaut – legacy
Bobtail – CRUSH_TUNABLES2
Firefly – CRUSH_TUNABLES3
Hammer – CRUSH_V4
Jewel – CRUSH_TUNABLES5
Ceph and kernel versions that support given tunables
Warning when tunables are non-optimal
A few important points
Ceph cluster map
High availability monitors
Ceph authentication and authorization
Ceph authentication
Ceph authorization
How to do it…
I/O path from a Ceph client to a Ceph cluster
Ceph Placement Group
How to do it…
Placement Group states
Creating Ceph pools on specific OSDs
How to do it...
Production Planning and Performance Tuning for Ceph
Introduction
The dynamics of capacity, performance, and cost
Choosing hardware and software components for Ceph
Processor
Memory
Network
Disk
Partitioning the Ceph OSD journal
Partitioning Ceph OSD data
Operating system
OSD filesystem
Ceph recommendations and performance tuning
Tuning global clusters
Tuning Monitor
OSD tuning
OSD general settings
OSD journal settings
OSD filestore settings
OSD recovery settings
OSD backfilling settings
OSD scrubbing settings
Tuning the client
Tuning the operating system
Tuning the network
Sample tuning profile for OSD nodes
How to do it...
Ceph erasure-coding
Erasure code plugin
Creating an erasure-coded pool
How to do it...
Ceph cache tiering
Writeback mode
Read-only mode
Creating a pool for cache tiering
How to do it...
See also
Creating a cache tier
How to do it...
Configuring a cache tier
How to do it...
Testing a cache tier
How to do it...
Cache tiering – possible dangers in production environments
Known good workloads
Known bad workloads
The Virtual Storage Manager for Ceph
Introduction
Understanding the VSM architecture
The VSM controller
The VSM agent
Setting up the VSM environment
How to do it...
Getting ready for VSM
How to do it...
Installing VSM
How to do it...
Creating a Ceph cluster using VSM
How to do it...
Exploring the VSM dashboard
Upgrading the Ceph cluster using VSM
VSM roadmap
VSM resources
More on Ceph
Introduction
Disk performance baseline
Single disk write performance
How to do it...
Multiple disk write performance
How to do it...
Single disk read performance
How to do it...
Multiple disk read performance
How to do it...
Results
Baseline network performance
How to do it...
See also
Ceph rados bench
How to do it...
How it works...
RADOS load-gen
How to do it...
How it works...
There's more...
Benchmarking the Ceph Block Device
How to do it...
How it works...
See also
Benchmarking Ceph RBD using FIO
How to do it...
See also
Ceph admin socket
How to do it...
Using the ceph tell command
How to do it...
Ceph REST API
How to do it...
Profiling Ceph memory
How to do it...
The ceph-objectstore-tool
How to do it...
How it works...
Using ceph-medic
How to do it...
How it works...
See also
Deploying the experimental Ceph BlueStore
How to do it...
See also
An Introduction to Troubleshooting Ceph
Introduction
Initial troubleshooting and logging
How to do it...
Troubleshooting network issues
How to do it...
Troubleshooting monitors
How to do it...
Troubleshooting OSDs
How to do it...
Troubleshooting placement groups
How to do it...
There's more…
Upgrading Your Ceph Cluster from Hammer to Jewel
Introduction
Upgrading your Ceph cluster from Hammer to Jewel
How to do it...
Upgrading the Ceph monitor nodes
Upgrading the Ceph OSD nodes
Upgrading the Ceph Metadata Server
See also
Long gone are the days of massively expensive black boxes and their large data center footprints. The data-driven world we live in demands the ability to handle large-scale data growth at a more economical cost. Day by day, data continues to grow exponentially, and the need to store this data grows with it. This is where software-defined storage enters the picture.
The idea behind a software-defined storage solution is to use the intelligence of software, combined with commodity hardware, to solve our future computing problems, including where to store all the data the human race is compiling, from music to insurance documents. The software-defined approach should be the answer to the future's computing problems, and Ceph is the future of storage.
Ceph is a true open source, software-defined storage solution, purposely built to handle unprecedented data growth with linear performance improvement. It provides a unified storage experience for file, object, and block storage interfaces from the same system. The beauty of Ceph is its distributed, scalable nature and performance; reliability and robustness come along with these attributes. Furthermore, it is pocket-friendly, that is, economical, providing you greater value for each dollar you spend.
Ceph is capable of providing block, object, and file access from a single storage solution, and its enterprise-class features, such as scalability, reliability, erasure coding, and cache tiering, have led organizations such as CERN, Yahoo!, and DreamHost to deploy and run Ceph highly successfully for years. It is also being deployed in all-flash storage scenarios, serving low-latency/high-performance workloads, database workloads, storage for containers, and hyperconverged infrastructure. With Ceph BlueStore on the very near horizon, the best is truly yet to come for Ceph.
In this book, we will take a deep dive into Ceph, covering its components, its architecture, and how it works. The Ceph Cookbook focuses on hands-on knowledge, providing step-by-step guidance in the form of recipes. Right from the first chapter, you will gain practical experience of Ceph by following the recipes. With each chapter, you will learn and experiment with interesting Ceph concepts. By the end of this book, you will feel competent with Ceph, both conceptually and practically, and you will be able to operate your Ceph storage infrastructure with confidence and success.
Best of luck in your future endeavors with Ceph!
Chapter 1, Ceph - Introduction and Beyond, covers an introduction to Ceph, gradually moving toward RAID and its challenges, and a Ceph architectural overview. Finally, we will go through Ceph installation and configuration.
Chapter 2, Working with Ceph Block Device, covers an introduction to the Ceph Block Device and provisioning of the Ceph block device. We will also go through RBD snapshots and clones, as well as implementing a disaster-recovery solution with RBD mirroring.
Chapter 3, Working with Ceph and OpenStack, covers configuring OpenStack clients for use with Ceph, as well as storage options for OpenStack using Cinder, Glance, and Nova.
Chapter 4, Working with Ceph Object Storage, covers a deep dive into Ceph object storage, including RGW setup and configuration, S3, and OpenStack Swift access. Finally, we will set up RGW with the Hadoop S3A plugin.
Chapter 5, Working with Ceph Object Storage Multi-Site V2, helps you take a deep dive into the new multi-site v2 while configuring two Ceph clusters to mirror objects between them as an object disaster recovery solution.
Chapter 6, Working with the Ceph Filesystem, covers an introduction to CephFS, deploying MDS, and accessing CephFS via the kernel driver, FUSE, and NFS-Ganesha.
Chapter 7, Monitoring Ceph Clusters, covers classic ways of monitoring Ceph via the Ceph command-line tools. You will also be introduced to Ceph Metrics and Grafana, and learn how to configure Ceph Metrics to monitor a Ceph cluster.
Chapter 8, Operating and Managing a Ceph Cluster, covers Ceph service management with systemd, and scaling up and scaling down a Ceph cluster. This chapter also includes failed disk replacement and upgrading Ceph infrastructures.
Chapter 9, Ceph under the Hood, explores the Ceph CRUSH map, the internals of the CRUSH map, and CRUSH tunables, followed by Ceph authentication and authorization. This chapter also covers dynamic cluster management and understanding Ceph placement groups. Finally, we cover creating Ceph pools on specific OSDs.
Chapter 10, Production Planning and Performance Tuning for Ceph, covers planning a production cluster deployment and hardware and software planning for Ceph. This chapter also includes Ceph recommendations and performance tuning. Finally, it covers erasure coding and cache tiering.
Chapter 11, The Virtual Storage Manager for Ceph, introduces Virtual Storage Manager (VSM), covering its architecture. We will also go through the deployment of VSM, then create a Ceph cluster and use VSM to manage it.
Chapter 12, More on Ceph, covers Ceph benchmarking and Ceph troubleshooting using the admin socket, the REST API, and the ceph-objectstore-tool. This chapter also covers deploying Ceph using Ansible and Ceph memory profiling. Furthermore, it covers health-checking your Ceph cluster using ceph-medic and the new experimental backend, Ceph BlueStore.
Chapter 13, An Introduction to Troubleshooting Ceph, covers troubleshooting common issues seen in Ceph clusters, detailing methods to troubleshoot each component. This chapter also covers what to look for to determine where an issue lies in the cluster and what its possible cause could be.
Chapter 14, Upgrading Your Ceph Cluster from Hammer to Jewel, covers upgrading the core components in your Ceph cluster from the Hammer release to the Jewel release.
The various software components required to follow the instructions in the chapters are as follows:
VirtualBox 4.0 or higher (https://www.virtualbox.org/wiki/Downloads)
GIT (http://www.git-scm.com/downloads)
Vagrant 1.5.0 or higher (https://www.vagrantup.com/downloads.html)
CentOS operating system 7.0 or higher (http://wiki.centos.org/Download)
Ceph Jewel packages, version 10.2.0 or higher (http://ceph.com/resources/downloads/)
S3 Client, typically S3cmd (http://s3tools.org/download)
Python-swift client
NFS Ganesha
Ceph Fuse
CephMetrics (https://github.com/ceph/cephmetrics)
Ceph-Medic (https://github.com/ceph/ceph-medic)
Virtual Storage Manager 2.0 or higher (https://github.com/01org/virtual-storage-manager/releases/tag/v2.1.0)
Ceph-Ansible (https://github.com/ceph/ceph-ansible)
OpenStack RDO (http://rdo.fedorapeople.org/rdo-release.rpm)
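Before working through the recipes, you may want to confirm that the core tools from the list above are installed on your workstation. The following is a minimal sketch, not an official setup script; it only checks that the commands are on the PATH, so you should still compare the reported versions against the list above:

```shell
# Report whether the core tools from the list above are on the PATH.
# (Presence check only; verify versions separately, e.g. VBoxManage --version.)
for tool in git vagrant VBoxManage; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found ($(command -v "$tool"))"
  else
    echo "$tool: not found"
  fi
done
```

A tool reported as not found should be installed from the corresponding download link above before continuing.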
This book is aimed at storage and cloud system engineers, system administrators, and technical architects and consultants who are interested in building software-defined storage solutions around Ceph to power their cloud and virtual infrastructure. If you have a basic knowledge of GNU/Linux and storage systems, with no experience of software-defined storage solutions and Ceph, but are eager to learn, this book is for you.
In this book, you will find several headings that appear frequently (Getting ready, How to do it…, How it works…, There's more…, and See also). To give clear instructions on how to complete a recipe, we use these sections as follows:
This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.
This section contains the steps required to follow the recipe.
This section usually consists of a detailed explanation of what happened in the previous section.
This section consists of additional information about the recipe in order to make the reader more knowledgeable about the recipe.
This section provides helpful links to other useful information for the recipe.
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Verify the installrc file".
A block of code is set as follows:
AGENT_ADDRESS_LIST="192.168.123.101 192.168.123.102 192.168.123.103"
CONTROLLER_ADDRESS="192.168.123.100"
Any command-line input or output is written as follows:
# VBoxManage --version
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Select System info from the Administration panel."
Feedback from our readers is always welcome. Let us know what you think about this book, what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of. To send us general feedback, simply e-mail [email protected], and mention the book's title in the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. You can download the code files by following these steps:
Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book from.
Click on Code Download.
You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account. Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Ceph-Cookbook-Second-Edition. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/CephCookbookSecondEdition_ColorImages.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books, maybe a mistake in the text or the code, we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title. To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at [email protected] with a link to the suspected pirated material. We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at [email protected], and we will do our best to address the problem.
In this chapter, we will cover the following recipes:
Ceph – the beginning of a new era
RAID – the end of an era
Ceph – the architectural overview
Planning a Ceph deployment
Setting up a virtual infrastructure
Installing and configuring Ceph
Scaling up your Ceph cluster
Using Ceph clusters with a hands-on approach
Ceph is currently the hottest software-defined storage (SDS) technology and is shaking up the entire storage industry. It is an open source project that provides unified software-defined solutions for block, file, and object storage. The core idea of Ceph is to provide a distributed storage system that is massively scalable and high-performing with no single point of failure. From the ground up, it has been designed to be highly scalable (up to the exabyte level and beyond) while running on general-purpose commodity hardware.
Ceph is gaining significant traction in the storage industry due to its open, scalable, and reliable nature. This is the era of cloud computing and software-defined infrastructure, where we need a storage backend that is purely software-defined and, more importantly, cloud-ready. Ceph fits in here very well, regardless of whether you are running a public, private, or hybrid cloud.
Today's software systems are very smart and make the best use of commodity hardware to run gigantic-scale infrastructures. Ceph is one of them; it intelligently uses commodity hardware to provide enterprise-grade robust and highly reliable storage systems.
Ceph has been raised and nourished with the help of the Ceph upstream community with an architectural philosophy that includes the following:
Every component must scale linearly
There should not be any single point of failure
The solution must be software-based, open source, and adaptable
The Ceph software should run on readily available commodity hardware
Every component must be self-managing and self-healing wherever possible
The foundation of Ceph lies in objects, which are its building blocks. Object storage such as Ceph is well suited to the current and future needs of unstructured data storage. Object storage has advantages over traditional storage solutions; we can achieve platform and hardware independence using object storage. Ceph works meticulously with objects and replicates them across the cluster to ensure reliability; in Ceph, objects are not tied to a physical path, making their location independent. This flexibility enables Ceph to scale linearly from the petabyte to the exabyte level.
Ceph provides great performance, enormous scalability, power, and flexibility to organizations. It helps them get rid of expensive proprietary storage silos. Ceph is indeed an enterprise-class storage solution that runs on commodity hardware; it is a low-cost yet feature-rich storage system. Ceph's universal storage system provides block, file, and object storage under one hood, enabling customers to use storage as they want.
In the following section, we will learn about Ceph releases.
Ceph is being developed and improved at a rapid pace. On July 3, 2012, Sage Weil announced the first LTS release of Ceph, code-named Argonaut. Since then, we have seen 12 new releases. Ceph releases are categorized as Long Term Support (LTS) or stable releases, and every alternate Ceph release is an LTS release. For more information, visit https://ceph.com/category/releases/.
Ceph release name   Ceph release version   Released on
Argonaut            V0.48 (LTS)            July 3, 2012
Bobtail             V0.56 (LTS)            January 1, 2013
Cuttlefish          V0.61                  May 7, 2013
Dumpling            V0.67 (LTS)            August 14, 2013
Emperor             V0.72                  November 9, 2013
Firefly             V0.80 (LTS)            May 7, 2014
Giant               V0.87.1                February 26, 2015
Hammer              V0.94 (LTS)            April 7, 2015
Infernalis          V9.0.0                 May 5, 2015
Jewel               V10.0.0 (LTS)          November 2015
Kraken              V11.0.0                June 2016
Luminous            V12.0.0 (LTS)          February 2017
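The LTS-versus-stable cadence in the table above can be captured in a small lookup. The following Python sketch is purely illustrative, transcribed from the table's data; the `RELEASES` structure and `is_lts` helper are inventions for this book, not part of any Ceph API:

```python
# Illustrative only: Ceph release data transcribed from the table above.
# This dict and helper are a sketch for this book, not a Ceph library API.
RELEASES = {
    "argonaut":   ("0.48",  True),   # first LTS release, July 2012
    "bobtail":    ("0.56",  True),
    "cuttlefish": ("0.61",  False),
    "dumpling":   ("0.67",  True),
    "emperor":    ("0.72",  False),
    "firefly":    ("0.80",  True),
    "giant":      ("0.87",  False),
    "hammer":     ("0.94",  True),
    "infernalis": ("9.0",   False),
    "jewel":      ("10.0",  True),
    "kraken":     ("11.0",  False),
    "luminous":   ("12.0",  True),
}

def is_lts(name):
    """Return True if the named release is a Long Term Support release."""
    version, lts = RELEASES[name.lower()]
    return lts

print(is_lts("Jewel"))   # True: Jewel is an LTS release
print(is_lts("Kraken"))  # False: Kraken is a stable (non-LTS) release
```

Knowing whether your cluster runs an LTS release matters in production, since LTS releases receive backported fixes for longer than stable releases do.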
Data storage requirements have grown explosively over the last few years. Research shows that data in large organizations is growing at a rate of 40 to 60 percent annually, and many companies are doubling their data footprint each year. IDC analysts estimated that there were 54.4 exabytes of total digital data worldwide in the year 2000. By 2007, this had reached 295 exabytes, and by 2020, it is expected to reach 44 zettabytes worldwide. Such data growth cannot be managed by traditional storage systems; we need a system such as Ceph, which is distributed, scalable, and, most importantly, economically viable. Ceph has been designed specifically to handle today's as well as tomorrow's data storage needs.
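To put those growth rates in perspective, a quick compound-growth calculation helps. This is a back-of-the-envelope sketch using the figures quoted above; the `grow` function is invented for illustration:

```python
# Back-of-the-envelope: compound annual data growth.
# A 40-60% yearly growth rate compounds surprisingly quickly.
def grow(start, rate, years):
    """Project a storage footprint after compounding `rate` growth for `years`."""
    return start * (1 + rate) ** years

# 1 PB growing at 60% per year for 5 years:
print(round(grow(1.0, 0.60, 5), 1))  # -> 10.5 (PB), roughly a 10x increase

# Doubling every year (100% growth) for 5 years:
print(round(grow(1.0, 1.00, 5), 1))  # -> 32.0 (PB)
```

A storage system that cannot scale out by an order of magnitude within a hardware refresh cycle will therefore struggle to keep up, which is exactly the pressure that motivates designs like Ceph.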
SDS is what is needed to reduce TCO for your storage infrastructure. In addition to reduced storage cost, SDS can offer flexibility, scalability, and reliability. Ceph is a true SDS solution; it runs on commodity hardware with no vendor lock-in and provides low cost per GB. Unlike traditional storage systems, where hardware gets married to software, in SDS, you are free to choose commodity hardware from any manufacturer and are free to design a heterogeneous hardware solution for your own needs. Ceph's software-defined storage on top of this hardware provides all the intelligence you need and will take care of everything, providing all the enterprise storage features right from the software layer.
One of the main challenges of a cloud infrastructure is storage. Every cloud infrastructure needs a storage system that is reliable, low-cost, and scalable, with tighter integration than its other cloud components. Many traditional storage solutions on the market claim to be cloud-ready, but today we need more than cloud readiness alone. We need a storage system that is fully integrated with cloud systems and can provide a lower TCO without compromising reliability and scalability. Cloud systems are software-defined and built on top of commodity hardware; similarly, they need a storage system that follows the same methodology of being software-defined on top of commodity hardware, and Ceph is the best choice available for cloud use cases.
Ceph has been rapidly evolving and bridging the gap as a true cloud storage backend. It is grabbing center stage with every major open source cloud platform, namely OpenStack, CloudStack, and OpenNebula. Moreover, Ceph has succeeded in building up beneficial partnerships with cloud vendors such as Red Hat, Canonical, Mirantis, SUSE, and many more. These companies are backing Ceph strongly and including it as an official storage backend for their OpenStack cloud distributions, thus making Ceph a red-hot technology in the cloud storage space.
The OpenStack project is one of the finest examples of open source software powering public and private clouds. It has proven itself as an end-to-end open source cloud solution. OpenStack includes components such as Cinder, Glance, and Swift, which provide storage capabilities to OpenStack. These components require a reliable, scalable, all-in-one storage backend such as Ceph. For this reason, the OpenStack and Ceph communities have been working together for many years to develop a fully compatible Ceph storage backend for OpenStack.
Cloud infrastructure based on Ceph provides much-needed flexibility to service providers to build Storage-as-a-Service and Infrastructure-as-a-Service solutions, which they cannot achieve from other traditional enterprise storage solutions as they are not designed to fulfill cloud needs. Using Ceph, service providers can offer low-cost, reliable cloud storage to their customers.
The definition of unified storage has changed lately. A few years ago, the term unified storage referred to providing file and block storage from a single system. Now, because of recent technological advancements such as cloud computing, big data, and the Internet of Things, a new kind of storage has been evolving: object storage. Thus, storage systems that do not support object storage are not really unified storage solutions. A true unified storage system is like Ceph; it supports block, file, and object storage from a single system.
In Ceph, the term unified storage is more meaningful than what existing storage vendors claim to provide. It has been designed from the ground up to be future-ready, and it is constructed such that it can handle enormous amounts of data. When we call Ceph future-ready, we mean to focus on its object storage capabilities, which are a better fit for today's mix of unstructured data than blocks or files. Everything in Ceph relies on intelligent objects, whether it is block storage or file storage. Rather than managing blocks and files underneath, Ceph manages objects and supports block- and file-based storage on top of them. Objects provide enormous scaling with increased performance by eliminating metadata operations. Ceph uses the CRUSH (Controlled Replication Under Scalable Hashing) algorithm to dynamically compute where an object should be stored and retrieved from.
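The idea of computing, rather than looking up, an object's location can be sketched with a toy hash-based placement function. To be clear, this is not CRUSH itself, just a minimal illustration of deterministic placement; the `place` function and the OSD list are invented for this sketch:

```python
import hashlib

# Toy illustration of deterministic object placement (NOT the real CRUSH
# algorithm): every client computes the same target OSDs from an object's
# name, so no central metadata lookup is needed to locate the object.
OSDS = ["osd.0", "osd.1", "osd.2", "osd.3"]  # hypothetical storage daemons

def place(object_name, osds=OSDS, replicas=2):
    """Deterministically map an object to `replicas` distinct OSDs."""
    digest = int(hashlib.md5(object_name.encode()).hexdigest(), 16)
    primary = digest % len(osds)
    # Replicas land on the following OSDs, wrapping around the list.
    return [osds[(primary + i) % len(osds)] for i in range(replicas)]

# Any client running this code computes the same placement:
print(place("my-object"))
print(place("my-object") == place("my-object"))  # True: no lookup needed
```

The real CRUSH algorithm is far more sophisticated (it is aware of failure domains, device weights, and placement rules), but the essential property is the same: location is a pure function of the data's name and the cluster map, which is what removes the metadata bottleneck.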
