Ceph Cookbook

Karan Singh
Description

Over 100 effective recipes to help you design, implement, and manage the software-defined and massively scalable Ceph storage system

About This Book

  • Implement a Ceph cluster successfully and gain deep insights into its best practices
  • Harness the abilities of experienced storage administrators and architects, and run your own software-defined storage system
  • This comprehensive, step-by-step guide will show you how to build and manage Ceph storage in a production environment

Who This Book Is For

This book is aimed at storage and cloud system engineers, system administrators, and technical architects who are interested in building software-defined storage solutions to power their cloud and virtual infrastructure. If you have a basic knowledge of GNU/Linux and storage systems, with no experience of software-defined storage solutions and Ceph, but are eager to learn, this book is for you.

What You Will Learn

  • Understand, install, configure, and manage the Ceph storage system
  • Get to grips with performance tuning and benchmarking, and gain practical tips to run Ceph in production
  • Integrate Ceph with OpenStack Cinder, Glance, and Nova components
  • Deep dive into Ceph object storage, including S3, Swift, and Keystone integration
  • Build a Dropbox-like file sync and share service and Ceph federated gateway setup
  • Gain hands-on experience with Calamari and VSM for cluster monitoring
  • Familiarize yourself with Ceph operations such as maintenance, monitoring, and troubleshooting
  • Understand advanced topics including erasure coding, CRUSH map, cache pool, and system maintenance

In Detail

Ceph is a unified, distributed storage system designed for excellent performance, reliability, and scalability. This cutting-edge technology has been transforming the storage industry, and is rapidly evolving as a leader in the software-defined storage space, extending full support to cloud platforms such as OpenStack and CloudStack, as well as to virtualization platforms. It is the most popular storage backend for OpenStack and for public and private clouds, making it a first choice as a storage solution. Ceph is backed by Red Hat and is developed by a thriving open source community of individual developers as well as several companies across the globe.

This book takes you from a basic knowledge of Ceph to an expert understanding of the most advanced features, walking you through building up a production-grade Ceph storage cluster and helping you develop all the skills you need to plan, deploy, and effectively manage your Ceph cluster. Beginning with the basics, you'll create a Ceph cluster, followed by block, object, and file storage provisioning. Next, you'll get a step-by-step tutorial on integrating it with OpenStack and building a Dropbox-like object storage solution. We'll also take a look at federated architecture and CephFS, and you'll dive into Calamari and VSM for monitoring the Ceph environment. You'll develop expert knowledge on troubleshooting and benchmarking your Ceph storage cluster. Finally, you'll get to grips with the best practices to operate Ceph in a production environment.

Style and approach

This step-by-step guide is filled with practical tutorials, making complex scenarios easy to understand.

Table of Contents

Ceph Cookbook
Credits
Foreword
About the Author
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why Subscribe?
Preface
What this book covers
What you need for this book
Who this book is for
Sections
Getting ready
How to do it…
How it works…
There's more…
See also
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Ceph – Introduction and Beyond
Introduction
Ceph Releases
Ceph – the beginning of a new era
Software Defined Storage (SDS)
Cloud storage
Unified next generation storage architecture
RAID – the end of an era
RAID rebuilds are painful
RAID spare disks increases TCO
RAID can be expensive and hardware dependent
The growing RAID group is a challenge
The RAID reliability model is no longer promising
Ceph – the architectural overview
Planning the Ceph deployment
Setting up a virtual infrastructure
Getting ready
How to do it…
Installing and configuring Ceph
Creating Ceph cluster on ceph-node1
How to do it…
Scaling up your Ceph cluster
How to do it…
Using Ceph cluster with a hands-on approach
How to do it…
2. Working with Ceph Block Device
Introduction
Working with Ceph Block Device
Configuring Ceph client
How to do it…
Creating Ceph Block Device
How to do it…
Mapping Ceph Block Device
How to do it…
Ceph RBD resizing
How to do it…
Working with RBD snapshots
How to do it…
Working with RBD Clones
How to do it…
A quick look at OpenStack
Ceph – the best match for OpenStack
Setting up OpenStack
How to do it…
Configuring OpenStack as Ceph clients
How to do it…
Configuring Glance for Ceph backend
How to do it…
Configuring Cinder for Ceph backend
How to do it…
Configuring Nova to attach Ceph RBD
How to do it…
Configuring Nova to boot instances from Ceph RBD
How to do it…
3. Working with Ceph Object Storage
Introduction
Understanding Ceph object storage
RADOS Gateway standard setup, installation, and configuration
Setting up the RADOS Gateway node
How to do it…
Installing the RADOS Gateway
How to do it…
Configuring RADOS Gateway
How to do it…
Creating the radosgw user
How to do it…
See also…
Accessing Ceph object storage using S3 API
How to do it…
Configuring DNS
Configuring the s3cmd client
Accessing Ceph object storage using the Swift API
How to do it
See also…
Integrating RADOS Gateway with OpenStack Keystone
How to do it…
Configuring Ceph federated gateways
How to do it…
Testing the radosgw federated configuration
How to do it…
Building file sync and share service using RGW
Getting ready…
How to do it…
See also…
4. Working with the Ceph Filesystem
Introduction
Understanding Ceph Filesystem and MDS
Deploying Ceph MDS
How to do it…
Accessing CephFS via kernel driver
How to do it…
Accessing CephFS via FUSE client
How to do it…
Exporting Ceph Filesystem as NFS
How to do it…
ceph-dokan – CephFS for Windows clients
How to do it…
CephFS a drop-in replacement for HDFS
5. Monitoring Ceph Clusters using Calamari
Introduction
Ceph cluster monitoring – the classic way
Monitoring Ceph clusters
How to do it…
Checking the cluster's health
Monitoring cluster events
The cluster utilization statistics
Checking the cluster's status
The cluster authentication entries
Monitoring Ceph MON
How to do it…
Checking the MON status
Checking the MON quorum status
Monitoring Ceph OSDs
How to do it…
OSD tree view
OSD statistics
Checking the crush map
Monitoring PGs
Monitoring Ceph MDS
How to do it…
Introducing Ceph Calamari
Building Calamari server packages
How to do it…
Building Calamari client packages
How to do it…
Setting up Calamari master server
How to do it…
Adding Ceph nodes to Calamari
How to do it…
Monitoring Ceph clusters from the Calamari dashboard
Troubleshooting Calamari
How to do it…
6. Operating and Managing a Ceph Cluster
Introduction
Understanding Ceph service management
Managing the cluster configuration file
How to do it…
Adding monitor nodes to the Ceph configuration file
Adding an MDS node to the Ceph configuration file
Adding OSD nodes to the Ceph configuration file
Running Ceph with SYSVINIT
Starting and stopping all daemons
How to do it…
Starting and stopping all daemons by type
How to do it…
Starting daemons by type
Stopping daemons by type
Starting and stopping a specific daemon
How to do it…
Starting a specific daemon
Stopping a specific daemon
Running Ceph as a service
Starting and stopping all daemons
How to do it…
Starting and stopping all daemons by type
How to do it…
Starting daemons by type
Stopping daemons by type
Starting and stopping a specific daemon
How to do it…
Starting a specific daemon
Stopping a specific daemon
Scale-up versus scale-out
Scaling out your Ceph cluster
Adding the Ceph OSD
How to do it…
Adding the Ceph MON
How to do it...
Adding the Ceph RGW
Scaling down your Ceph cluster
Removing the Ceph OSD
How to do it…
Removing Ceph MON
How to do it…
Replacing a failed disk in the Ceph cluster
How to do it…
Upgrading your Ceph cluster
How to do it…
Maintaining a Ceph cluster
How to do it…
How it works…
7. Ceph under the Hood
Introduction
Ceph scalability and high availability
Understanding the CRUSH mechanism
CRUSH map internals
How to do it…
How it works…
Ceph cluster map
High availability monitors
Ceph authentication and authorization
Ceph authentication
Ceph authorization
How to do it…
Ceph dynamic cluster management
Ceph placement group
How to do it…
Placement group states
Creating Ceph pools on specific OSDs
How to do it…
8. Production Planning and Performance Tuning for Ceph
Introduction
The dynamics of capacity, performance, and cost
Choosing the hardware and software components for Ceph
Processor
Memory
Network
Disk
Ceph OSD Journal partition
Ceph OSD Data partition
Operating System
OSD Filesystem
Ceph recommendation and performance tuning
Global cluster tuning
Monitor tuning
OSD tuning
OSD General Settings
OSD Journal settings
OSD Filestore settings
OSD Recovery settings
OSD Backfilling settings
OSD scrubbing settings
Client tuning
Operating System tuning
Ceph erasure coding
Erasure code plugin
Creating an erasure coded pool
How to do it…
Ceph cache tiering
Writeback mode
Read-only mode
Creating a pool for cache tiering
How to do it…
See also…
Creating a cache tier
How to do it…
Configuring a cache tier
How to do it…
Testing a cache tier
How to do it…
9. The Virtual Storage Manager for Ceph
Introduction
Understanding the VSM architecture
The VSM Controller
The VSM Agent
Setting up the VSM environment
How to do it…
Getting ready for VSM
How to do it…
Installing VSM
How to do it…
Creating a Ceph cluster using VSM
How to do it…
Exploring the VSM dashboard
Upgrading the Ceph cluster using VSM
How to do it…
VSM roadmap
VSM resources
10. More on Ceph
Introduction
Benchmarking the Ceph cluster
Disk performance baseline
Single disk write performance
How to do it…
Multiple disk write performance
How to do it…
Single disk read performance
How to do it…
Multiple disk read performance
How to do it…
Results
Baseline network performance
How to do it…
See also…
Ceph RADOS bench
How to do it…
How it works…
RADOS load-gen
How to do it…
How it works…
There's more…
Benchmarking the Ceph block device
Ceph rbd bench-write
How to do it…
How it works…
There's more…
See also…
Benchmarking Ceph RBD using FIO
How to do it…
See also…
Ceph admin socket
How to do it…
Using the ceph tell command
How to do it…
How it works…
Ceph REST API
How to do it…
Profiling Ceph memory
How to do it…
Deploying Ceph using Ansible
Getting ready
How to do it…
There's more…
The ceph-objectstore tool
How to do it…
How it works…
Index

Ceph Cookbook

Copyright © 2016 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: February 2016

Production reference: 1250216

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78439-350-2

www.packtpub.com

Credits

Author

Karan Singh

Reviewers

Christian Eichelmann

Haruka Iwao

Commissioning Editor

Amarabha Banerjee

Acquisition Editor

Meeta Rajani

Content Development Editor

Kajal Thapar

Technical Editor

Menza Mathew

Copy Editor

Angad Singh

Project Coordinator

Shweta H Birwatkar

Proofreader

Safis Editing

Indexer

Rekha Nair

Production Coordinator

Melwyn Dsa

Cover Work

Melwyn Dsa

Foreword

One year ago, Karan published his first book, Learning Ceph, Packt Publishing, which has been a great success. It addressed a need that a lot of users had: an easy-to-understand introduction to Ceph and an overview of its architecture.

When an open source project has an enthusiastic community like Ceph does, the innovation and evolution of features happen at a rapid pace. Besides the core development team around Sage Weil at Red Hat, industry heavyweights such as Intel, SanDisk, Fujitsu, and SUSE, as well as countless other individuals, have made substantial contributions. As a result, the project continues to mature both in capability and stability; the latter playing a key role in enterprise deployments. Many features and components that are now a part of Ceph were only in their infancy when Learning Ceph, Packt Publishing, came out; erasure coding, optimized performance for SSDs, and the Virtual Storage Manager (VSM) are just a few examples. All of these are covered in great detail in this new book that you are holding in your hands right now.

The other day, I read a blog where the author likened the significance of Ceph to the storage industry to the impact that Linux had on operating systems. While it is still too early to make that call, its adoption in the industry speaks for itself, with multi-petabyte-sized deployments becoming more and more common. Large-scale users such as CERN and Yahoo are regularly sharing their experiences with the community.

The wealth of capabilities and the enormous flexibility to adapt to a wide range of use cases can sometimes make it difficult to approach this new technology, and it can leave new users wondering where to start their learning journeys. Not everybody has access to massive data centers with thousands of servers and disks to experiment and build their own experiences. Karan's new book, Ceph Cookbook, Packt Publishing, is meant to help by providing practical, hands-on advice for the many challenges you will encounter.

As a long-time Ceph enthusiast, I have worked with Karan for several years and congratulate him on his passion and initiative to compile a comprehensive guide for first-time users of Ceph. It will be a useful guide to those embarking on deploying the open source community version of Ceph.

This book complements the more technical documentation and collateral developed by members of the Ceph community, filling in the gaps with useful commentary and advice for new users.

If you are downloading the Ceph community version, kicking its tires, and trying it out at home or on your non-mission-critical workloads in the enterprise, this book is for you. Expect to learn to deploy and manage Ceph step by step along with tips and use cases for deploying Ceph's features and functionality on certain storage workloads.

Now, it's time to begin reading about the ingredients you'll need to cook up your own Ceph software-defined storage deployment. But hurry: exciting new features, such as production-ready CephFS and support for containers, are already in the pipeline, and I am looking forward to seeing Karan's next book a year from now.

Dr. Wolfgang Schulze

Director of Global Storage Consulting, Red Hat

About the Author

Karan Singh is an IT expert and tech evangelist, living with his beautiful wife, Monika, in Finland. He holds a bachelor's degree, with honors, in computer science, and a master's degree in system engineering from BITS, Pilani. Apart from this, he is a certified professional for technologies such as OpenStack, NetApp, Oracle Solaris, and Linux.

Karan is currently working as a Storage and Cloud System Specialist at CSC – IT Center for Science Ltd., focusing all his energies on developing IaaS cloud solutions based on OpenStack and Ceph, and on building economical multi-petabyte storage systems using Ceph.

Karan possesses a rich skill set and strong work experience in a variety of storage solutions, cloud technologies, automation tools, and Unix systems. He is also the author of the very first book on Ceph, titled Learning Ceph, published in 2015.

Karan devotes a part of his time to R&D and to learning new technologies. When not working on Ceph and OpenStack, Karan can be found working with emerging technologies or automating things. He loves writing about technologies and is an avid blogger at www.ksingh.co.in. You can reach him on Twitter @karansingh010, or by e-mail at <[email protected]>.

I'd like to thank my wife, Monika, for preparing delicious food while I was writing this book. Kiitos MJ, you are a great chef, Minä rakastan sinua.

I would like to take this opportunity to thank my company, CSC – IT Center for Science Ltd., and all my colleagues with whom I have worked and made memories. CSC, you are an amazing place to work, kiitos.

I'd also like to express my thanks to the vibrant Ceph community and its ecosystem for developing, improving, and supporting Ceph.

Finally, my sincere thanks to the entire Packt Publishing team, and also to the technical reviewers, for their state-of-the-art work during the course of this project.

About the Reviewers

Christian Eichelmann has worked as a system engineer and an IT architect at a number of different companies in Germany for several years. He has been using Ceph since its early alpha releases and is currently running several petabyte-scale clusters. He also developed ceph-dash, a popular monitoring dashboard for Ceph.

Haruka Iwao is an ads solutions engineer with Google. She worked as a storage solutions architect at Red Hat and has contributed to the Ceph community, especially in Japan. She also has work experience as a site reliability engineer at a few start-ups in Tokyo, and she is interested in site reliability engineering and large-scale computing. She studied distributed filesystems in her master's course at the University of Tsukuba.

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why Subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Preface

We are a part of a digital world that is producing an enormous amount of data every second. Data growth is unimaginable, and it's predicted that humankind will possess 40 zettabytes of data by 2020. Well, that's not too much, but how about 2050? Should we guesstimate a yottabyte? The obvious question arises: do we have any way to store this gigantic amount of data, or are we prepared for the future? To me, Ceph is the ray of hope and the technology that can be a possible answer to the data storage needs of the next decade. Ceph is the future of storage.

There's a great saying: "Software is eating the world." Well, that's true. Seen from another angle, software is the feasible way forward for various computing needs, such as computing weather, networking, storage, datacenters, and burgers, ummm… well, not burgers just yet. As you already know, the idea behind a software-defined solution is to build all the intelligence into the software itself and use commodity hardware to solve your greatest problems. And I think this software-defined approach should be the answer to the future's computing problems.

Ceph is a true open source, software-defined storage solution, purpose-built to handle unprecedented data growth with linear performance improvement. It provides a unified storage experience through file, object, and block storage interfaces from the same system. The beauty of Ceph is its distributed, scalable nature and its performance; reliability and robustness come along with these attributes. Furthermore, it is pocket-friendly, that is, economical, giving you more value for every dollar you spend.

Ceph is the next big thing to happen to the storage industry. Its enterprise-class features, such as scalability, reliability, erasure coding, and cache tiering (and counting), have matured significantly in the last few years. Organizations such as CERN, Yahoo, and DreamHost, to name a few, have deployed multi-PB Ceph clusters that are running successfully.

The block and object interfaces of Ceph were introduced a while ago and are now fully developed. Until last year, CephFS was the only component lacking production readiness. This year, my bet is on CephFS, as it's going to be production-ready in Ceph Jewel. I can't wait to see CephFS production adoption stories. There are a few more areas where Ceph is gaining popularity, such as all-flash arrays (AFA), database workloads, storage for containers, and hyperconverged infrastructure. Well, Ceph has just begun; the best is yet to come.

In this book, we will take a deep dive to understand Ceph, covering its components and architecture, including how it works. The Ceph Cookbook focuses on hands-on knowledge, providing you with step-by-step guidance in the form of recipes. Right from the first chapter, you will gain practical experience of Ceph by following the recipes. With each chapter, you will learn about and play around with interesting concepts of Ceph. I hope that, by the end of this book, you will feel competent with Ceph, both conceptually and practically, and that you will be able to operate your Ceph storage infrastructure with confidence and success.

Happy Learning

Karan Singh

What this book covers

Chapter 1, Ceph – Introduction and Beyond, covers an introduction to Ceph, gradually moving on to RAID and its challenges, followed by an architectural overview of Ceph. Finally, we will go through Ceph installation and configuration.

Chapter 2, Working with Ceph Block Device, covers an introduction to the Ceph Block Device and the provisioning of Ceph block devices. We will also go through RBD snapshots and clones, as well as storage options for OpenStack Cinder, Glance, and Nova.

Chapter 3, Working with Ceph Object Storage, deep dives into Ceph object storage, including RGW standard and federated setups, and S3 and OpenStack Swift access. Finally, we will set up a file sync and share service using ownCloud.

Chapter 4, Working with the Ceph Filesystem, covers an introduction to CephFS, deploying MDS, and accessing CephFS via the kernel driver, FUSE, and NFS-Ganesha. You will also learn how to access CephFS via the ceph-dokan Windows client.

Chapter 5, Monitoring Ceph Clusters using Calamari, includes Ceph monitoring via the CLI, an introduction to Calamari, and the setting up of the Calamari server and clients. We will also cover monitoring a Ceph cluster via the Calamari GUI, as well as troubleshooting Calamari.

Chapter 6, Operating and Managing a Ceph Cluster, covers Ceph service management and scaling up and scaling down a Ceph cluster. This chapter also includes failed disk replacement and upgrading Ceph infrastructure.

Chapter 7, Ceph under the Hood, explores the Ceph CRUSH map and the internals of CRUSH, followed by Ceph authentication and authorization. This chapter also covers dynamic cluster management and an understanding of Ceph PGs. Finally, we will create Ceph pools on specific OSDs.

Chapter 8, Production Planning and Performance Tuning for Ceph, covers the planning of a production cluster deployment and hardware and software planning for Ceph. This chapter also includes Ceph recommendations and performance tuning. Finally, it covers erasure coding and cache tiering.

Chapter 9, The Virtual Storage Manager for Ceph, is dedicated to the Virtual Storage Manager (VSM), covering its introduction and architecture. We will also go through the deployment of VSM, then create a Ceph cluster using VSM and manage it.

Chapter 10, More on Ceph, the final chapter of the book, covers Ceph benchmarking, as well as Ceph troubleshooting using the admin socket, the REST API, and the ceph-objectstore tool. This chapter also covers the deployment of Ceph using Ansible, and Ceph memory profiling.

What you need for this book

The various software components required to follow the instructions in the chapters are as follows:

  • VirtualBox 4.0 or higher (https://www.virtualbox.org/wiki/Downloads)
  • Git (http://www.git-scm.com/downloads)
  • Vagrant 1.5.0 or higher (https://www.vagrantup.com/downloads.html)
  • CentOS operating system 7.0 or higher (http://wiki.centos.org/Download)
  • Ceph software packages version 0.87.0 or higher (http://ceph.com/resources/downloads/)
  • S3 client, typically s3cmd (http://s3tools.org/download)
  • Python-swift client
  • ownCloud 7.0.5 or higher (https://download.owncloud.org/download/repositories/stable/owncloud/)
  • NFS Ganesha
  • Ceph FUSE
  • ceph-dokan
  • Ceph Calamari (https://github.com/ceph/calamari.git)
  • Diamond (https://github.com/ceph/Diamond.git)
  • Ceph Calamari client, romana (https://github.com/ceph/romana)
  • Virtual Storage Manager 2.0 or higher (https://github.com/01org/virtual-storage-manager/releases/tag/v2.1.0)
  • Ansible 1.9 or higher (http://docs.ansible.com/ansible/intro_installation.html)
  • OpenStack RDO (http://rdo.fedorapeople.org/rdo-release.rpm)

Who this book is for

This book is aimed at storage and cloud system engineers, system administrators, and technical architects and consultants who are interested in building software-defined storage solutions around Ceph to power their cloud and virtual infrastructure. If you have a basic knowledge of GNU/Linux and storage systems, with no experience of software-defined storage solutions and Ceph, but are eager to learn, this book is for you.

Sections

In this book, you will find several headings that appear frequently (Getting ready, How to do it, How it works, There's more, and See also).

To give clear instructions on how to complete a recipe, we use these sections as follows:

Getting ready

This section tells you what to expect in the recipe, and describes how to set up any software or any preliminary settings required for the recipe.

How to do it…

This section contains the steps required to follow the recipe.

How it works…

This section usually consists of a detailed explanation of what happened in the previous section.

There's more…

This section consists of additional information about the recipe in order to make the reader more knowledgeable about the recipe.

See also

This section provides helpful links to other useful information for the recipe.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "To do this, we need to edit /etc/nova/nova.conf on the OpenStack node and perform the steps that are given in the following section."

A block of code is set as follows:

inject_partition=-2
images_type=rbd
images_rbd_pool=vms
images_rbd_ceph_conf=/etc/ceph/ceph.conf

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

inject_partition=-2
images_type=rbd
images_rbd_pool=vms
images_rbd_ceph_conf=/etc/ceph/ceph.conf

Any command-line input or output is written as follows:

# rados -p cache-pool ls

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Navigate to the Options defined in nova.virt.libvirt.volume section and add the following lines of code:"

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. Ceph – Introduction and Beyond

In this chapter, we will cover the following recipes:

  • Ceph – the beginning of a new era
  • RAID – the end of an era
  • Ceph – the architectural overview
  • Planning the Ceph deployment
  • Setting up a virtual infrastructure
  • Installing and configuring Ceph
  • Scaling up your Ceph cluster
  • Using Ceph clusters with a hands-on approach

Introduction

Ceph is currently the hottest Software Defined Storage (SDS) technology, and it is shaking up the entire storage industry. It is an open source project that provides unified software-defined solutions for block, file, and object storage. The core idea of Ceph is to provide a distributed storage system that is massively scalable and high-performing, with no single point of failure. From its roots, it has been designed to be highly scalable (up to the exabyte level and beyond) while running on general-purpose commodity hardware.

Ceph is gaining most of its traction in the storage industry due to its open, scalable, and reliable nature. This is the era of cloud computing and software-defined infrastructure, where we need a storage backend that is purely software-defined and, more importantly, cloud-ready. Ceph fits in here very well, regardless of whether you are running a public, private, or hybrid cloud.

Today's software systems are very smart and make the best use of commodity hardware to run gigantic-scale infrastructure. Ceph is one of them; it intelligently uses commodity hardware to provide robust, enterprise-grade, and highly reliable storage systems.

Ceph has been raised and nourished with an architectural philosophy that includes the following:

  • Every component must scale linearly
  • There should not be any single point of failure
  • The solution must be software-based, open source, and adaptable
  • Ceph software should run on readily available commodity hardware
  • Every component must be self-managing and self-healing wherever possible

The foundation of Ceph lies in objects, which are its building blocks, and object storage like Ceph is a perfect fit for the current and future needs of unstructured data storage. Object storage has advantages over traditional storage solutions: we can achieve platform and hardware independence using it. Ceph works meticulously with objects and replicates them across the cluster to ensure reliability; in Ceph, objects are not tied to a physical path, which makes them location-independent. Such flexibility enables Ceph to scale linearly from the petabyte to the exabyte level.

Ceph provides great performance, enormous scalability, power, and flexibility to organizations. It helps them get rid of expensive proprietary storage silos. Ceph is indeed an enterprise-class storage solution that runs on commodity hardware; it is a low-cost yet feature-rich storage system. The Ceph universal storage system provides block, file, and object storage under one hood, enabling customers to use storage as they want.
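
As a quick taste of this unified model, here is a minimal sketch of how all three interfaces can be exercised against the same cluster using the standard Ceph command-line tools. The pool name, monitor address, and mount point are placeholder assumptions, and each interface is covered in detail in later chapters:

# rbd create test-image --size 1024 --pool rbd    # block: create a 1 GB RBD image
# rados -p rbd put object-1 /etc/hosts            # object: store a file as a RADOS object
# mount -t ceph 192.168.1.101:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret    # file: mount CephFS (requires an MDS)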

Ceph Releases

Ceph is being developed and improved at a rapid pace. On July 3, 2012, Sage Weil announced the first LTS release of Ceph, with the code name Argonaut. Since then, we have seen seven new releases come up. Ceph releases are categorized as LTS (Long Term Support) releases and development releases, and every alternate Ceph release is an LTS release. For more information, please visit https://Ceph.com/category/releases/.

Ceph release name | Ceph release version | Released on
------------------+----------------------+------------------
Argonaut          | v0.48 (LTS)          | July 3, 2012
Bobtail           | v0.56 (LTS)          | January 1, 2013
Cuttlefish        | v0.61                | May 7, 2013
Dumpling          | v0.67 (LTS)          | August 14, 2013
Emperor           | v0.72                | November 9, 2013
Firefly           | v0.80 (LTS)          | May 7, 2014
Giant             | v0.87.1              | February 26, 2015
Hammer            | v0.94 (LTS)          | April 7, 2015
Infernalis        | v9.0.0               | May 5, 2015
Jewel             | v10.0.0              | November 2015
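
To check which release your own nodes are running, you can ask the installed binaries for their version; the version number reported by this standard command maps to a code name in the table above:

# ceph -v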

Tip

Here is a fact: Ceph release names follow alphabetical order; the next one will be a "K" release.

Note

The term "Ceph" is a common nickname given to pet octopuses and is considered a short form of "Cephalopod", which is a class of marine animals that belong to the mollusk phylum. Ceph has octopuses as its mascot, which represents Ceph's highly parallel behavior, similar to octopuses.

Ceph – the beginning of a new era

Data storage requirements have grown explosively over the last few years. Research shows that data in large organizations is growing at a rate of 40 to 60 percent annually, and many companies are doubling their data footprint each year. IDC analysts estimated that, worldwide, there were 54.4 exabytes of total digital data in the year 2000. By 2007, this had reached 295 exabytes, and by 2020, it is expected to reach 44 zettabytes worldwide. Such data growth cannot be managed by traditional storage systems; we need a system like Ceph, which is distributed, scalable, and, most importantly, economically viable. Ceph has been designed especially to handle today's as well as the future's data storage needs.

Software Defined Storage (SDS)

SDS is what is needed to reduce the TCO of your storage infrastructure. In addition to reduced storage costs, SDS offers flexibility, scalability, and reliability. Ceph is a true SDS solution; it runs on commodity hardware with no vendor lock-in and provides a low cost per GB. Unlike traditional storage systems, where hardware is married to software, SDS leaves you free to choose commodity hardware from any manufacturer and to design a heterogeneous hardware solution for your own needs. Ceph's software-defined storage on top of this hardware provides all the intelligence you need and takes care of everything, providing all the enterprise storage features right from the software layer.

Cloud storage

Storage is one of the trickiest components of a cloud infrastructure. Every cloud infrastructure needs a storage system that is reliable, low-cost, and scalable, with tighter integration than its other components. There are many traditional storage solutions on the market that claim to be cloud ready, but today we need not only cloud readiness but a lot more besides. We need a storage system that is fully integrated with cloud systems and that can deliver a lower TCO without any compromise on reliability and scalability. Cloud systems are software-defined and built on top of commodity hardware; similarly, they need a storage system that follows the same methodology, that is, software-defined on top of commodity hardware, and Ceph is the best choice available for cloud use cases.

Ceph has been rapidly evolving and closing the gap to become a true cloud storage backend. It is grabbing center stage with every major open source cloud platform, namely OpenStack, CloudStack, and OpenNebula. Moreover, Ceph has succeeded in building beneficial partnerships with cloud vendors such as Red Hat, Canonical, Mirantis, SUSE, and many more. These companies are favoring Ceph in a big way, including it as an official storage backend for their OpenStack cloud distributions, thus making Ceph a red-hot technology in the cloud storage space.

The OpenStack project is one of the finest examples of open source software powering public and private clouds. It has proven itself as an end-to-end open source cloud solution. OpenStack is a collection of programs, and components such as Cinder, Glance, and Swift provide its storage capabilities. These OpenStack components require a reliable, scalable, all-in-one storage backend like Ceph. For this reason, the OpenStack and Ceph communities have been working together for many years to develop a fully compatible Ceph storage backend for OpenStack.

Cloud infrastructure based on Ceph provides much-needed flexibility to service providers for building Storage-as-a-Service and Infrastructure-as-a-Service solutions, which they cannot achieve with other traditional enterprise storage solutions, as these are not designed to fulfill cloud needs. By using Ceph, service providers can offer low-cost, reliable cloud storage to their customers.

Unified next generation storage architecture

The definition of unified storage has changed lately. A few years ago, the term "unified storage" referred to providing file and block storage from a single system. Now, because of recent technological advancements such as cloud computing, big data, and the Internet of Things, a new kind of storage has been evolving: object storage. Thus, all storage systems that do not support object storage are not really unified storage solutions. A truly unified storage system is like Ceph; it supports block, file, and object storage from a single system.

In Ceph, the term "unified storage" is more meaningful than what existing storage vendors claim to provide. Ceph has been designed from the ground up to be future-ready, and it is constructed such that it can handle enormous amounts of data. When we call Ceph "future-ready", we mean to focus on its object storage capabilities, which are a better fit for today's mix of unstructured data than blocks or files. Everything in Ceph relies on intelligent objects, whether it's block storage or file storage. Rather than managing blocks and files underneath, Ceph manages objects and supports block- and file-based storage on top of them. Objects provide enormous scaling with increased performance by eliminating metadata operations. Ceph uses an algorithm to dynamically compute where an object should be stored and retrieved from.

The traditional storage architecture of SAN and NAS systems is very limited. Basically, it follows the tradition of controller high availability; that is, if one storage controller fails, data is served from the second controller. But what if the second controller fails at the same time, or even worse, the entire disk shelf fails? In most cases, you will end up losing your data. This kind of storage architecture, which cannot sustain multiple failures, is definitely not what we want today. Another drawback of traditional storage systems is their data storage and access mechanism. They maintain a central lookup table to keep track of metadata, which means that every time a client sends a request for a read or write operation, the storage system first performs a lookup in a huge metadata table, and only after receiving the real data location does it perform the client operation. For a smaller storage system, you might not notice the performance hit, but think of a large storage cluster: you would definitely be bound by performance limits with this approach. It would even restrict your scalability.

Ceph does not follow such a traditional storage architecture; in fact, its architecture has been completely reinvented. Rather than storing and manipulating metadata, Ceph introduces a newer way: the CRUSH algorithm. CRUSH stands for Controlled Replication Under Scalable Hashing. Instead of performing a lookup in a metadata table for every client request, the CRUSH algorithm computes on demand where the data should be written to or read from. By computing metadata, the need to manage a centralized table for it disappears. Modern computers are amazingly fast and can perform a CRUSH lookup very quickly; moreover, this computing load, which is generally modest, can be distributed across cluster nodes, leveraging the power of distributed storage. In addition to this, CRUSH has a unique property: infrastructure awareness. It understands the relationships between the various components of your infrastructure and stores your data in distinct failure zones, such as a disk, node, rack, row, or datacenter room, among others. CRUSH stores all the copies of your data such that the data is available even if a few components fail in a failure zone. It is due to CRUSH that Ceph can handle multiple component failures and provide reliability and durability.
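
You can watch this on-demand computation for yourself: the standard ceph osd map command runs the CRUSH calculation for any pool and object name, with no central table lookup, and prints the placement group the object hashes to along with the set of OSDs that CRUSH selected for it. The pool and object names below are placeholders:

# ceph osd map rbd object-1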

The CRUSH algorithm makes Ceph self-managing and self-healing. In the event of a component failure in a failure zone, CRUSH senses which component has failed and determines its effect on the cluster. Without any administrative intervention, CRUSH self-manages and self-heals by performing a recovery operation for the data lost due to the failure. CRUSH regenerates the data from the replica copies that the cluster maintains. If you have configured the Ceph CRUSH map correctly, it makes sure that at least one copy of your data is always accessible. Using CRUSH, we can design a highly reliable storage infrastructure with no single point of failure. It makes Ceph a highly scalable and reliable storage system that is future-ready.
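
As a quick sketch of how you would observe this behavior on a running cluster (monitoring is covered in depth in Chapter 5), these standard commands report cluster health, stream live events including recovery progress, and show which OSDs are up or down in the CRUSH hierarchy:

# ceph health detail
# ceph -w
# ceph osd tree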

RAID – the end of an era

RAID technology has been the fundamental building block for storage systems for years. It has proven successful for almost every kind of data that has been generated in the last 3 decades. But all eras must come