Get to grips with the unified, highly scalable distributed storage system and learn how to design and implement it.
This Learning Path takes you through the basics of Ceph all the way to gaining in-depth understanding of its advanced features. You’ll gather skills to plan, deploy, and manage your Ceph cluster. After an introduction to the Ceph architecture and its core projects, you’ll be able to set up a Ceph cluster and learn how to monitor its health, improve its performance, and troubleshoot any issues. By following the step-by-step approach of this Learning Path, you’ll learn how Ceph integrates with OpenStack, Glance, Manila, Swift, and Cinder. With knowledge of federated architecture and CephFS, you’ll use Calamari and VSM to monitor the Ceph environment. In the upcoming chapters, you’ll study the key areas of Ceph, including BlueStore, erasure coding, and cache tiering. More specifically, you’ll discover what they can do for your storage system. In the concluding chapters, you will develop applications that use Librados and distributed computations with shared object classes, and see how Ceph and its supporting infrastructure can be optimized.
By the end of this Learning Path, you'll have the practical knowledge of operating Ceph in a production environment.
This Learning Path includes content from the following Packt products:
If you are a developer, system administrator, storage professional, or cloud engineer who wants to understand how to deploy a Ceph cluster, this Learning Path is ideal for you. It will help you discover ways in which Ceph features can solve your data storage problems. Basic knowledge of storage systems and GNU/Linux will be beneficial.
Copyright © 2019 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors nor Packt Publishing or its dealers and distributors will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First Published: January 2019
Production Reference: 1290119
Published by Packt Publishing Ltd. Livery Place, 35 Livery Street Birmingham, B3 2PB, U.K.
ISBN 978-1-78829-541-3
www.packtpub.com
Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry-leading tools to help you plan your personal development and advance your career. For more information, please visit our website.
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.packt.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at [email protected] for more details.
At www.packt.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.
Michael Hackett is a storage and SAN expert in customer support. He has been working on Ceph and storage-related products for over 12 years. Michael is currently working at Red Hat, based in Massachusetts, where he is a principal software maintenance engineer for Red Hat Ceph and the technical product lead for the global Ceph team.
Vikhyat Umrao has been working on software-defined storage technology, with specific expertise in Ceph Unified Storage. He has been working on Ceph for over 3 years now and in his current position at Red Hat, he focuses on the support and development of Ceph to solve Red Hat Ceph storage customer issues and upstream reported issues.
Karan Singh is a certified professional for technologies like OpenStack, NetApp, and Oracle Solaris. He is currently working as a System Specialist of Storage and Cloud Platform for CSC - IT Center for Science Ltd, focusing all his energies on providing IaaS cloud solutions based on OpenStack and Ceph and building economical multi-petabyte storage systems using Ceph.
Nick Fisk is an IT specialist with a strong history in enterprise storage. Throughout his career, he worked in a variety of roles and mastered several technologies. Over the years, he has deployed several clusters. He spends time in the Ceph community, helping others and improving certain areas of Ceph.
Anthony D'Atri's career in system administration spans from laptops to vector supercomputers. He has brought his passion for fleet management and server components to bear on a holistic yet detailed approach to deployment and operations. Anthony worked for three years at Cisco using Ceph as a petabyte-scale object and block backend to multiple OpenStack clouds.
Vaibhav Bhembre is a systems programmer currently working as a technical lead for cloud storage products at DigitalOcean. From helping to scale dynamically generated campaign sends to over a million users at a time, to architecting a cloud-scale compute and storage platform, Vaibhav has years of experience writing software across all layers of the stack.
If you're interested in becoming an author for Packt, please visit authors.packtpub.com and apply today. We have worked with thousands of developers and tech professionals, just like you, to help them share their insight with the global tech community. You can make a general application, apply for a specific hot topic that we are recruiting an author for, or submit your own idea.
Title Page
Copyright
Ceph: Designing and Implementing Scalable Storage Systems
About Packt
Why Subscribe?
Packt.com
Contributors
About the Authors
Packt Is Searching for Authors Like You
Preface
Who This Book Is For
What This Book Covers
To Get the Most out of This Book
Download the Example Code Files
Conventions Used
Get in Touch
Reviews
Ceph - Introduction and Beyond
Introduction
Ceph – the beginning of a new era
Software-defined storage – SDS
Cloud storage
Unified next-generation storage architecture
RAID – the end of an era
RAID rebuilds are painful
RAID spare disks increases TCO
RAID can be expensive and hardware dependent
The growing RAID group is a challenge
The RAID reliability model is no longer promising
Ceph – the architectural overview
Planning a Ceph deployment
Setting up a virtual infrastructure
Getting ready
How to do it...
Installing and configuring Ceph
Creating the Ceph cluster on ceph-node1
How to do it...
Scaling up your Ceph cluster
How to do it…
Using the Ceph cluster with a hands-on approach
How to do it...
Working with Ceph Block Device
Introduction
Configuring Ceph client
How to do it...
Creating Ceph Block Device
How to do it...
Mapping Ceph Block Device
How to do it...
Resizing Ceph RBD
How to do it...
Working with RBD snapshots
How to do it...
Working with RBD clones
How to do it...
Disaster recovery replication using RBD mirroring
How to do it...
Configuring pools for RBD mirroring with one way replication
How to do it...
Configuring image mirroring
How to do it...
Configuring two-way mirroring
How to do it...
See also
Recovering from a disaster!
How to do it...
Working with Ceph and OpenStack
Introduction
Ceph – the best match for OpenStack
Setting up OpenStack
How to do it...
Configuring OpenStack as Ceph clients
How to do it...
Configuring Glance for Ceph backend
How to do it…
Configuring Cinder for Ceph backend
How to do it...
Configuring Nova to boot instances from Ceph RBD
How to do it…
Configuring Nova to attach Ceph RBD
How to do it...
Working with Ceph Object Storage
Introduction
Understanding Ceph object storage
RADOS Gateway standard setup, installation, and configuration
Setting up the RADOS Gateway node
How to do it…
Installing and configuring the RADOS Gateway
How to do it…
Creating the radosgw user
How to do it…
See also…
Accessing the Ceph object storage using S3 API
How to do it…
Configuring DNS
Configuring the s3cmd client
Configure the S3 client (s3cmd) on client-node1
Accessing the Ceph object storage using the Swift API
How to do it...
Integrating RADOS Gateway with OpenStack Keystone
How to do it...
Integrating RADOS Gateway with Hadoop S3A plugin 
How to do it...
Working with Ceph Object Storage Multi-Site v2
Introduction
Functional changes from Hammer federated configuration
RGW multi-site v2 requirement
Installing the Ceph RGW multi-site v2 environment 
How to do it...
Configuring Ceph RGW multi-site v2
How to do it...
Configuring a master zone
Configuring a secondary zone
Checking the synchronization status 
Testing user, bucket, and object sync between master and secondary sites
How to do it...
Working with the Ceph Filesystem
Introduction
Understanding the Ceph Filesystem and MDS
 Deploying Ceph MDS
How to do it...
Accessing Ceph FS through kernel driver
How to do it...
Accessing Ceph FS through FUSE client
How to do it...
Exporting the Ceph Filesystem as NFS
How to do it...
Ceph FS – a drop-in replacement for HDFS
Operating and Managing a Ceph Cluster
Introduction
Understanding Ceph service management
Managing the cluster configuration file
How to do it...
Adding monitor nodes to the Ceph configuration file
Adding an MDS node to the Ceph configuration file
Adding OSD nodes to the Ceph configuration file
Running Ceph with systemd
How to do it...
Starting and stopping all daemons
Querying systemd units on a node
Starting and stopping all daemons by type
Starting and stopping a specific daemon
Scale-up versus scale-out
Scaling out your Ceph cluster
How to do it...
Adding the Ceph OSD
Adding the Ceph MON
There's more...
Scaling down your Ceph cluster
How to do it...
Removing the Ceph OSD
Removing the Ceph MON
Replacing a failed disk in the Ceph cluster
How to do it...
Upgrading your Ceph cluster
How to do it...
Maintaining a Ceph cluster
How to do it...
How it works...
Throttle the backfill and recovery:
Ceph under the Hood
Introduction
Ceph scalability and high availability
Understanding the CRUSH mechanism
CRUSH map internals
How to do it...
How it works...
CRUSH tunables
The evolution of CRUSH tunables
Argonaut – legacy
Bobtail – CRUSH_TUNABLES2
Firefly – CRUSH_TUNABLES3
Hammer – CRUSH_V4
 Jewel – CRUSH_TUNABLES5
Ceph and kernel versions that support given tunables
Warning when tunables are non-optimal
A few important points
Ceph cluster map
High availability monitors
Ceph authentication and authorization
Ceph authentication
Ceph authorization
How to do it…
I/O path from a Ceph client to a Ceph cluster
Ceph Placement Group
How to do it…
Placement Group states
Creating Ceph pools on specific OSDs
How to do it...
The Virtual Storage Manager for Ceph
Introduction
Understanding the VSM architecture
The VSM controller
The VSM agent
Setting up the VSM environment
How to do it...
Getting ready for VSM
How to do it...
Installing VSM
How to do it...
Creating a Ceph cluster using VSM
How to do it...
Exploring the VSM dashboard
Upgrading the Ceph cluster using VSM
VSM roadmap
VSM resources
More on Ceph
Introduction
Disk performance baseline
Single disk write performance
How to do it...
Multiple disk write performance
How to do it...
Single disk read performance
How to do it...
Multiple disk read performance
How to do it...
Results
Baseline network performance
How to do it...
Ceph rados bench
How to do it...
How it works...
RADOS load-gen
How to do it...
How it works...
There's more...
Benchmarking the Ceph Block Device
How to do it...
How it works...
Benchmarking Ceph RBD using FIO
How to do it...
Ceph admin socket
How to do it...
Using the ceph tell command
How to do it...
Ceph REST API
How to do it...
Profiling Ceph memory
How to do it...
The ceph-objectstore-tool
How to do it...
How it works...
Using ceph-medic
How to do it...
How it works...
See also
Deploying the experimental Ceph BlueStore
How to do it...
See Also
Deploying Ceph
Preparing your environment with Vagrant and VirtualBox
System requirements
Obtaining and installing VirtualBox
Setting up Vagrant
The ceph-deploy tool
Orchestration
Ansible
Installing Ansible
Creating your inventory file
Variables
Testing
A very simple playbook
Adding the Ceph Ansible modules
Deploying a test cluster with Ansible
Change and configuration management
Summary
BlueStore
What is BlueStore?
Why was it needed?
Ceph's requirements
Filestore limitations
Why is BlueStore the solution?
How BlueStore works
RocksDB
Deferred writes
BlueFS
How to use BlueStore
Upgrading an OSD in your test cluster
Summary
Erasure Coding for Better Storage Efficiency
What is erasure coding?
K+M
How does erasure coding work in Ceph?
Algorithms and profiles
Jerasure
ISA
LRC
SHEC
Where can I use erasure coding?
Creating an erasure-coded pool
Overwrites on erasure code pools with Kraken
Demonstration
Troubleshooting the 2147483647 error
Reproducing the problem
Summary
Developing with Librados
What is librados?
How to use librados?
Example librados application
Example of the librados application with atomic operations
Example of the librados application that uses watchers and notifiers
Summary
Distributed Computation with Ceph RADOS Classes
Example applications and the benefits of using RADOS classes
Writing a simple RADOS class in Lua
Writing a RADOS class that simulates distributed computing
Preparing the build environment
RADOS class
Client librados applications
Calculating MD5 on the client
Calculating MD5 on the OSD via RADOS class
Testing
RADOS class caveats
Summary
Tiering with Ceph
Tiering versus caching
How Ceph's tiering functionality works
What is a bloom filter?
Tiering modes
Writeback
Forward
Read-forward
Proxy
Read-proxy
Use cases
Creating tiers in Ceph
Tuning tiering
Flushing and eviction
Promotions
Promotion throttling
Monitoring parameters
Tiering with erasure-coded pools
Alternative caching mechanisms
Summary
Troubleshooting
Repairing inconsistent objects
Full OSDs
Ceph logging
Slow performance
Causes
Increased client workload
Down OSDs
Recovery and backfilling
Scrubbing
Snaptrimming
Hardware or driver issues
Monitoring
iostat
htop
atop
Diagnostics
Extremely slow performance or no IO
Flapping OSDs
Jumbo frames
Failing disks
Slow OSDs
Investigating PGs in a down state
Large monitor databases
Summary
Disaster Recovery
What is a disaster?
Avoiding data loss
What can cause an outage or data loss?
RBD mirroring
The journal
The rbd-mirror daemon
Configuring RBD mirroring
Performing RBD failover
RBD recovery
Lost objects and inactive PGs
Recovering from a complete monitor failure
Using the Ceph objectstore tool
Investigating asserts
Example assert
Summary
Operations and Maintenance
Topology
The 40,000 foot view
Drilling down
OSD dump
OSD list
OSD find
CRUSH dump
Pools
Monitors
CephFS
Configuration
Cluster naming and configuration
The Ceph configuration file
Admin sockets
Injection
Configuration management
Scrubs
Logs
MON logs
OSD logs
Debug levels
Common tasks
Installation
Ceph-deploy
Flags
Service management
Systemd: the wave (tsunami?) of the future
Upstart
sysvinit
Component failures
Expansion
Balancing
Upgrades
Working with remote hands
Summary
Monitoring Ceph
Monitoring Ceph clusters
Ceph cluster health
Watching cluster events
Utilizing your cluster
OSD variance and fillage
Cluster status
Cluster authentication
Monitoring Ceph MONs
MON status
MON quorum status
Monitoring Ceph OSDs
OSD tree lookup
OSD statistics
OSD CRUSH map
Monitoring Ceph placement groups
PG states
Monitoring Ceph MDS
Open source dashboards and tools
Kraken
Ceph-dash
Decapod
Rook
Calamari
Ceph-mgr
Prometheus and Grafana
Summary
Performance and Stability Tuning
Ceph performance overview
Kernel settings
pid_max
kernel.threads-max, vm.max_map_count
XFS filesystem settings
Virtual memory settings
Network settings
Jumbo frames
TCP and network core
iptables and nf_conntrack
Ceph settings
max_open_files
Recovery
OSD and FileStore settings
MON settings
Client settings
Benchmarking
RADOS bench
CBT
FIO
Fill volume, then random 1M writes for 96 hours, no read verification:
Fill volume, then small block writes for 96 hours, no read verification:
Fill volume, then 4k random writes for 96 hours, occasional read verification:
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think
This Learning Path takes you through the basics of Ceph all the way to gaining an in-depth understanding of its advanced features. You'll gather skills to plan, deploy, and manage your Ceph cluster. After an introduction to the Ceph architecture and its core projects, you'll be able to set up a Ceph cluster and learn how to monitor its health, improve its performance, and troubleshoot any issues.
By following the step-by-step approach of this Learning Path, you'll learn how Ceph integrates with OpenStack, Glance, Manila, Swift, and Cinder. With knowledge of federated architecture and CephFS, you'll use Calamari and VSM to monitor the Ceph environment. In the upcoming chapters, you'll study the key areas of Ceph, including BlueStore, erasure coding, and cache tiering. More specifically, you'll discover what they can do for your storage system. In the concluding chapters, you will develop applications that use Librados and distributed computations with shared object classes, and see how Ceph and its supporting infrastructure can be optimized. By the end of this Learning Path, you'll have the practical knowledge of operating Ceph in a production environment.
If you are a developer, system administrator, storage professional, or cloud engineer who wants to understand how to deploy a Ceph cluster, this Learning Path is ideal for you. It will help you discover ways in which Ceph features can solve your data storage problems. Basic knowledge of storage systems and GNU/Linux will be beneficial.
Chapter 1, Ceph - Introduction and Beyond, covers an introduction to Ceph, gradually moving toward RAID and its challenges, and a Ceph architectural overview. Finally, we will go through Ceph installation and configuration.
Chapter 2, Working with Ceph Block Device, covers an introduction to the Ceph Block Device and provisioning of the Ceph block device. We will also go through RBD snapshots and clones, as well as implementing a disaster-recovery solution with RBD mirroring.
Chapter 3, Working with Ceph and OpenStack, covers configuring OpenStack clients for use with Ceph, as well as storage options for OpenStack using Cinder, Glance, and Nova.
Chapter 4, Working with Ceph Object Storage, covers a deep dive into Ceph object storage, including RGW setup and configuration, S3, and OpenStack Swift access. Finally, we will set up RGW with the Hadoop S3A plugin.
Chapter 5, Working with Ceph Object Storage Multi-Site v2, helps you to deep dive into the new Multi-site v2, while configuring two Ceph clusters to mirror objects between them in an object disaster recovery solution.
Chapter 6, Working with the Ceph Filesystem, covers an introduction to CephFS, and deploying and accessing MDS and CephFS via the kernel driver, FUSE, and NFS-Ganesha.
Chapter 7, Operating and Managing a Ceph Cluster, covers Ceph service management with systemd, and scaling up and scaling down a Ceph cluster. This chapter also includes failed disk replacement and upgrading Ceph infrastructures.
Chapter 8, Ceph under the Hood, explores the Ceph CRUSH map, understanding the internals of the CRUSH map and CRUSH tunables, followed by Ceph authentication and authorization. This chapter also covers dynamic cluster management and understanding Ceph PGs. Finally, we cover creating Ceph pools on specific OSDs.
Chapter 9, The Virtual Storage Manager for Ceph, speaks about Virtual Storage Manager (VSM), covering its introduction and architecture. We will also go through the deployment of VSM and then the creation of a Ceph cluster, using VSM to manage it.
Chapter 10, More on Ceph, covers Ceph benchmarking, and Ceph troubleshooting using the admin socket, the REST API, and the ceph-objectstore-tool. This chapter also covers the deployment of Ceph using Ansible and Ceph memory profiling. Furthermore, it covers health checking your Ceph cluster using ceph-medic and the new experimental backend, Ceph BlueStore.
Chapter 11, Deploying Ceph, is a no-nonsense, step-by-step instructional chapter on how to set up a Ceph cluster. This chapter covers the ceph-deploy tool for testing and goes on to cover Ansible. A section on change management is also included, and it explains how this is essential for the stability of large Ceph clusters.
Chapter 12, BlueStore, explains that Ceph has to be able to provide atomic operations around data and metadata and how filestore was built to provide these guarantees on top of standard filesystems. We will also cover the problems around this approach. The chapter then introduces BlueStore and explains how it works and the problems that it solves. This will include the components and how they interact with different types of storage devices. We will also have an overview of key-value stores, including RocksDB, which is used by BlueStore. Some of the BlueStore settings and how they interact with different hardware configurations will be discussed.
Chapter 13, Erasure Coding for Better Storage Efficiency, covers how erasure coding works and how it's implemented in Ceph, including explanations of RADOS pool parameters and erasure coding profiles. A reference to the changes in the Kraken release will highlight the possibility of append-overwrites to erasure pools, which will allow RBDs to directly function on erasure-coded pools. Performance considerations will also be explained. This will include references to BlueStore, as it is required for sufficient performance. Finally, we have step-by-step instructions on actually setting up erasure coding on a pool, which can be used as a mechanical reference for sysadmins.
Chapter 14, Developing with Librados, explains how Librados is used to build applications that can interact directly with a Ceph cluster. It then moves on to several different examples of using Librados in different languages to give you an idea of how it can be used, including atomic transactions.
Chapter 15, Distributed Computation with Ceph RADOS Classes, discusses the benefits of moving processing directly into the OSD to effectively perform distributed computing. It then covers how to get started with RADOS classes by building simple ones with Lua. It then covers how to build your own C++ RADOS class into the Ceph source tree and conduct benchmarks against performing processing on the client versus the OSD.
Chapter 16, Tiering with Ceph, explains how RADOS tiering works in Ceph, where it should be used, and its pitfalls. It takes you step-by-step through configuring tiering on a Ceph cluster and finally covers the tuning options to extract the best performance for tiering. An example using Graphite will demonstrate the value of being able to manipulate captured data to provide more meaningful output in graph form.
Chapter 17, Troubleshooting, explains how although Ceph is largely autonomous in taking care of itself and recovering from failure scenarios, in some cases, human intervention is required. We'll look at common errors and failure scenarios and how to bring Ceph back to full health by troubleshooting them.
Chapter 18, Disaster Recovery, covers situations when Ceph is in such a state that there is a complete loss of service or data loss has occurred. Less familiar recovery techniques are required to restore access to the cluster and, hopefully, recover data. This chapter arms you with the knowledge to attempt recovery in these scenarios.
Chapter 19, Operations and Maintenance, is a deep and wide inventory of day-to-day operations. We cover management of Ceph topologies, services, and configuration settings, as well as maintenance and debugging.
Chapter 20, Monitoring Ceph, provides a comprehensive collection of commands, practices, and dashboard software to help keep a close eye on the health of Ceph clusters.
Chapter 21, Performance and Stability Tuning, provides a collection of Ceph, network, filesystem, and underlying operating system settings to optimize cluster performance and stability. Benchmarking of cluster performance is also explored.
This book requires that you have enough resources to run the whole Ceph lab environment. The minimum hardware or virtual requirements are as follows:
CPU: 2 cores
Memory: 8 GB RAM
Disk space: 40 GB
The various software components required to follow the instructions in the chapters are as follows:
VirtualBox 4.0 or higher (https://www.virtualbox.org/wiki/Downloads)
GIT (http://www.git-scm.com/downloads)
Vagrant 1.5.0 or higher (https://www.vagrantup.com/downloads.html)
CentOS operating system 7.0 or higher (http://wiki.centos.org/Download)
Ceph software Jewel packages Version 10.2.0 or higher (http://ceph.com/resources/downloads/)
S3 Client, typically S3cmd (http://s3tools.org/download)
Python-swift client
NFS Ganesha
Ceph Fuse
CephMetrics (https://github.com/ceph/cephmetrics)
Ceph-Medic (https://github.com/ceph/ceph-medic)
Virtual Storage Manager 2.0 or higher (https://github.com/01org/virtual-storagemanager/releases/tag/v2.1.0)
Ceph-Ansible (https://github.com/ceph/ceph-ansible)
OpenStack RDO (http://rdo.fedorapeople.org/rdo-release.rpm)
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
Log in or register at www.packt.com.
Select the SUPPORT tab.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR/7-Zip for Windows
Zipeg/iZip/UnRarX for Mac
7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Ceph-Designing-and-Implementing-Scalable-Storage-Systems. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning. Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Verify the installrc file." A block of code is set as follows:
AGENT_ADDRESS_LIST="192.168.123.101 192.168.123.102 192.168.123.103"
CONTROLLER_ADDRESS="192.168.123.100"
Any command-line input or output is written as follows:
vagrant plugin install vagrant-hostmanager
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "It has probed OSDs 1 and 2 for the data, which means that it didn't find anything it needed. It wants to try and poll OSD 0, but it can't because the OSD is down, hence the message starting or marking this osd lost may let us proceed appeared."
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at [email protected].
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.
In this chapter, we will cover the following recipes:
Ceph – the beginning of a new era
RAID – the end of an era
Ceph – the architectural overview
Planning a Ceph deployment
Setting up a virtual infrastructure
Installing and configuring Ceph
Scaling up your Ceph cluster
Using the Ceph cluster with a hands-on approach
Ceph is currently the hottest software-defined storage (SDS) technology and is shaking up the entire storage industry. It is an open source project that provides unified software-defined solutions for block, file, and object storage. The core idea of Ceph is to provide a distributed storage system that is massively scalable and high performing with no single point of failure. From the roots, it has been designed to be highly scalable (up to the exabyte level and beyond) while running on general-purpose commodity hardware.
Ceph is acquiring most of the traction in the storage industry due to its open, scalable, and reliable nature. This is the era of cloud computing and software-defined infrastructure, where we need a storage backend that is purely software-defined and, more importantly, cloud-ready. Ceph fits in here very well, regardless of whether you are running a public, private, or hybrid cloud.
Today's software systems are very smart and make the best use of commodity hardware to run gigantic-scale infrastructures. Ceph is one of them; it intelligently uses commodity hardware to provide enterprise-grade robust and highly reliable storage systems.
Ceph has been raised and nourished with the help of the Ceph upstream community with an architectural philosophy that includes the following:
Every component must scale linearly
There should not be any single point of failure
The solution must be software-based, open source, and adaptable
The Ceph software should run on readily available commodity hardware
Every component must be self-managing and self-healing wherever possible
The foundation of Ceph lies in objects, which are its building blocks. Object storage such as Ceph is the perfect provision for current and future needs of unstructured data storage. Object storage has its advantages over traditional storage solutions; we can achieve platform and hardware independence using object storage. Ceph works meticulously with objects and replicates them across the cluster to provide reliability; in Ceph, objects are not tied to a physical path, which makes them location-independent. This flexibility enables Ceph to scale linearly from the petabyte to the exabyte level.
Ceph provides great performance, enormous scalability, power, and flexibility to organizations. It helps them get rid of expensive proprietary storage silos. Ceph is indeed an enterprise-class storage solution that runs on commodity hardware; it is a low-cost yet feature-rich storage system. Ceph's universal storage system provides block, file, and object storage under one hood, enabling customers to use storage as they want.
In the following section, we will learn about Ceph releases.
Ceph is being developed and improved at a rapid pace. On July 3, 2012, Sage announced the first LTS release of Ceph with the code name Argonaut. Since then, we have seen 12 new releases come up. Ceph releases are categorized as Long Term Support (LTS) or stable releases, and every alternate Ceph release is an LTS release. For more information, visit https://Ceph.com/category/releases/.
Ceph release name | Ceph release version | Released on
Argonaut          | V0.48 (LTS)          | July 3, 2012
Bobtail           | V0.56 (LTS)          | January 1, 2013
Cuttlefish        | V0.61                | May 7, 2013
Dumpling          | V0.67 (LTS)          | August 14, 2013
Emperor           | V0.72                | November 9, 2013
Firefly           | V0.80 (LTS)          | May 7, 2014
Giant             | V0.87.1              | February 26, 2015
Hammer            | V0.94 (LTS)          | April 7, 2015
Infernalis        | V9.0.0               | May 5, 2015
Jewel             | V10.0.0 (LTS)        | November 2015
Kraken            | V11.0.0              | June 2016
Luminous          | V12.0.0 (LTS)        | February 2017
Data storage requirements have grown explosively over the last few years. Research shows that data in large organizations is growing at a rate of 40 to 60 percent annually, and many companies are doubling their data footprint each year. IDC analysts have estimated that worldwide, there were 54.4 exabytes of total digital data in the year 2000. By 2007, this reached 295 exabytes, and by 2020, it's expected to reach 44 zettabytes worldwide. Such data growth cannot be managed by traditional storage systems; we need a system such as Ceph, which is distributed, scalable, and, most importantly, economically viable. Ceph has been specially designed to handle today's as well as tomorrow's data storage needs.
SDS is what is needed to reduce TCO for your storage infrastructure. In addition to reduced storage cost, SDS can offer flexibility, scalability, and reliability. Ceph is a true SDS solution; it runs on commodity hardware with no vendor lock-in and provides low cost per GB. Unlike traditional storage systems, where hardware gets married to software, in SDS, you are free to choose commodity hardware from any manufacturer and are free to design a heterogeneous hardware solution for your own needs. Ceph's software-defined storage on top of this hardware provides all the intelligence you need and will take care of everything, providing all the enterprise storage features right from the software layer.
One of the drawbacks of a cloud infrastructure is the storage. Every cloud infrastructure needs a storage system that is reliable, low-cost, and scalable, with a tighter integration than its other cloud components. There are many traditional storage solutions out there in the market that claim to be cloud-ready, but today, we not only need cloud readiness, but also a lot more beyond that. We need a storage system that is fully integrated with cloud systems and can provide lower TCO without any compromise on reliability and scalability. Cloud systems are software-defined and are built on top of commodity hardware; similarly, they need a storage system that follows the same methodology, that is, being software-defined on top of commodity hardware, and Ceph is the best choice available for cloud use cases.
Ceph has been rapidly evolving and bridging the gap of a true cloud storage backend. It is grabbing the center stage with every major open source cloud platform, namely OpenStack, CloudStack, and OpenNebula. Moreover, Ceph has succeeded in building up beneficial partnerships with cloud vendors such as Red Hat, Canonical, Mirantis, SUSE, and many more. These companies are favoring Ceph big time and including it as an official storage backend for their cloud OpenStack distributions, thus making Ceph a red-hot technology in cloud storage space.
The OpenStack project is one of the finest examples of open source software powering public and private clouds. It has proven itself as an end-to-end open source cloud solution. OpenStack includes components such as Cinder, Glance, and Swift, which provide storage capabilities to OpenStack. These components require a reliable, scalable, all-in-one storage backend such as Ceph. For this reason, the OpenStack and Ceph communities have been working together for many years to develop a fully compatible Ceph storage backend for OpenStack.
Cloud infrastructure based on Ceph provides much-needed flexibility to service providers to build Storage-as-a-Service and Infrastructure-as-a-Service solutions, which they cannot achieve from other traditional enterprise storage solutions as they are not designed to fulfill cloud needs. Using Ceph, service providers can offer low-cost, reliable cloud storage to their customers.
The definition of unified storage has changed lately. A few years ago, the term unified storage referred to providing file and block storage from a single system. Now, because of recent technological advancements such as cloud computing, big data, and the Internet of Things, a new kind of storage has been evolving: object storage. Thus, all storage systems that do not support object storage are not really unified storage solutions. True unified storage is like Ceph; it supports block, file, and object storage from a single system.
In Ceph, the term unified storage is more meaningful than what existing storage vendors claim to provide. It has been designed from the ground up to be future-ready, and it's constructed such that it can handle enormous amounts of data. When we call Ceph future ready, we mean to focus on its object storage capabilities, which is a better fit for today's mix of unstructured data rather than blocks or files.
Everything in Ceph relies on intelligent objects, whether it's block storage or file storage. Rather than managing blocks and files underneath, Ceph manages objects and supports block-and-file-based storage on top of it. Objects provide enormous scaling with increased performance by eliminating metadata operations. Ceph uses an algorithm to dynamically compute where the object should be stored and retrieved from.
The traditional storage architecture of SAN and NAS systems is very limited. Basically, they follow the tradition of controller high availability; that is, if one storage controller fails, it serves data from the second controller. But, what if the second controller fails at the same time, or even worse if the entire disk shelf fails? In most cases, you will end up losing your data. This kind of storage architecture, which cannot sustain multiple failures, is definitely what we do not want today. Another drawback of traditional storage systems is their data storage and access mechanism. They maintain a central lookup table to keep track of metadata, which means that every time a client sends a request for a read or write operation, the storage system first performs a lookup in the huge metadata table, and after receiving the real data location, it performs the client operation. For a smaller storage system, you might not notice performance hits, but think of a large storage cluster—you would definitely be bound by performance limits with this approach. This would even restrict your scalability.
Ceph does not follow this traditional storage architecture; in fact, the architecture has been completely reinvented. Rather than storing and manipulating metadata, Ceph introduces a newer way: the CRUSH algorithm. CRUSH stands for Controlled Replication Under Scalable Hashing. Instead of performing a lookup in the metadata table for every client request, the CRUSH algorithm computes on demand where the data should be written to or read from. By computing metadata, the need to manage a centralized table for metadata is no longer there. Modern computers are amazingly fast and can perform a CRUSH lookup very quickly; moreover, this computing load, which is generally not too much, can be distributed across cluster nodes, leveraging the power of distributed storage. In addition to this, CRUSH has a unique property, which is infrastructure awareness. It understands the relationship between various components of your infrastructure and stores your data in a unique failure zone, such as a disk, node, rack, row, and data center room, among others. CRUSH stores all the copies of your data such that it is available even if a few components fail in a failure zone. It is due to CRUSH that Ceph can handle multiple component failures and provide reliability and durability.
The CRUSH algorithm makes Ceph self-managing and self-healing. In the event of component failure in a failure zone, CRUSH senses which component has failed and determines the effect on the cluster. Without any administrative intervention, CRUSH self-manages and self-heals by performing a recovery operation for the data lost due to failure. CRUSH regenerates the data from the replica copies that the cluster maintains. If you have configured the Ceph CRUSH map in the correct order, it makes sure that at least one copy of your data is always accessible. Using CRUSH, we can design a highly reliable storage infrastructure with no single point of failure. This makes Ceph a highly scalable and reliable storage system that is future-ready. CRUSH is covered in more detail in Chapter 8, Ceph under the Hood.
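As a quick illustration of this on-demand placement computation, you can ask a running cluster where CRUSH would place any object name you invent. This is a hedged example rather than part of the original recipe; the pool name rbd and the object name object1 are assumptions:
# ceph osd map rbd object1    # show which PG and OSDs CRUSH selects for this object name
The output shows the placement group the object hashes to and the up/acting set of OSDs chosen by CRUSH, with no lookup table being consulted.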
The RAID technology has been the fundamental building block for storage systems for years. It has proven successful for almost every kind of data that has been generated in the last 3 decades. But all eras must come to an end, and this time, it's RAID's turn. These systems have started showing limitations and are incapable of delivering to future storage needs. In the course of the last few years, cloud infrastructures have gained strong momentum and are imposing new requirements on storage and challenging traditional RAID systems. In this section, we will uncover the limitations imposed by RAID systems.
The most painful thing in a RAID technology is its super-lengthy rebuild process. Disk manufacturers are packing lots of storage capacity per disk. They are now producing an extra-large capacity of disk drives at a fraction of the price. We no longer talk about 450 GB, 600 GB, or even 1 TB disks, as there is a larger capacity of disks available today. The newer enterprise disk specification offers disks up to 4 TB, 6 TB, and even 10 TB disk drives, and the capacities keep increasing year by year.
Think of an enterprise RAID-based storage system that is made up of numerous 4 TB or 6 TB disk drives. Unfortunately, when such a disk drive fails, RAID will take several hours and even up to days to repair a single failed disk. Meanwhile, if another drive fails from the same RAID group, then it would become a chaotic situation. Repairing multiple large disk drives using RAID is a cumbersome process.
The RAID system requires a few disks as hot spare disks. These are just free disks that will be used only when a disk fails; else, they will not be used for data storage. This adds extra cost to the system and increases TCO. Moreover, if you're running short of spare disks and immediately a disk fails in the RAID group, then you will face a severe problem.
RAID requires a set of identical disk drives in a single RAID group; you would face penalties if you change the disk size, RPM, or disk type. Doing so would adversely affect the capacity and performance of your storage system. This makes RAID highly choosy about hardware.
Also, enterprise RAID-based systems often require expensive hardware components, such as RAID controllers, which significantly increases the system cost. These RAID controllers will become single points of failure if you do not have many of them.
RAID can hit a dead end when it's not possible to grow the RAID group size, which means that there is no scale-out support. After a point, you cannot grow your RAID-based system, even though you have money. Some systems allow the addition of disk shelves but up to a very limited capacity; however, these new disk shelves put a load on the existing storage controller. So, you can gain some capacity but with a performance trade-off.
RAID can be configured with a variety of different types; the most common types are RAID5 and RAID6, which can survive the failure of one and two disks, respectively. RAID cannot ensure data reliability after a two-disk failure. This is one of the biggest drawbacks of RAID systems.
Moreover, at the time of a RAID rebuild operation, client requests are most likely to starve for I/O until the rebuild completes. Another limiting factor with RAID is that it only protects against disk failure; it cannot protect against a failure of the network, server hardware, OS, power, or other data center disasters.
After discussing RAID's drawbacks, we can come to the conclusion that we now need a system that can overcome all these drawbacks in a performant and cost-effective way. The Ceph storage system is one of the best solutions available today to address these problems. Let's see how.
For reliability, Ceph makes use of the data replication method, which means it does not use RAID, thus overcoming all the problems that can be found in a RAID-based enterprise system. Ceph is software-defined storage, so we do not require any specialized hardware for data replication; moreover, the replication level is highly customizable by means of commands, which means that the Ceph storage administrator can set a replication factor from a minimum of one up to a higher number, depending entirely on the underlying infrastructure.
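For example, the replication level is simply a per-pool property set with standard ceph commands. This is a minimal sketch; the pool name mypool and the placement group count are assumptions for illustration:
# ceph osd pool create mypool 128        # create a replicated pool with 128 placement groups
# ceph osd pool set mypool size 3        # keep three copies of every object
# ceph osd pool set mypool min_size 2    # keep serving I/O as long as two copies are available
# ceph osd pool get mypool size          # confirm the replication factor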
In the event of one or more disk failures, Ceph's replication is a better process than RAID. When a disk drive fails, all the data that was residing on that disk at that point in time starts recovering from its peer disks. Since Ceph is a distributed system, all the data copies are scattered over the entire cluster of disks in the form of objects, such that no two copies of an object reside on the same disk, and they must reside in different failure zones defined by the CRUSH map. The good part is that all the cluster disks participate in data recovery. This makes the recovery operation amazingly fast, with the least performance problems. Furthermore, the recovery operation does not require any spare disks; the data is simply replicated to other Ceph disks in the cluster. Ceph uses a weighting mechanism for its disks, so differing disk sizes are not a problem.
In addition to the replication method, Ceph also supports another advanced way of data reliability: using the erasure-coding technique. Erasure-coded pools require less storage space compared to replicated pools. In erasure-coding, data is recovered or regenerated algorithmically by erasure code calculation. You can use both the techniques of data availability, that is, replication as well as erasure-coding, in the same Ceph cluster but over different storage pools. We will learn more about the erasure-coding technique in the upcoming chapters.
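As a hedged preview of what is covered later, an erasure-coded pool is created from a profile that defines the data (k) and coding (m) chunk counts. The profile and pool names below are assumptions used for illustration:
# ceph osd erasure-code-profile set myprofile k=2 m=1    # profile that tolerates the loss of one chunk
# ceph osd erasure-code-profile get myprofile            # review the generated profile
# ceph osd pool create ecpool 64 64 erasure myprofile    # create an erasure-coded pool using the profile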
The Ceph internal architecture is pretty straightforward, and we will learn about it with the help of the following diagram:
Ceph monitors (MON): Ceph monitors track the health of the entire cluster by keeping a map of the cluster state. They maintain a separate map of information for each component, which includes an OSD map, a MON map, a PG map (discussed in later chapters), and a CRUSH map. All the cluster nodes report to the monitor nodes and share information about every change in their state. A monitor does not store actual data; that is the job of the OSD. (Example commands for dumping these cluster maps follow this list.)
Ceph object storage device (OSD): As soon as your application issues a write operation to the Ceph cluster, data gets stored in the OSDs in the form of objects. This is the only component of the Ceph cluster where actual user data is stored, and the same data is retrieved when a client issues a read operation. Usually, one OSD daemon is tied to one physical disk in your cluster, so in general, the total number of physical disks in your Ceph cluster is the same as the number of OSD daemons working underneath to store user data on each physical disk.
Ceph metadata server (MDS): The MDS keeps track of the file hierarchy and stores metadata only for the CephFS filesystem. The Ceph block device and RADOS gateway do not require metadata; hence, they do not need the Ceph MDS daemon. The MDS does not serve data directly to clients, thus removing the single point of failure from the system.
RADOS: The Reliable Autonomic Distributed Object Store (RADOS) is the foundation of the Ceph storage cluster. Everything in Ceph is stored in the form of objects, and the RADOS object store is responsible for storing these objects irrespective of their data types. The RADOS layer makes sure that data always remains consistent. To do this, it performs data replication, failure detection and recovery, as well as data migration and rebalancing across cluster nodes.
librados: The librados library is a convenient way to gain access to RADOS, with support for the PHP, Ruby, Java, Python, C, and C++ programming languages. It provides a native interface to the Ceph storage cluster (RADOS), as well as a base for other services, such as RBD, RGW, and CephFS, which are built on top of librados. librados also supports direct access to RADOS from applications with no HTTP overhead.
RADOS block devices (RBDs): RBDs, which are now known as the Ceph block device, provide persistent block storage that is thin-provisioned and resizable, and store data striped over multiple OSDs. The RBD service has been built as a native interface on top of librados.
RADOS gateway interface (RGW): RGW provides an object storage service. It uses librgw (the RADOS Gateway Library) and librados, allowing applications to establish connections with the Ceph object storage. RGW provides RESTful APIs with interfaces that are compatible with Amazon S3 and OpenStack Swift.
CephFS: The Ceph filesystem provides a POSIX-compliant filesystem that uses the Ceph storage cluster to store user data. Like RBD and RGW, the CephFS service is also implemented as a native interface to librados.
Ceph manager: The Ceph manager daemon (ceph-mgr) was introduced in the Kraken release, and it runs alongside monitor daemons to provide additional monitoring and interfaces to external monitoring and management systems.
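The cluster maps mentioned above can be inspected from any node with admin access using the standard ceph and rados command-line tools. This is a hedged illustration rather than a listing from the original text; the pool name mypool is an assumption:
# ceph mon dump           # print the monitor map
# ceph osd dump           # print the OSD map
# ceph pg dump            # print placement group statistics
# ceph osd crush dump     # print the CRUSH map in JSON form
# rados -p mypool put object1 /etc/hosts    # store a file as a RADOS object (assumes a pool named mypool exists)
# rados -p mypool ls                        # list objects in the pool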
A Ceph storage cluster is created on top of the commodity hardware. This commodity hardware includes industry-standard servers loaded with physical disk drives that provide storage capacity and some standard networking infrastructure. These servers run standard Linux distributions and Ceph software on top of them. The following diagram helps you understand the basic view of a Ceph cluster:
As explained earlier, Ceph does not have very specific hardware requirements. For the purpose of testing and learning, we can deploy a Ceph cluster on top of virtual machines. In this section and in the later chapters of this book, we will be working on a Ceph cluster that is built on top of virtual machines. It's very convenient to use a virtual environment to test Ceph, as it's fairly easy to set up and can be destroyed and recreated at any time. Keep in mind that a virtual infrastructure should not be used for a production Ceph cluster; you might face serious problems if you do.
To set up a virtual infrastructure, you will require open source software, such as Oracle VirtualBox and Vagrant, to automate virtual machine creation for you. Make sure you have the software installed and working correctly on your host machine. The installation processes of the software are beyond the scope of this book; you can follow their respective documentation in order to get them installed and working correctly.
You will need the following software to get started:
Oracle VirtualBox: This is an open source virtualization software package for host machines based on x86 and AMD64/Intel64. It supports Microsoft Windows, Linux, and Apple macOS X host operating systems. Make sure it's installed and working correctly. More information can be found at https://www.virtualbox.org. Once you have installed VirtualBox, run the following command to ensure the installation was successful:
# VBoxManage --version
Vagrant: This is software meant for creating virtual development environments. It works as a wrapper around virtualization software, such as VirtualBox, VMware, KVM, and so on. It supports the Microsoft Windows, Linux, and Apple macOS X host operating systems. Make sure it's installed and working correctly. More information can be found at https://www.vagrantup.com/. Once you have installed Vagrant, run the following command to ensure the installation was successful:
# vagrant --version
Git: This is a distributed revision control system and the most popular and widely adopted version control system for software development. It supports Microsoft Windows, Linux, and Apple macOS X operating systems. Make sure it's installed and working correctly. More information can be found at http://git-scm.com/. Once you have installed Git, run the following command to ensure the installation was successful:
# git --version
Once you have installed the mentioned software, we will proceed with virtual machine creation:
git clone the Ceph-Designing-and-Implementing-Scalable-Storage-Systems repository to your VirtualBox host machine:
$ git clone https://github.com/PacktPublishing/Ceph-Designing-and-Implementing-Scalable-Storage-Systems
Under the directory, you will find vagrantfile, which is our Vagrant configuration file that basically instructs VirtualBox to launch the VMs that we require at different stages of this book. Vagrant will automate the VM creation, installation, and configuration for you; it makes the initial environment easy to set up:
$ cd Ceph-Designing-and-Implementing-Scalable-Storage-Systems ; ls -l
Next, we will launch the three VMs that are required throughout this chapter. Run vagrant up ceph-node1 ceph-node2 ceph-node3:
$ vagrant up ceph-node1 ceph-node2 ceph-node3
Check the status of your virtual machines:
$ vagrant status ceph-node1 ceph-node2 ceph-node3
Vagrant will, by default, set up hostnames as ceph-node<node_number> and the IP address subnet as 192.168.1.X, and will create three additional disks that will be used as OSDs by the Ceph cluster. Log in to each of these machines one by one and check whether the hostname, networking, and additional disks have been set up correctly by Vagrant:
$ vagrant ssh ceph-node1
$ ip addr show
$ sudo fdisk -l
$ exit
Vagrant is configured to update the hosts file on the VMs. For convenience, update the /etc/hosts file on your host machine with the following content:
192.168.1.101 ceph-node1
192.168.1.102 ceph-node2
192.168.1.103 ceph-node3
Update all three VMs to the latest CentOS release and reboot into the latest kernel.
Generate root SSH keys for ceph-node1 and copy the keys to ceph-node2 and ceph-node3. The password for the root user on these VMs is vagrant. Enter the root user password when asked by the ssh-copy-id command and proceed with the default settings:
$ vagrant ssh ceph-node1
$ sudo su -
# ssh-keygen
# ssh-copy-id root@ceph-node1
# ssh-copy-id root@ceph-node2
# ssh-copy-id root@ceph-node3
Once the SSH keys are copied to ceph-node2 and ceph-node3, the root user from ceph-node1 can do an ssh login to the VMs without entering a password:
# ssh ceph-node2 hostname
# ssh ceph-node3 hostname
Enable ports that are required by the Ceph MON, OSD, and MDS on the operating system's firewall. Execute the following commands on all VMs:
# firewall-cmd --zone=public --add-port=6789/tcp --permanent
# firewall-cmd --zone=public --add-port=6800-7100/tcp --permanent
# firewall-cmd --reload
# firewall-cmd --zone=public --list-all
Install and configure NTP on all VMs:
# yum install ntp ntpdate -y
# ntpdate pool.ntp.org
# systemctl restart ntpdate.service
# systemctl restart ntpd.service
# systemctl enable ntpd.service
# systemctl enable ntpdate.service
To deploy our first Ceph cluster, we will use the ceph-ansible tool to install and configure Ceph on all three virtual machines. The ceph-ansible tool is a part of the Ceph project, and it is used for easy deployment and management of your Ceph storage cluster. In the previous section, we created three virtual machines with CentOS 7, which have connectivity with the internet over NAT, as well as private host-only networks.
We will configure these machines as Ceph storage clusters, as mentioned in the following diagram:
We will first install Ceph and configure ceph-node1 as the Ceph monitor and the Ceph OSD node. Later recipes in this chapter will introduce ceph-node2 and ceph-node3.
Copy the ceph-ansible package to ceph-node1 from the Ceph-Designing-and-Implementing-Scalable-Storage-Systems directory.
Use vagrant as the password for the root user:
# cd Ceph-Designing-and-Implementing-Scalable-Storage-Systems
# scp ceph-ansible-2.2.10-38.g7ef908a.el7.noarch.rpm root@ceph-node1:/root
Log in to ceph-node1 and install ceph-ansible on ceph-node1:
[root@ceph-node1 ~]# yum install ceph-ansible-2.2.10-38.g7ef908a.el7.noarch.rpm -y
Update the Ceph hosts to /etc/ansible/hosts:
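The inventory contents are not reproduced here. As a hedged sketch, ceph-ansible conventionally uses [mons] and [osds] inventory groups, and this chapter initially deploys only ceph-node1 as both MON and OSD, so a minimal inventory might look like the following (adjust if the original recipe differs):
# cat >> /etc/ansible/hosts <<'EOF'
[mons]
ceph-node1

[osds]
ceph-node1
EOF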
Verify that Ansible can reach the Ceph hosts mentioned in /etc/ansible/hosts:
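One standard way to verify reachability, shown here as an assumption since the original command listing is not reproduced:
# ansible all -m ping    # should return "pong" from every host in the inventory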
Create a directory under the root home directory so Ceph Ansible can use it for storing the keys:
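The exact directory name is not reproduced here; any path works as long as it matches the fetch_directory variable used by ceph-ansible. A hypothetical example:
# mkdir ~/ceph-ansible-keys    # assumption: must match fetch_directory in group_vars/all.yml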
Create a symbolic link to the Ansible group_vars directory in the /etc/ansible/ directory:
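Assuming ceph-ansible's default installation path of /usr/share/ceph-ansible, the link can be created as follows (a sketch, since the original command listing is not reproduced):
# ln -s /usr/share/ceph-ansible/group_vars /etc/ansible/group_vars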
Go to /etc/ansible/group_vars and copy an all.yml file from the all.yml.sample file and open it to define configuration options' values:
Define the following configuration options in all.yml for the latest jewel version on CentOS 7:
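The original all.yml listing is not reproduced here. As a hedged sketch only: with ceph-ansible 2.x, options along these lines select the stable Jewel packages and describe the lab network. The variable names vary between ceph-ansible versions, and the interface name and subnet are assumptions based on the 192.168.1.X Vagrant setup described earlier:
# cd /etc/ansible/group_vars
# cp all.yml.sample all.yml
# cat >> all.yml <<'EOF'
fetch_directory: ~/ceph-ansible-keys   # assumption: matches the key directory created earlier
ceph_stable: true                      # assumption: ceph-ansible 2.x variable names
ceph_stable_release: jewel
public_network: 192.168.1.0/24         # assumption: the Vagrant host-only subnet
monitor_interface: eth1                # assumption: the host-only network interface on the VMs
journal_size: 1024
EOF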
Go to /etc/ansible/group_vars and copy an osds.yml file from the osds.yml.sample file and open it to define configuration options' values:
Define the following configuration options in osds.yml for OSD disks; we are co-locating the OSD journal on the OSD data disk:
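Again, the original osds.yml listing is not reproduced. As a hedged sketch, with ceph-ansible 2.x you list the raw devices and enable journal co-location; the device names are assumptions based on the three extra disks attached by Vagrant, and the flag was later replaced by osd_scenario: collocated in newer ceph-ansible releases:
# cd /etc/ansible/group_vars
# cp osds.yml.sample osds.yml
# cat >> osds.yml <<'EOF'
devices:                    # assumption: the three extra disks attached by Vagrant
  - /dev/sdb
  - /dev/sdc
  - /dev/sdd
journal_collocation: true   # assumption: ceph-ansible 2.x name for co-locating journals on the data disks
EOF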
Go to /usr/share/ceph-ansible and add the retry_files_save_path option to ansible.cfg in the [defaults] tag:
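One non-interactive way to add the option (a sketch; the save path value ~/ is an assumption, and you can equally edit ansible.cfg by hand under its existing [defaults] section):
# sed -i '/^\[defaults\]/a retry_files_save_path = ~/' /usr/share/ceph-ansible/ansible.cfg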
Run the Ansible playbook in order to deploy the Ceph cluster on ceph-node1:
To run the playbook, you need site.yml, which is present in the same path: /usr/share/ceph-ansible/. You should be in the /usr/share/ceph-ansible/ path and should run the following commands:
# cp site.yml.sample site.yml
# ansible-playbook site.yml
Once the playbook completes the Ceph cluster installation job and the play recap shows failed=0, it means ceph-ansible has successfully deployed the Ceph cluster on ceph-node1.
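A quick sanity check that is not part of the original text but uses standard commands: after the play recap reports failed=0, confirm the installed version and cluster state from ceph-node1:
[root@ceph-node1 ~]# ceph -v    # show the installed Ceph version
[root@ceph-node1 ~]# ceph -s    # show cluster health, monitor quorum, and OSD status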
