Title Page
Copyright and Credits
Mastering Ceph Second Edition
About Packt
Why subscribe?
Packt.com
Contributors
About the author
About the reviewer
Packt is searching for authors like you
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Section 1: Planning and Deployment
Planning for Ceph
What is Ceph?
How Ceph works
Ceph use cases
Specific use cases
OpenStack or KVM based virtualization
Large bulk block storage
Object storage
Object storage with custom applications
Distributed filesystem – web farm
Distributed filesystem – NAS or fileserver replacement
Big data
Infrastructure design
SSDs
Enterprise SSDs
Enterprise – read-intensive
Enterprise – general usage
Enterprise – write-intensive
Memory
CPU
Disks
Networking
10G requirement
Network design
OSD node sizes
Failure domains
Price
Power supplies
How to plan a successful Ceph implementation
Understanding your requirements and how they relate to Ceph
Defining goals so that you can gauge whether the project is a success
Joining the Ceph community
Choosing your hardware
Training yourself and your team to use Ceph
Running a PoC to determine whether Ceph has met the requirements
Following best practices to deploy your cluster
Defining a change management process
Creating a backup and recovery plan
Summary
Questions
Deploying Ceph with Containers
Technical requirements
Preparing your environment with Vagrant and VirtualBox
How to install VirtualBox
How to set up Vagrant
Ceph-deploy
Orchestration
Ansible
Installing Ansible
Creating your inventory file
Variables
Testing
A very simple playbook
Adding the Ceph Ansible modules
Deploying a test cluster with Ansible
Change and configuration management
Ceph in containers
Containers
Kubernetes
Deploying a Ceph cluster with Rook
Summary
Questions
BlueStore
What is BlueStore?
Why was it needed?
Ceph's requirements
Filestore limitations
Why is BlueStore the solution?
How BlueStore works
RocksDB
Compression
Checksums
BlueStore cache tuning
Deferred writes
BlueFS
ceph-volume
How to use BlueStore
Strategies for upgrading an existing cluster to BlueStore
Upgrading an OSD in your test cluster
Summary
Questions
Ceph and Non-Native Protocols
Block
File
Examples
Exporting Ceph RBDs via iSCSI
Exporting CephFS via Samba
Exporting CephFS via NFS
ESXi hypervisor
Clustering
Split brain
Fencing
Pacemaker and corosync
Creating a highly available NFS share backed by CephFS
Summary
Questions
Section 2: Operating and Tuning
RADOS Pools and Client Access
Pools
Replicated pools
Erasure code pools
What is erasure coding?
K+M
How does erasure coding work in Ceph?
Algorithms and profiles
Jerasure
ISA
LRC
SHEC
Overwrite support in erasure-coded pools
Creating an erasure-coded pool
Troubleshooting the 2147483647 error
Reproducing the problem
Scrubbing
Ceph storage types
RBD
Thin provisioning
Snapshots and clones
Object maps
Exclusive locking
CephFS
MDSes and their states
Creating a CephFS filesystem
How is data stored in CephFS?
File layouts
Snapshots
Multi-MDS
RGW
Deploying RGW
Summary
Questions
Developing with Librados
What is librados?
How to use librados
Example librados application
Example of the librados application with atomic operations
Example of the librados application that uses watchers and notifiers
Summary
Questions
Distributed Computation with Ceph RADOS Classes
Example applications and the benefits of using RADOS classes
Writing a simple RADOS class in Lua
Writing a RADOS class that simulates distributed computing
Preparing the build environment
RADOS classes
Client librados applications
Calculating MD5 on the client
Calculating MD5 on the OSD via the RADOS class
Testing
RADOS class caveats
Summary
Questions
Monitoring Ceph
Why it is important to monitor Ceph
What should be monitored
Ceph health
Operating system and hardware
SMART stats
Network
Performance counters
The Ceph dashboard
PG states – the good, the bad, and the ugly
The good ones
The active state
The clean state
Scrubbing and deep scrubbing
The bad ones
The inconsistent state
The backfilling, backfill_wait, recovering, and recovery_wait states
The degraded state
Remapped
Peering
The ugly ones
The incomplete state
The down state
The backfill_toofull and recovery_toofull states
Monitoring Ceph with collectd
Graphite
Grafana
collectd
Deploying collectd with Ansible
Sample Graphite queries for Ceph
Number of Up and In OSDs
Showing the most deviant OSD usage
Total number of IOPs across all OSDs
Total MBps across all OSDs
Cluster capacity and usage
Average latency
Custom Ceph collectd plugins
Summary
Questions
Tuning Ceph
Latency
Client to Primary OSD
Primary OSD to Replica OSD(s)
Primary OSD to Client
Benchmarking
Benchmarking tools
Network benchmarking
Disk benchmarking
RADOS benchmarking
RBD benchmarking
Recommended tunings
CPU
BlueStore
WAL deferred writes
Filestore
VFS cache pressure
WBThrottle and/or nr_requests
Throttling filestore queues
filestore_queue_low_threshhold
filestore_queue_high_threshhold
filestore_expected_throughput_ops
filestore_queue_high_delay_multiple
filestore_queue_max_delay_multiple
Splitting PGs
Scrubbing
OP priorities
The network
General system tuning
Kernel RBD
Queue depth
readahead
Tuning CephFS
RBDs and erasure-coded pools
PG distributions
Summary
Questions
Tiering with Ceph
Tiering versus caching
How Ceph's tiering functionality works
What is a bloom filter?
Tiering modes
Writeback
Forward
Read-forward
Proxy
Read-proxy
Use cases
Creating tiers in Ceph
Tuning tiering
Flushing and eviction
Promotions
Promotion throttling
Monitoring parameters
Alternative caching mechanisms
Summary
Questions
Section 3: Troubleshooting and Recovery
Troubleshooting
Repairing inconsistent objects
Full OSDs
Ceph logging
Slow performance
Causes
Increased client workload
Down OSDs
Recovery and backfilling
Scrubbing
Snaptrimming
Hardware or driver issues
Monitoring
iostat
htop
atop
Diagnostics
Extremely slow performance or no IO
Flapping OSDs
Jumbo frames
Failing disks
Slow OSDs
Out of capacity
Investigating PGs in a down state
Large monitor databases
Summary
Questions
Disaster Recovery
What is a disaster?
Avoiding data loss
What can cause an outage or data loss?
RBD mirroring
The journal
The rbd-mirror daemon
Configuring RBD mirroring
Performing RBD failover
RBD recovery
Filestore
BlueStore
RBD assembly – filestore
RBD assembly – BlueStore
Confirmation of recovery
RGW Multisite
CephFS recovery
Creating the disaster
CephFS metadata recovery
Lost objects and inactive PGs
Recovering from a complete monitor failure
Using the Ceph object-store tool
Investigating asserts
Example assert
Summary
Questions
Assessments
Chapter 1, Planning for Ceph
Chapter 2, Deploying Ceph with Containers
Chapter 3, BlueStore
Chapter 4, Ceph and Non-Native Protocols
Chapter 5, RADOS Pools and Client Access
Chapter 6, Developing with Librados
Chapter 7, Distributed Computation with Ceph RADOS Classes
Chapter 8, Monitoring Ceph
Chapter 9, Tuning Ceph
Chapter 10, Tiering with Ceph
Chapter 11, Troubleshooting
Chapter 12, Disaster Recovery
Other Books You May Enjoy
Leave a review - let other readers know what you think