Mastering Redis - Jeremy Nelson - E-Book

Jeremy Nelson

Description

Take your knowledge of Redis to the next level to build enthralling applications with ease

About This Book

  • Detailed explanation of Redis as a data structure server with powerful strings, lists, sets, sorted sets, and hashes
  • Learn to scale your data with Redis Cluster's distributed setup
  • This is a fast-paced, practical guide full of screenshots and real-world examples to help you get to grips with Redis in no time

Who This Book Is For

If you are a software developer with some experience with Redis and would now like to elevate your Redis knowledge and skills even further, then this book is for you.

What You Will Learn

  • Choose the right Redis data structure for your problem
  • Understand the Redis event loop and implement your own custom C commands
  • Solve complex workflows with Redis server-side scripting with Lua
  • Configure your Redis instance for optimal memory management
  • Scale your data in a distributed manner with Redis Cluster
  • Improve the stability of your Redis solution using Redis Sentinel
  • Complement your existing database and NoSQL environment with Redis
  • Exploit a wide range of features provided by Redis to become a DevOps expert.

In Detail

Redis is the most popular open source key-value data structure server, providing a wide range of capabilities on which multiple platforms can be built. Its fast and flexible data structures give your existing applications an edge in the development environment.

This book is a practical guide that aims to help you dive deep into the world of Redis data structures and exploit its excellent features. We start our journey by briefly examining the need for Redis, followed by an explanation of advanced key management. Next, you will learn about design patterns, best practices for using Redis in a DevOps environment, and the Docker containerization paradigm in detail. After this, you will understand the concept of scaling with Redis Cluster and Redis Sentinel, followed by a thorough explanation of incorporating Redis with NoSQL technologies such as Elasticsearch and MongoDB. At the end of this section, you will be able to develop competent applications using these technologies. You will then explore the message queuing and task management features of Redis and will be able to implement them in your applications. Finally, you will learn how Redis can be used to build real-time data analytics dashboards for disparate data streams.

Style and approach

This is a hands-on guide full of easy-to-follow examples that illustrate important concepts and techniques for solving complex problems with Redis.

Page count: 455

Year of publication: 2016

Table of Contents

Mastering Redis
Credits
About the Author
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
The philosophy behind Redis
What this book covers
Earn your Mastering Redis Open Badge
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Why Redis?
Is Redis right for me?
Experimenting with Redis
Popular usage patterns
Redis isn't right because …try again soon!
Summary
2. Advanced Key Management and Data Structures
Redis keys
Redis key schema
Key delimiters and naming conventions
Manually creating a Redis schema
Deconstructing a Redis object mapper
Key expiration
Key cautions
Big O notation
Computing big O notation for custom code
Reviewing the time complexity of Redis data structures
Strings
Hashes
Lists
Sets
Sorted sets
Advanced sorted set operations
Bitstrings and bit operations
HyperLogLogs
Summary
3. Managing RAM – Tips and Techniques for Redis Memory Management
Configuring Redis
Master-slave
32-bit Redis
About the INFO memory
Key expiration
LRU key evictions
Creating memory efficient Redis data structures
Small aggregate hashes, lists, sets, and sorted sets
Bits, bytes, and Redis strings as random access arrays
Optimizing hashes for efficient storage
Hardware and network latencies
Operating system tips
Summary
4. Programming Redis Part One – Redis Core, Clients, and Languages
Redis internals
Understanding redis.h and redis.c
Getting ready for Redis development with Git
Exercise – creating your own redis command
Redis Serialization Protocol (RESP)
Pipelining
Redis RDB format
Coroutines using Redis and Python
Todo list application using Node.js and Redis
Replication and public access
Summary
5. Programming Redis Part Two – Lua Scripting, Administration, and DevOps
The use of Lua in Redis
Using KEYS and ARGV with Redis
Advanced Lua scripting with Redis
MARC21 ingestion
Online Storefront Paper Stationery
Interoperability using JSON-LD, Lua, and Redis
Redis Lua Debugger
Programming Redis administration topics
Master-Slave replication
Transactions with MULTI and EXEC
Redis role in DevOps
Summary
6. Scaling with Redis Cluster and Sentinel
Approaches to partitioning data
Range partitioning
List partitioning
Hash partitioning
Composite partitioning
Key hash tags
Clustering Redis with Twemproxy
Testing Twemproxy with Linked Data Fragments server
Redis Cluster background
Overview of running Redis Cluster
Using Redis Cluster
Live reconfiguration and resharding Redis cluster
Failover
Replacing or upgrading nodes in Redis Cluster
Monitoring with Redis Sentinel
Sentinel for Area Code List Partition
Summary
7. Redis and Complementary NoSQL Technologies
The proliferation of NoSQL
Redis as an analytics complement to MongoDB
Redis as a preprocessor complement to ElasticSearch
Using Redis and ElasticSearch in BIBCAT
ElasticSearch, Logstash, and Redis
Redis as a smart cache complement to Fedora Commons
Summary
8. Docker Containers and Cloud Deployments
Linux containers
Docker basics with Redis
Layers in Docker images
Docker filesystem backends
Building images with a Dockerfile
Hosting and publishing Docker images
Docker and Redis issues
Packaging your application with Docker Compose
Redis and AWS
Dedicated cloud hosting options
Redis Labs
DigitalOcean Redis
Summary
9. Task Management and Messaging Queuing
Overview of Redis Pub/Sub
Pub/Sub RESP replies
SUBSCRIBE and UNSUBSCRIBE RESP Arrays
PSUBSCRIBE and UNSUBSCRIBE arrays
Pub/Sub with Redis CLI
Redis Pub/Sub in action
First workstation using Python Pub/Sub
Second workstation Node.js Pub/Sub
Third workstation Lua Client Pub/Sub
Redis keyspace notifications
Task management with Redis and Celery
GIS and RestMQ
Adding task management with RestMQ
Messaging with Redis technologies
Messaging with Disque
Summary
10. Measuring and Managing Information Streams
Extracting, transforming, and loading information with Redis
Extracting JSON to transform into RESP
Security considerations when managing Redis
Redis protected mode
Command obfuscation
Operational monitoring with a Redis web dashboard
Machine learning and Redis
Naïve Bayes and work classification
Creating training and testing datasets
Extracting word Tokens from BIBFRAME Works
Applying Naïve Bayes
Linear regression with Redis
Summary
A. Sources
Chapter 1: Why Redis?
Chapter 2: Advanced Key Management and Data Structures
Chapter 3: Managing RAM – Tips and Techniques for Redis Memory Management
Chapter 6: Scaling with Redis Cluster and Sentinel
Chapter 7: Redis and Complementary NoSQL Technologies
Chapter 10: Measuring and Managing Information Streams
Index

Mastering Redis

Copyright © 2016 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: May 2016

Production reference: 1260516

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78398-818-1

www.packtpub.com

Credits

Author

Jeremy Nelson

Reviewers

Emilien Kenler

Saurabh Minni

Commissioning Editor

Kunal Parikh

Acquisition Editor

Harsha Bharwani

Content Development Editors

Kirti Patil

Mayur Pawanikar

Technical Editors

Utkarsha Kadam

Tanmayee Patil

Copy Editor

Merilyn Pereira

Project Coordinator

Nidhi Joshi

Proofreader

Safis Editing

Indexer

Rekha Nair

Graphics

Abhinash Sahu

Production Coordinator

Aparna Bhagat

Cover Work

Aparna Bhagat

About the Author

Jeremy Nelson is the metadata and systems librarian at Colorado College, a 4-year private liberal arts college in Colorado Springs. In addition to working 8 hours a week on the library's research helpdesk, providing information literacy instructions to undergraduates, and supervising the library's systems and cataloguing departments, Nelson is actively researching and developing various components and open source tools in the Catalog Pull Platform for use by Colorado College, the Colorado Alliance of Research Libraries Consortium, and the Library of Congress. He is also co-founder and CTO of KnowledgeLinks.io, a semantic web startup.

His previous library experience includes jobs at Western State Colorado University and the University of Utah. Prior to becoming a librarian, he worked as a programmer and project manager at various software companies and financial services institutions. His first book, Becoming a Lean Library, published in 2015, applies lean startup and lean manufacturing ideas to libraries and library operations. Nelson's undergraduate degree is from Knox College and his master of science in library and information science is from the University of Illinois Urbana-Champaign.

About the Reviewers

Emilien Kenler, after working on small web projects, began focusing on game development in 2008 while he was in high school. Until 2011, he worked for different groups and specialized in system administration.

In 2011, he founded a company that sold Minecraft servers while studying computer science engineering. He created a lightweight IaaS (https://github.com/HostYourCreeper/) based on new technologies such as Node.js and RabbitMQ.

Thereafter, he worked at TaDaweb as a system administrator, building its infrastructure and creating tools to manage deployments and monitoring.

In 2014, he began a new adventure at Wizcorp, Tokyo. The same year, Emilien graduated from the University of Technology of Compiègne.

Emilien has written MariaDB Essentials for Packt Publishing. He has also contributed as a reviewer on Learning Nagios 4, MariaDB High Performance, OpenVZ Essentials, Vagrant Virtual Development Environment Cookbook, and Getting Started with MariaDB - Second Edition, all books by Packt Publishing.

Saurabh Minni has an engineering degree with specialization in computer science. A polyglot programmer with over 10 years of experience, he has worked in a variety of technologies, including Assembly, C, C++, Java, Delphi, JavaScript, Android, iOS, PHP, Python, ZMQ, Redis, Mongo, Kyoto Tycoon, Cocoa, Carbon, Apache Kafka, Apache Storm, and ElasticSearch. In short, he is a programmer at heart and loves learning new tech-related things each day.

Currently, he is working as technical architect at Near (an amazing start-up building a location intelligence platform). Apart from handling several projects, he was also responsible for deploying an Apache Kafka cluster. This was instrumental in streamlining the consumption of data in big data processing systems such as Apache Storm, Hadoop, and so on at Near.

Saurabh is also the author of a book on Apache Kafka, Apache Kafka Cookbook, Packt Publishing.

He has also been a reviewer on the book Learning Apache Kafka, Packt Publishing.

He is reachable on Twitter at @the100rabh and on GitHub at https://github.com/the100rabh/.

This book would not have been possible without the continuous support of my parents, Suresh and Sarla, and my wife, Puja. Thank you for always being there.

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Preface

The intention of Mastering Redis is to build upon your basic knowledge of Redis in two ways: by providing a deeper understanding of the context and theory behind Redis and its technologies, and by increasing your practical day-to-day skills with Redis. The Mastering in this book's title implies an ongoing process and not an end destination. What is exciting about Redis is its ongoing and public evolution into the powerful data manipulation and storage technology of today.

The philosophy behind Redis

Salvatore Sanfilippo has, over the lifespan of the project, articulated a distinct view and opinion about the direction and functionality of Redis. In a January 2015 blog post about benchmarking Redis against other databases, Sanfilippo states, "I don't want to convince developers to adopt Redis. We just do our best in order to provide a suitable product, and we are happy if people can get work done with it. That's where my marketing wishes end." Sanfilippo and a small core group of Redis developers follow the successful open source governance model of the "benevolent dictator for life" (BDL), where a single person is the ultimate arbiter of what is committed into the Redis code base. The success of the BDL model, evidenced by open source projects such as the Linux kernel and the Python programming language, is replicated in Redis with Sanfilippo as its primary developer and maintainer.

The BDL model's failure modes can be catastrophic if the dictator abandons the project or, worse, is incapacitated through illness or death. Another significant problem, which has emerged with Redis in particular, arises when potential contributors submit pull requests and action on them is delayed or, more often, ignored. To be fair, the volume of changes that must be examined, tested, and merged into the main code base can be substantial and requires a passionate and dedicated gatekeeper. Linus Torvalds, the initial creator and current BDL for the Linux kernel project, has seen his role evolve more into merging code contributed by others and providing vision and leadership for Linux than writing code himself. Sanfilippo, while acknowledging this problem in a thread on the main Redis e-mail distribution list, gives two main reasons for continuing with the current BDL model for Redis:

  • A consistent vision for the project's development and future directions
  • Accountability for any new or merged changes

Sanfilippo's vision of Redis, as an easy-to-configure, small-memory-footprint (for itself and NOT for its datasets!) and reliable key-value data store has been crucial to the continued rise in Redis's popularity among developers and organizations. His vision does cause tension, especially when new features for Redis are proposed, such as expiring specific sub-values in a hash or offering loadable modules for optional functionality, and these features are rejected for inclusion into Redis. Sanfilippo's desire to keep Redis small and focused on being a memory-only database drives his decisions and development practice.

In a 2011 blog post, he elucidated his vision in a seven-point manifesto for Redis and the Redis development process. Briefly, here are the seven points:

  • A DSL for Abstract Data Types: Redis is a Domain-Specific Language (DSL) for representing and using abstract data structures. These data structures include both the operations (Redis commands) as well as the memory efficiency and time complexity of storing and manipulating those data structures with the associated Redis commands.
  • Memory storage is #1: By storing all of the data in a computer's RAM, Redis's performance across different systems is more consistent, the various algorithms used to implement these data structures run in a more predictable fashion, and more complex data types such as sorted sets are easier to implement in an in-memory database.
  • Fundamental data structures for a fundamental API: Redis implements a fundamental API for its fundamental data structures. This API, made up of Redis commands and corresponding data structures, tries to intelligibly resemble the data structures the API reads and writes to the computer's memory. Following this design, the Redis API builds more complex operations from simpler operations on the data structures in the API.
  • Code is like a poem: This is the most elusive of the seven points in the manifesto. Sanfilippo gives his aesthetic preference for code that fits into a larger narrative of the entire Redis project. His point is that Redis's coding style and approach are geared for humans to construct a narrative. So, inclusion of third-party code depends in part on how well the code fits into the larger narrative of Redis and Redis's source code.
  • We're against complexity: Complexity in code is to be avoided. Given a choice between building a small feature that requires a lot of implementation code and forgoing the functionality, Redis will take the latter route and avoid the extra complexity and overhead in the code base.
  • Two levels of API: Redis starts with a subset of its API to run in a distributed manner and a larger, more functionality-rich API to support multikey operations. This separation allows significant features such as the Redis master-slave and Redis cluster modes of operation.
  • We optimize for joy: This is both an emotional appeal and a very intelligent statement. For developers and operators of technology in general, the thrill of tuning technology to solve difficult and complex problems does elicit feelings of happiness and excitement about the future possibilities of Redis.

What this book covers

As you read Mastering Redis, two themes will emerge that parallel the development/operations dualism of the popular set of practices commonly known as DevOps. To help guide your approach to the material, each chapter's topics are identified as focused on either software development or system operations. Because the line between the two is increasingly blurred, a working understanding of the topics on each track improves your and your team's ability to quickly and efficiently develop and deploy Redis solutions, whether for a single project or as a piece of your technology infrastructure.

In the following diagram, each chapter's horizontal position visually represents whether the topics weigh towards software development or systems operations:

DevOps Chapter Tracks

Chapter 1, Why Redis?, introduces the Redis development philosophy as articulated by Salvatore Sanfilippo, the founder and primary maintainer of Redis.

Chapter 2, Advanced Key Management and Data Structures, builds upon your basic knowledge of Redis by expanding and explaining Redis data structures and key management, including the important topic of constructing meaningful and expressive key schemas for your applications.

Chapter 3, Managing RAM – Tips and Techniques for Redis Memory Management, looks at the various options Redis provides to optimize memory usage in your applications, including Redis support for various caching and key eviction strategies based on Least Recently Used (LRU) implementations in Redis.

Chapter 4, Programming Redis Part One – Redis Core, Clients, and Languages, is an advanced topic on programming applications. This chapter starts with an overview of Redis's core C programming language implementation and includes an in-depth examination of selected C code snippets to deepen your knowledge of Redis. It continues with how to use three different Redis clients, with short programming exercises in Python, Node.js, and Haskell.

Chapter 5, Programming Redis Part Two – Lua Scripting, Administration, and DevOps, is an advanced topic on programming applications. It starts with an overview of Redis server-side Lua scripting and how to use Lua more effectively with Redis. The chapter next expands on a few popular programming design patterns with Redis, with specific examples of how different people and companies have used these patterns in their operations. This chapter ends with how Redis is used in typical DevOps scenarios from the perspective of a software developer.

Chapter 6, Scaling with Redis Cluster and Sentinel, explores two relatively recent additions to Redis: Redis Cluster and Redis Sentinel. Redis Sentinel is a special high-availability mode for monitoring the health of masters and slaves, along with the ability to switch over if a failure occurs in any master or slave Redis instance. Redis Cluster is now a production-ready way to store large amounts of data that may be too big to fit into the memory of a single machine, by sharding keys across multiple Redis instances. While these topics have more of an operational focus, engineers building solutions with Redis should, at a minimum, know the benefits and limitations of Redis Cluster.

Chapter 7, Redis and Complementary NoSQL Technologies, starts with the recognition that, for most organizations, the information technology stack includes a heterogeneous mixture of different types of data and processing solutions. Redis is an ideal way to extend the functionality of other NoSQL data storage options, and in this chapter, we'll see how Redis can be used with MongoDB, ElasticSearch, and Fedora Digital Repository. This chapter should be of interest to both developers and system administrators who may need to develop and support complex business requirements with multiple solutions.

Chapter 8, Docker Containers and Cloud Deployments, shows how running Redis in Docker containers and images can simplify management and improve the security and reliability of your Redis solutions. Docker is an open source container technology for applications that is rapidly being adopted by many enterprises. Building upon Docker with Redis, we'll then examine specific challenges of using Redis on the most popular cloud computing providers, starting with the largest and most established, Amazon Web Services, followed by Google's Compute Engine and Microsoft Azure, with special attention to other cloud service providers such as Rackspace and DigitalOcean. We'll finish the chapter by examining specialized cloud services that focus on hosting and managing your Redis instances.

Chapter 9, Task Management and Messaging Queuing, begins with an in-depth exploration of Redis Pub/Sub commands. This involves first looking at various examples of how publishers and consumers can communicate between different processes, programs, Redis clients, operating systems, and remote computers. Further in the chapter, we'll expand upon Redis Pub/Sub and look more generally at using Redis as a messaging queue between different layers in an enterprise computing ecosystem. This chapter ends by wrapping up all the concepts through a detailed example of using Redis with Celery as task management and a messaging queue with Pub/Sub support.

Chapter 10, Measuring and Managing Information Streams, builds upon the previous chapter's concepts to show how Redis can be used as a real-time data aggregator for the disparate data streams of the various technology systems used within an organization. We'll then examine the Redis security model and the new security features in the latest version of Redis. A web-based operational dashboard will visualize the incoming data flows into Redis using our knowledge of Redis clients. Next, we'll show how to apply machine learning algorithms, such as Naive Bayes, to these Redis-based information flows to provide a richer snapshot and deepen your understanding of the operations occurring within an organization or department.

Appendix, Sources, acknowledges the source of extracts used in the chapters and presents links chapter-wise for further reading.
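
The key sharding mentioned for Chapter 6 can be made concrete with a minimal sketch. Redis Cluster assigns each key to one of 16384 hash slots by taking the CRC16 of the key modulo 16384, and a non-empty {tag} inside a key restricts hashing to the tag so that related keys share a slot. The function names below are illustrative assumptions and are not taken from the book or from any Redis client library.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT, XMODEM variant: polynomial 0x1021, initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: str) -> int:
    """Map a key to one of the 16384 Redis Cluster hash slots (illustrative helper)."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty {tag} restricts hashing to the tag's contents.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


if __name__ == "__main__":
    # Keys sharing the tag {user1000} land in the same slot.
    print(hash_slot("{user1000}.following") == hash_slot("{user1000}.followers"))
```

Because both keys hash only the text inside the braces, multikey operations on them can be served by a single node.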

Earn your Mastering Redis Open Badge

The Mozilla Foundation—the same open source organization that sponsors the development of the Firefox web browser—started a project called Open Badges that allows organizations to create and then issue portable and non-proprietary badges to individuals to signal accomplishments:

At the Mastering Redis website, you have the opportunity to signal to your current and potential employers your increased knowledge and skills with Redis by taking a series of online quizzes and earning your Mastering Redis Open Badge. Your Open Badge can be shared through popular social networking sites such as Facebook, Twitter, or LinkedIn.

The Mastering Redis Open Badge is free to readers who have purchased the book. Readers who don't own a copy can still earn the Open Badge at the book's website for a nominal fee. Earning the badge also gives you the opportunity to connect with other badge earners, learning from their experiences with Redis while sharing your own stories and knowledge, which encourages learning long after you have finished reading Mastering Redis. Our hope is that this book can immediately help your understanding of Redis and that by earning your Open Badge, you can document this professional achievement.

What you need for this book

Redis is intended to be run in a POSIX-based environment such as Linux or Mac OS X with a modern C compiler. Microsoft Windows versions of Redis are available but not officially supported. Please see the Windows section at http://redis.io/download for more information. Examples in this book also use Python 3.5 with the Redis Python client (https://github.com/andymccurdy/redis-py), Lua, and Node.js with the Redis Node.js client (https://github.com/NodeRedis/node_redis).

Who this book is for

If you are a software developer with some experience with Redis and would now like to elevate your Redis knowledge and skills even further, then this book is for you.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

  1. Log in or register to our website using your e-mail address and password.
  2. Hover the mouse pointer on the SUPPORT tab at the top.
  3. Click on Code Downloads & Errata.
  4. Enter the name of the book in the Search box.
  5. Select the book for which you're looking to download the code files.
  6. Choose from the drop-down menu where you purchased this book from.
  7. Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR / 7-Zip for Windows
  • Zipeg / iZip / UnRarX for Mac
  • 7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Mastering-Redis. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/MasteringRedis_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. Why Redis?

Why Redis? Or, why any technology? Such questions are often mumbled under the breath or asked by the more brave, cynical, or knowledgeable when encountering any new technology or service. Sometimes the answer is obvious: the technology or service offers features and functionality that meet an immediate need or solve a vexing problem. In most situations, the reasons for adopting a technology are not as clear-cut or apparent, or are cloaked in hyperbolic or indecipherable marketing jargon. Depending on your needs, Redis falls closer to the obvious end of the spectrum than to the marketing sales pitch. You may already know and have used Redis for some purposes, such as meeting a data storage need or service requirement for an application, but you may not be aware of all that Redis can do or how other people are using Redis in their own organizations. Redis, best known for its speed, is not only fast in its execution but also fast in the sense that solutions built with Redis iterate quickly because of the ease of configuring, setting up, running, and using Redis.

The growing popularity of Redis, an open source key-value NoSQL technology, is a result of Redis's stability, power, and flexibility in executing a wide range of data operations and tasks in the enterprise. REmote DIctionary Server (Redis) is used by a diverse set of companies, from start-ups to the largest technology companies such as Twitter and Uber, as well as by individuals and teams in government, schools, and organizations. We'll start this chapter with a short survey of a few popular design patterns for Redis and then provide practical advice on determining whether Redis is the right choice for you.

We'll then go through a detailed example of how Redis can represent a legacy metadata format used by public and academic libraries, including some museums, to illustrate Redis's flexibility and power with just three data structures and an intentional key design. Finishing this chapter off, we'll touch upon functionality and commands recently added to Redis.

Is Redis right for me?

A relatively common question posted to the general Redis mailing list asks whether Redis is a good choice for a variety of uses, such as running reviews on a website, caching results from MySQL database queries, or meeting other specific requirements that the poster might have for a project, product, website, or system. In general, Redis excels as a tool for fast reads and writes of data and has been used with great success by small and large organizations alike for a wide range of uses. Salvatore Sanfilippo makes a strong case that Redis does not need to replace existing databases but is an excellent addition to an enterprise for new functionality or to solve sometimes intractable problems [1].

Being a single-threaded application with a small memory footprint, Redis achieves durability and scalability by running multiple instances on the multicore processors available in data centers and from cloud providers. With Redis's rich master-slave replication, and now with Redis Cluster released for production use, creating multiple Redis instances is a relatively cheap operation in terms of memory and CPU requirements, allowing you to both scale and increase the durability of your larger applications.

Redis allows you to conceptualize and approach challenging data analysis and data manipulation problems in a very different manner from a typical relational data model. In an SQL-based relational database, the developer or database administrator creates a database schema that organizes the solution domain by normalizing the data into columns, rows, and tables, with joins connecting them through foreign-key relationships.

Even other NoSQL data storage technologies such as MongoDB or Elasticsearch require the data to be modeled as JSON document data structures first before being loaded into the actual storage. Redis skips this intermediate but necessary step in these other technologies, by just providing sets of commands for specific data structures such as strings, lists, hashes, sets, and sorted sets. In this approach, you are algorithmically interacting with your data, constructing solutions directly with how the data is stored in Redis and the available commands, and enabling a more direct tuning and monitoring of the underlying operating system's memory and hard disk space.

Thinking about how data is represented and managed as basic computing data structures such as lists, hashes, and sets allows you to grasp both the positive and negative characteristics of the data and its structures in a more fundamental, mathematical fashion. Going through an intermediate structuring process, such as normalizing your data for a relational database or converting it into a JSON document for MongoDB or Elasticsearch, while valuable, imposes a structure that Redis does not. As you architect your solutions, you may discover that your data and your problem need more of the persistence and structure of a technology other than Redis, but in the meantime, your exploration of the properties and the structure of data in Redis will be a useful exercise because of this algorithmic approach to your information and problem.

Redis may not be the best technology to use when you have a large amount of infrequently used data that does not require immediate access. An SQL-based relational database or a document-store NoSQL technology such as CouchDB or MongoDB may be a better choice than Redis. However, with Redis Cluster now fully supported as of version 3, large datasets can be sharded and used in Redis as a distributed key-value data store. As more organizations and individuals gain experience with the use of Redis Cluster, expect that this reason to not choose Redis for a project will fade away.

Experimenting with Redis

Redis's rich set of data types allows for easy and fast experimentation of data-based algorithms and approaches on information. In my own experience with Redis, this ability to quickly model and use solutions is based on the characteristics of the different data structures of Redis and the flexibility in defining the structure and syntax of the keys. I was impressed and excited to be able to name a chunk of malleable data and to relate this name with other keys through the naming semantics of the key. This is a great feature of Redis that is sometimes underappreciated as to how powerful and useful a tool it can be in developing and understanding your data.

I first started experimenting with Redis in 2011 as a metadata and systems librarian at Colorado College at the base of the Pikes Peak Mountain in Colorado. Most libraries around the world store and structure their bibliographic data in a somewhat surprisingly durable binary format called MAchine-Readable Cataloging (MARC), substantially developed in the late 1960s by Henriette Avram of the United States Library of Congress. The current version, MARC 21, is officially supported by the Library of Congress (however, it is in the process of replacing MARC with a new RDF-based linked data vocabulary called BIBFRAME). MARC 21 initially encoded information about the books on the library's shelves and has been extended to support e-books available for checkout; video, music, and audio formats; and physical formats such as CDs and Blu-ray discs as well as online streaming formats. For academic libraries, in fact, an increasingly large percentage of the budget is devoted to the purchase of journal articles through online publishers and electronic-content vendors.

The MARC format is made up of both fixed-length and variable-length fields numbered in the three-digit range of 001–999, which in turn can have either character data or subfields with data. In addition, each field can have up to two indicators that modify the meaning of the field. Two of the most common and important MARC fields are the 100 Main Entry – Personal Name field and the 245 Title Statement field. Here is an example from David Foster Wallace's book Infinite Jest:

=100 1\$aWallace, David Foster
=245 10$aInfinite jest :$ba novel$cDavid Foster Wallace

To use this MARC data in Redis, each MARC record is identified by a key modeled as marc:{counter}, with the counter being a global incremental counter. Each MARC field is a hash with the key modeled as marc:{counter}:{field}. As some MARC fields are repeatable with different information, the hash key also includes a per-field counter, as in marc:{counter}:{field}:{field-counter}. Simply storing these two fields would result in the following six Redis commands:

127.0.0.1:6379> INCR marc
(integer) 1
127.0.0.1:6379> INCR marc:1:100
(integer) 1
127.0.0.1:6379> HSET marc:1:100:1 a "Wallace, David Foster"
(integer) 1
127.0.0.1:6379> INCR marc:1:245
(integer) 1
127.0.0.1:6379> HMSET marc:1:245:1 a "Infinite jest :" b "a novel" c "David Foster Wallace"
OK
127.0.0.1:6379> HGETALL marc:1:245:1
1) "a"
2) "Infinite jest :"
3) "b"
4) "a novel"
5) "c"
6) "David Foster Wallace"
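The key pattern above can be sketched in Python without a running Redis server. This is only an illustration under stated assumptions: the helper name marc_commands and the dictionary that stands in for INCR are hypothetical, and the function builds the command strings rather than sending them to Redis.

```python
from collections import defaultdict


def marc_commands(fields):
    """Build redis-cli style commands for one MARC record.

    `fields` is a list of (tag, subfields) pairs, where subfields is a
    list of (code, value) pairs. A dict stands in for Redis INCR.
    """
    counters = defaultdict(int)

    def incr(key):
        counters[key] += 1
        return counters[key]

    commands = ["INCR marc"]
    record_id = incr("marc")  # global record counter
    for tag, subfields in fields:
        counter_key = f"marc:{record_id}:{tag}"
        commands.append(f"INCR {counter_key}")
        # The per-field counter distinguishes repeated MARC fields.
        field_key = f"{counter_key}:{incr(counter_key)}"
        pairs = " ".join(f'{code} "{value}"' for code, value in subfields)
        verb = "HSET" if len(subfields) == 1 else "HMSET"
        commands.append(f"{verb} {field_key} {pairs}")
    return commands


record = [
    ("100", [("a", "Wallace, David Foster")]),
    ("245", [("a", "Infinite jest :"), ("b", "a novel"),
             ("c", "David Foster Wallace")]),
]
for command in marc_commands(record):
    print(command)
```

Passing the same tag twice, for example two 856 fields, yields keys ending in :1 and :2, which is the repeatable-field case described above.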

This key structure in Redis looks like the following:

MARC in Redis

The storage of MARC data in Redis can be accomplished with just a single Redis data type, a hash, along with a consistent key syntax structure. Improving the usability of this bibliographic data in Redis, and realizing the very common use case of retrieving library data as a list of records sorted alphanumerically by title and author name (in library parlance, two access points), can be accomplished with other Redis data types such as lists or sorted sets.

Representing MARC fields and subfields in Redis by using hashes and lists was informative. Further, I wanted to see if Redis could handle other types of book and material metadata models that were being put forward as replacements for MARC. The Functional Requirements for Bibliographic Records, or FRBR, was a document that put forward an alternative to MARC and was based on entity-relationship (ER) models. The FRBR ER model contained groups of properties that were categorized according to abstraction. The most abstract is the Work class, which represents the most general properties to uniquely identify a creative artifact with such information as titles, authors, and subjects.

The Expression class is made of properties such as edition and translations with a defined relationship to the parent Work. Manifestations and Items are the final two FRBR classes, capturing more specific data where Item is a physical object that is a specific instance of a more general Manifestation.

With few actual systems or technologies that implement an FRBR model for library data, Redis offers a way to test such a model with actual data. Using existing mappings of MARC data to FRBR's Work, Expression, Manifestation, and Item, the MARC 100 and 245 fields from the above would be mapped to an FRBR Work in Redis as shown by these examples of using the Redis command-line tool, redis-cli, to connect to a Redis instance:

127.0.0.1:6379> HMSET frbr:work:1 title "Infinite Jest" "created by" "David Foster Wallace"
OK

This new work, frbr:work:1 can be associated with the remaining classes with the following Redis keys and hashes:

127.0.0.1:6379> HMSET frbr:expression:1 date 1996 "realization of" frbr:work:1
OK
127.0.0.1:6379> HMSET frbr:manifestation:1 publisher "Little, Brown and Company" "physical embodiment of" frbr:expression:1
OK
127.0.0.1:6379> HMSET frbr:item:1 'exemplar of' frbr:manifestation:1 identifier 33027005910579
OK

In the previous example for Expression, a specific date is captured along with a relationship back to frbr:work:1 through the "realization of" property. Similarly, the frbr:manifestation:1 hash has two fields: "publisher" and "physical embodiment of". The value of the "physical embodiment of" field is the frbr:expression:1 key, which links the Manifestation back to the Expression. Finally, the frbr:item:1 hash has a barcode identifier property and a relationship key, "exemplar of", back to the frbr:manifestation:1 hash.
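Because each hash stores the key of its parent, the chain from an Item back to its Work can be followed one lookup at a time. The sketch below is a minimal illustration in which plain Python dictionaries stand in for the Redis hashes created above; the function name path_to_work is hypothetical, and with a real client each step would be an HGET against the named key.

```python
# Plain dicts stand in for the Redis hashes created above.
hashes = {
    "frbr:work:1": {
        "title": "Infinite Jest",
        "created by": "David Foster Wallace",
    },
    "frbr:expression:1": {"date": "1996", "realization of": "frbr:work:1"},
    "frbr:manifestation:1": {
        "publisher": "Little, Brown and Company",
        "physical embodiment of": "frbr:expression:1",
    },
    "frbr:item:1": {
        "exemplar of": "frbr:manifestation:1",
        "identifier": "33027005910579",
    },
}

# Relationship fields that point from a more specific class to its parent.
PARENT_FIELDS = ("exemplar of", "physical embodiment of", "realization of")


def path_to_work(key):
    """Follow the relationship fields from any FRBR key up to its Work."""
    path = [key]
    while True:
        entity = hashes[path[-1]]
        parent = next((entity[f] for f in PARENT_FIELDS if f in entity), None)
        if parent is None:  # a Work has no parent relationship
            return path
        path.append(parent)


print(path_to_work("frbr:item:1"))
```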

In both the MARC and FRBR experiments, the Redis hash data structure provided the base representation for the entity. This strategy starts to fail when there can be more than one value for a specific property, such as when representing multiple authors of a work. The first attempt to solve this problem for those properties with multiple values is by creating a counter for each MARC field as outlined above. For example, the MARC 856 field – Electronic Location and Access – stores the URL for e-books or other material that has a network-resolvable URL. If we want to add two URLs to the preceding MARC example, such as a link to the book in Google Books and a wiki on the book, the Redis commands would be as follows:

127.0.0.1:6379> INCR global:marc:1:856
(integer) 1
127.0.0.1:6379> HMSET marc:1:856:1 ind1 4 ind2 1 u https://books.google.com/books?id=Nhe2yvx6hP8C
OK
127.0.0.1:6379> HMSET marc:1:856:2 ind1 4 ind2 2 u http://infinitejest.wallacewiki.com/
OK

This naming approach for the MARC keys meets the requirement for repeating MARC fields, but how can we support the edge case wherein a single MARC field has multiple, repeating subfields? The first pass at solving this problem may be to store a string with some delimiter between each subfield as the value for a particular field in the MARC record. This would require additional parsing on the client side to extract all the different subfields, and we would lose any additional advantages that Redis may provide if these multiple subfields were stored directly in Redis. The second approach to handling multiple subfields in a MARC field would be to further expand the Redis key syntax and use a list or some other data structure as the value for each subfield key. Expanding the MARC 856 example, if we wanted to add a second e-book URL, maybe a URL to the Amazon Kindle version, it would look like the following in Redis:

127.0.0.1:6379> LPUSH marc:1:856:1:u https://books.google.com/books?id=Nhe2yvx6hP8C http://www.amazon.com/Infinite-Jest-David-Foster-Wallace/
(integer) 2
127.0.0.1:6379> HSET marc:1:856:1 u marc:1:856:1:u
(integer) 0

Storing multiple subfields in a Redis list works well, but what if I don't want any duplicate values in a MARC field's subfields? This can be easily solved by the use of Redis's set data type, which, by definition, only contains unique values. The use of sets for the subfield values seems like a good solution, but it fails if we need to keep the ordering of the values in the subfield.

Fortunately, Redis's sorted set data type fits our use case admirably by ensuring a collection of unique subfield values with no duplications while also maintaining the subfield ordering. The resulting Redis commands for storing the URLs of a book in the MARC 856 field would look like the following:

127.0.0.1:6379> DEL marc:1:856:1:u
(integer) 1
127.0.0.1:6379> ZADD marc:1:856:1:u 1 https://books.google.com/books?id=Nhe2yvx6hP8C 2 http://www.amazon.com/Infinite-Jest-David-Foster-Wallace/
(integer) 2
127.0.0.1:6379> ZRANGE marc:1:856:1:u 0 -1 WITHSCORES
1) "https://books.google.com/books?id=Nhe2yvx6hP8C"
2) "1"
3) "http://www.amazon.com/Infinite-Jest-David-Foster-Wallace/"
4) "2"
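The trade-offs among lists, sets, and sorted sets for the subfield values can be seen side by side in a small Python sketch. It uses ordinary Python containers as stand-ins for the Redis types, and the zadd_nx and zrange_all helpers are hypothetical names that only mimic the spirit of ZADD with the NX option and ZRANGE 0 -1.

```python
urls = [
    "https://books.google.com/books?id=Nhe2yvx6hP8C",
    "http://www.amazon.com/Infinite-Jest-David-Foster-Wallace/",
    "https://books.google.com/books?id=Nhe2yvx6hP8C",  # accidental repeat
]

# List: keeps insertion order, but also keeps the duplicate.
as_list = list(urls)

# Set: drops the duplicate, but offers no guaranteed ordering.
as_set = set(urls)

# Sorted set: a member -> score mapping, read back ordered by score.
sorted_set = {}


def zadd_nx(score, member):
    """Add a member only if it is not already present."""
    if member in sorted_set:
        return 0
    sorted_set[member] = score
    return 1


def zrange_all():
    """Return every member ordered by score, then by member for ties."""
    return sorted(sorted_set, key=lambda m: (sorted_set[m], m))


for position, url in enumerate(urls, start=1):
    zadd_nx(position, url)

print(len(as_list), len(as_set), zrange_all())
```

Only the sorted set stand-in ends up with two unique URLs in their original order, which matches the reasoning that led from lists to sets to sorted sets.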

In this example, we examined how to represent a legacy format for library data called MARC, and how MARC's fields and subfields data can be stored in Redis by using hashes, and how the storing of subfields changes as more requirements are met, moving from storing subfields first as Redis lists, followed by sets, and finally finishing by using the sorted set data type. This iterative experimentation hopefully illustrates an important reason for using Redis, namely the ability to quickly test out different methods of storing data and how the characteristics of different Redis data types such as hashes, lists, sets, and sorted sets can be used to represent both the data and some of the requirements for storing and accessing this data.

Popular usage patterns

A very popular use pattern for Redis is as an in-memory cache for web applications. Redis is available as a caching option for popular web frameworks such as Django, Ruby-on-Rails, Node.js, and Flask. As a popular caching technology Redis excels in web applications for storing new data while evicting stale data. For web applications, the cached data can range from single HTML character strings, widgets, and elements to entire web pages and websites.

By utilizing Redis's ability to set an expiration time on a key, together with one of Redis's popular cache eviction strategies called Least Recently Used (LRU), a Redis cache is robust enough to handle even the largest web properties, with the most popular content remaining in the cache while stale and little-used data is evicted from the data store. This caching use case doesn't assume that the original web element or page is generated from data in Redis; most likely, the web content was dynamically generated from other sources of data, with Redis operating as an excellent web caching layer in this use pattern.
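The idea behind least-recently-used eviction can be illustrated with a few lines of Python. This is a sketch of the general technique only, not of Redis's implementation (Redis approximates LRU by sampling keys rather than tracking an exact order); the class name LRUCache and the example keys are assumptions made for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU cache: the least recently used entry is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the oldest entry


cache = LRUCache(capacity=2)
cache.set("page:/home", "<html>home</html>")
cache.set("page:/about", "<html>about</html>")
cache.get("page:/home")                      # /home is now the freshest
cache.set("page:/books", "<html>books</html>")  # evicts /about
print(list(cache.items))
```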

The second popular use pattern for Redis is the storage of metrics, that is, quantitative data such as web page usage, user behavior, and gamer leaderboards. Using bit operations on strings, Redis very efficiently stores binary information on a particular characteristic. Usage for a website could be stored with a key constructed from a date, such as page-usage:2016-11-01, whose value is a string in which a bit is flipped to 1 the first time a web page is accessed by a user.

The daily usage for the website for November 1 can be obtained through a simple BITCOUNT Redis command on the page-usage:2016-11-01 key. In a 2011 blog post, individuals at a start-up named Spool explain in detail how they use bitmaps and Redis bit operations to store the user activity on their website with this design pattern.
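A rough model of this bitmap pattern fits in a short Python class. The sketch assumes, as one plausible reading of the pattern, that each user has a numeric ID used as the bit offset; the Bitmap class is a hypothetical in-memory stand-in that mirrors the behavior of SETBIT and BITCOUNT on a single string value.

```python
class Bitmap:
    """Tiny stand-in for a Redis string used with SETBIT and BITCOUNT."""

    def __init__(self):
        self.data = bytearray()

    def setbit(self, offset, value):
        """Set one bit and return its previous value, as SETBIT does."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.data):
            # Grow the string with zero bytes, as Redis does.
            self.data.extend(b"\x00" * (byte_index + 1 - len(self.data)))
        mask = 0x80 >> bit_index  # bit 0 is the most significant bit
        previous = 1 if self.data[byte_index] & mask else 0
        if value:
            self.data[byte_index] |= mask
        else:
            self.data[byte_index] &= ~mask & 0xFF
        return previous

    def bitcount(self):
        """Count the bits set to 1 across the whole string."""
        return sum(bin(byte).count("1") for byte in self.data)


usage = Bitmap()  # plays the role of the page-usage:2016-11-01 key
for user_id in (3, 17, 3, 42):  # user 3 visits twice
    usage.setbit(user_id, 1)

print(usage.bitcount())  # unique visitors for the day
```

Repeat visits by the same user leave the count unchanged, which is why a single BITCOUNT gives the number of distinct users for the day.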

The third popular Redis use pattern is as a communication layer between different systems through a publish/subscribe (pub/sub for short) model, where one can post messages to one or more channels that can be acted upon by other systems that have subscribed to, or are listening on, that channel for incoming messages.

Typically, publishers do not need to know the specific subscribers in order to send messages to them (as they would in, say, a point-to-point messaging model); only the message contents and the channel to send the message to need to be known. Similarly, a subscriber does not need to know individual publishers, only the channel on which to receive messages. The pub/sub pattern is nice because it scales easily, and the publishers and subscribers can be very different programs and systems.
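The decoupling between publishers and subscribers can be shown with a minimal in-process model. The Broker class and the new-books channel name below are hypothetical and exist only for illustration; a real deployment would rely on the Redis PUBLISH and SUBSCRIBE commands across separate processes.

```python
class Broker:
    """In-process stand-in for Redis channels."""

    def __init__(self):
        self.subscribers = {}  # channel name -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        callbacks = self.subscribers.get(channel, [])
        for callback in callbacks:
            callback(message)
        # Like PUBLISH, report how many subscribers received the message.
        return len(callbacks)


broker = Broker()
indexer_inbox, mailer_inbox = [], []

# Two unrelated systems listen on the same channel.
broker.subscribe("new-books", indexer_inbox.append)
broker.subscribe("new-books", mailer_inbox.append)

# The publisher knows only the channel name and the message.
delivered = broker.publish("new-books", "book:1")
print(delivered, indexer_inbox, mailer_inbox)
```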

Redis isn't right because …try again soon!

As an active open-source project, Redis adds new functionality and improvements that may solve a problem that you or someone in your organization decided it wasn't suited for in the past. Optimizing the use of such a valuable and functional tool as Redis means understanding its recent history and keeping current with new functionality being developed and tested for inclusion in the latest stable version of Redis. Redis follows a common semantic versioning pattern of major.minor.patchlevel with a minor even number denoting a stable version and an odd minor number an unstable branch.
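The even/odd convention for the minor number is simple enough to check in code. The function name is_stable below is a hypothetical helper written for this illustration, assuming version strings of the form major.minor.patchlevel.

```python
def is_stable(version):
    """Return True when the minor number of major.minor.patchlevel is even."""
    minor = int(version.split(".")[1])
    return minor % 2 == 0


for version in ("2.8.9", "2.9.11", "3.0.0"):
    label = "stable" if is_stable(version) else "unstable"
    print(version, label)
```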

For example, the Redis 2.8.9 release introduced two of the more significant improvements, namely the HyperLogLog, a highly efficient data structure for estimating the number of unique elements in a population, and the new ZRANGEBYLEX, ZLEXCOUNT, and ZREMRANGEBYLEX commands for sorted sets. Both of these improvements will be discussed at length in Chapter 2, Advanced Key Management and Data Structures. Redis Cluster, released for production use in early 2015 with Redis version 3.0, is one of the most important additions to the Redis ecosystem, which we will go over in much more detail in Chapter 6, Scaling with Redis Cluster and Sentinel.

For the next major release, Redis version 3.2, Redis added Geographic Information Systems (GIS) commands and modified sorted sets, along with new Lua scripting support for Redis Cluster and a new Lua debugger. The following graphic shows the rate of change in the Redis code base from the Redis 2.x series to Redis version 3.0.

Be aware of the dynamic nature of Redis development when asking yourself, why Redis? The limitations that you thought Redis had might no longer apply, and as you continue to grow your knowledge and improve your skills in mastering Redis, keeping up with Redis changes should be a critical priority as you improve your existing technology and build new and exciting opportunities for the future.

Summary

The decision as to whether Redis is the correct choice for a new project or to solve a data problem you might be experiencing really depends on the nature of your data and what you're trying to accomplish with your project. Redis, unlike relational databases or NoSQL document stores, does not require you to structure your data first before using it. Redis provides a direct, more algorithmic manipulation of your data through the use of a variety of data structures such as lists, hashes, sets, and sorted sets. Even if Redis is not your final choice, the exercise of breaking down your data into these data structures will help deepen the context and the analysis of the issue that you're trying to solve. A detailed example of such experimentation was given by representing a legacy library standard called MARC in basic Redis hashes, lists, sets, and sorted sets. We then briefly reviewed three popular design patterns: Redis as a web cache, Redis as the backend for a gamer leaderboard, and Redis as a publish/subscribe messaging system. We finished this chapter by illustrating some recent changes to Redis that expand the types of problems for which Redis can be the primary data solution, where in the past a traditional SQL database or another NoSQL technology might have been adopted instead.

In the next chapter, we are going to first examine Redis keys and the importance of organizing these keys with a Redis key schema generated either through a Redis object mapper or through manual documentation. Chapter 2 then introduces the Big O notation, followed by a systematic review of the basic Redis data structures and commands based on time complexity measures. Chapter 2 finishes with an introduction to some of the newer data structures and commands, including bitstrings and HyperLogLog.

Chapter 2. Advanced Key Management and Data Structures

Using Redis as data storage in your application starts by considering two sides of the solution: the keys and the data structures used as the key values in Redis. Coming up with a good Redis key schema, syntax, and naming convention can mean the difference between an effective and sustainable solution and a technological mess. Because of the flexibility that Redis gives you by allowing most string serialization as keys, much more intentional thought and design should be given to this important step in designing a Redis-based project. Likewise, using an appropriate data structure for any particular key also directly impacts the usability and functionality of any application built with Redis. This chapter covers the following:

  • Designing and managing a Redis key schema and the associated data structures
  • Using Redis client object mappers that use different strategies that hide the specific key schemas and data structures
  • Creating a simple application using a JavaScript Redis object mapper and analyzing how the object mapper uses Redis commands and data structures as an example of a Redis key schema
  • Introducing the Big O notation and how this measure of worst-case algorithmic effectiveness at scale is used in evaluating the performance of Redis's commands and how this performance directly relates to Redis's underlying data structures

This focus on the Big O notation in Redis's official documentation provides a method of estimating the time complexity of an application's use of Redis and helps in evaluating your Redis-based application's performance. Together, the Redis key and values should complement and reinforce the solution, while balancing the memory efficiencies of smaller-length keys with enough verbosity for explaining the purpose of the keys to the application designer, developer, or end user.

Redis keys

Using Redis effectively in your application involves understanding how Redis stores keys and the operations to manipulate the key space within a Redis instance. Running a 32-bit or 64-bit version of Redis dictates the practical limits to the size of your Redis keys. For the 32-bit Redis variant, any key name larger than 32 bits requires the key to span multiple bytes, thereby increasing the Redis memory usage. Using 64-bit Redis allows for larger key lengths but has the downside that keys with small lengths will be allocated the full 64 bits, wasting the extra bits that are not allocated to the key name.

The flexibility of Redis allows for a wide diversity in how keys are structured and stored. The performance and maintainability of Redis can be either positively or negatively impacted by the choices made in designing and constructing the Redis keys used in your database. A good general practice when designing your Redis keys is to construct at least a rough outline of what information you are trying to store in Redis and an initial idea of how the data will be stored in one of the many different Redis data structures. Finally, you'll want to diagram how your data structures relate to the other information stored in different keys in your Redis database. This process is generally lumped under the rubric of "Redis Key Schema" construction, but your Redis key schema doesn't need to be code-based. For small projects or use cases, a simple text file documenting your syntax, how your keys relate to each other, and what data structures are stored in your various keys should be sufficient.

Redis key schema

Although the official Redis tutorial on data types [1] recommends using a consistent schema when naming keys, Redis itself does not have any schema checking or validation functions. Some basic validation can be done through the EXISTS and TYPE Redis commands. If your application requires that a Redis key with a certain type exists in the instance, checking for the key's existence is easily accomplished with the EXISTS command, followed by the TYPE command to confirm that the expected Redis data structure is stored at that key. Beyond these two commands, validating the Redis key syntax and structure requires client-side code.
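The EXISTS-then-TYPE check can be wrapped in a small client-side helper. The sketch below assumes a client object that exposes exists and type methods in the style of the redis-py library; the key_has_type function and the FakeClient stand-in are hypothetical and are included only so that the example runs without a Redis server.

```python
def key_has_type(client, key, expected_type):
    """Return True when `key` exists and holds `expected_type`."""
    if not client.exists(key):
        return False
    actual = client.type(key)
    if isinstance(actual, bytes):  # some clients return bytes
        actual = actual.decode()
    return actual == expected_type


class FakeClient:
    """In-memory stand-in that answers EXISTS and TYPE from a dict."""

    def __init__(self, types):
        self.types = types

    def exists(self, key):
        return 1 if key in self.types else 0

    def type(self, key):
        # TYPE reports "none" for a missing key.
        return self.types.get(key, "none")


client = FakeClient({"marc:1:245:1": "hash", "marc:1:856:1:u": "zset"})
print(key_has_type(client, "marc:1:245:1", "hash"))
print(key_has_type(client, "marc:1:856:1:u", "list"))
print(key_has_type(client, "marc:2:245:1", "hash"))
```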

Adding this validation logic layer to your application may be useful if your Redis application is to be shared across different systems and organizations. An accurate and detailed Redis key schema can greatly assist you and the application developers and operators in troubleshooting or debugging problems. Another avenue for validating your Redis key schema would be to include specific unit tests in your Redis application that test for boundary conditions, schema key syntax, and structure, along with the expected data structures for each validated key. The third option for validating your Redis key schema is to use a DTD or another XML-based validation of your key structure or to use a newer schema validation technology such as JSON Schema, available at http://json-schema.org/.

Options for validating Redis keys

A good key schema should also provide guidance for adding new Redis keys to an existing Redis-based application. There should not be any mysteries about what the name of a new Redis key should be if the schema is descriptive and consistent. Use both singular and plural forms of nouns to identify what entity, and how many of it, is being saved to Redis. For example, book:1 could be a Redis hash storing fields related to a single book, while the Redis key books:sci-fiction could be a set of all books that are classified as part of the science fiction genre. A sorted set could be used for book sales ranking under the books:sales-rank key name, with the number of books sold as the weight, or sorted set score, and the book key as the value.
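A naming convention like this one can also be checked in client-side code. The following sketch is one possible approach, not a feature of Redis: the SCHEMA table and the expected_type function are hypothetical, and the regular expressions simply encode the three example key names from the paragraph above.

```python
import re

# Ordered (pattern, Redis type) pairs; the first full match wins, so the
# specific sales-rank key is listed before the general genre pattern.
SCHEMA = [
    (re.compile(r"book:\d+"), "hash"),
    (re.compile(r"books:sales-rank"), "zset"),
    (re.compile(r"books:[a-z]+(-[a-z]+)*"), "set"),
]


def expected_type(key):
    """Return the Redis type the schema expects for `key`, or None."""
    for pattern, redis_type in SCHEMA:
        if pattern.fullmatch(key):
            return redis_type
    return None


for key in ("book:1", "books:sci-fiction", "books:sales-rank", "Book-1"):
    print(key, expected_type(key))
```

A key that returns None does not follow the schema, which could be flagged in a unit test before the key is ever written to Redis.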

An example of a text-based Redis schema for a simple book application could look like the following:

Name