Cloud Block Store: How Pure Storage Built a Cloud-Native Block Storage Solution

One of the major announcements of Pure Storage Accelerate 2019 was Cloud Block Store. TECHunplugged partially covered Cloud Block Store here, but more complete coverage of this solution has been in our backlog since September 2019.

What is Cloud Block Store

Cloud Block Store (CBS) is Pure Storage’s implementation of a block storage system that can run natively in the cloud. CBS delivers block storage capabilities to cloud workloads while maintaining the same operational consistency in terms of data management and administration capabilities.

One of the goals of Pure Storage is to enable true bi-directional hybrid cloud mobility, allowing block-based workloads to run either on private or public clouds.

“So what?” many will say. “Yet another storage vendor developing a cloud-based appliance of their on-premises storage platform, right?”

Well, that’s the interesting and exciting point.

Building a cloud-native block storage architecture

What we expected was yet another virtual appliance, and we had it all wrong. Pure Storage went as far as developing a true cloud-native block storage architecture that doesn’t run as a virtual appliance, but instead uses cloud-native constructs as its building blocks.

Furthermore, the solution also uses cloud-native data availability mechanisms to implement resiliency. It should be noted that currently CBS is available only on AWS.

CBS Architecture

Seen from the outside, CBS operates similarly to a dual-controller storage array. Each controller (CBS Controller) is implemented as an EC2 instance, for a total of 2 CBS Controllers.

These controllers have access to a virtual drive shelf comprising a total of 7 virtual disk drives. Each virtual disk drive is in fact an independent AWS EC2 instance, which has one EBS IO1 volume attached, and one EC2 Instance Store.

Cloud Block Store architecture (not showing the S3 persistent storage layer) – Source: Pure Storage CBS Configuration Guide

Diving deeper into each virtual disk, the EBS IO1 volume is used as NVRAM and as a write buffer (see here for a detailed description of IO1 volumes). The Instance Store is used as a non-persistent read mirror, which means it has to sit on fast local storage.

For each of the virtual disks, CBS requires either i3.2xlarge EC2 instances (a total of 7 for its //VA10-R1 implementation), or i3.4xlarge EC2 instances (a total of 7 for the //VA20-R1 implementation). Both EC2 instance types use NVMe SSDs, delivering adequate bandwidth and latency for read operations.

Each virtual drive acts as a read cache & write buffer, and is also responsible for writing data to the persistent storage layer, which uses AWS S3 object storage with high durability.
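To make the layering described above easier to picture, here is a minimal toy model in Python. It is only a sketch of our reading of the architecture, not Pure Storage's implementation: the class names, the block placement rule, the choice to populate the read mirror on write, and the explicit `flush()` step are all assumptions made for illustration. Plain dictionaries stand in for the EBS IO1 volume, the Instance Store and S3.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Instance type per model and component counts, as listed in the text above.
INSTANCE_TYPES = {"//VA10-R1": "i3.2xlarge", "//VA20-R1": "i3.4xlarge"}
NUM_CONTROLLERS = 2
NUM_VIRTUAL_DRIVES = 7


@dataclass
class VirtualDrive:
    """One virtual drive: an EC2 instance with an io1 volume and an Instance Store."""
    name: str
    instance_type: str
    # Stand-in for the EBS IO1 volume (NVRAM / write buffer).
    write_buffer: Dict[str, bytes] = field(default_factory=dict)
    # Stand-in for the EC2 Instance Store (non-persistent read mirror).
    read_mirror: Dict[str, bytes] = field(default_factory=dict)


@dataclass
class ToyCloudBlockStore:
    drives: List[VirtualDrive]
    # Stand-in for the S3 persistent storage layer.
    object_store: Dict[str, bytes] = field(default_factory=dict)

    def _drive_for(self, block_id: str) -> VirtualDrive:
        # Hypothetical placement rule: a deterministic byte-sum modulo drive count.
        return self.drives[sum(block_id.encode()) % len(self.drives)]

    def write(self, block_id: str, data: bytes) -> None:
        drive = self._drive_for(block_id)
        drive.write_buffer[block_id] = data  # buffered on the io1 stand-in
        drive.read_mirror[block_id] = data   # assumption: mirror is filled on write

    def flush(self) -> None:
        # Each drive writes its buffered data to the persistent layer.
        for drive in self.drives:
            self.object_store.update(drive.write_buffer)
            drive.write_buffer.clear()

    def read(self, block_id: str) -> Optional[bytes]:
        drive = self._drive_for(block_id)
        if block_id in drive.write_buffer:
            return drive.write_buffer[block_id]
        if block_id in drive.read_mirror:
            return drive.read_mirror[block_id]
        # Cache miss: fall back to the persistent layer and repopulate the mirror.
        data = self.object_store.get(block_id)
        if data is not None:
            drive.read_mirror[block_id] = data
        return data

    def lose_read_mirrors(self) -> None:
        # Instance Store contents are non-persistent; simulate losing them.
        for drive in self.drives:
            drive.read_mirror.clear()


def build_toy_cbs(model: str = "//VA10-R1") -> ToyCloudBlockStore:
    instance_type = INSTANCE_TYPES[model]
    drives = [VirtualDrive(f"vdrive-{i}", instance_type)
              for i in range(NUM_VIRTUAL_DRIVES)]
    return ToyCloudBlockStore(drives)


if __name__ == "__main__":
    cbs = build_toy_cbs("//VA10-R1")
    cbs.write("block-1", b"hello")
    cbs.flush()
    cbs.lose_read_mirrors()
    print(cbs.read("block-1"))                # served from the S3 stand-in
    print(NUM_CONTROLLERS + len(cbs.drives))  # EC2 instances in this toy model
```

The point of the sketch is the separation of roles: losing the read mirrors costs nothing but a cache warm-up, because anything that has been flushed can still be read back from the persistent layer.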

Needless to say, Pure Storage has extensive documentation on how to set up CBS.

High Availability & Data Services

High Availability is implemented by using AWS Availability Zones (AZs). It’s possible to use Pure Storage’s ActiveCluster feature to create an active cluster where one CBS instance resides in Availability Zone 1, and the other CBS instance in Availability Zone 2.

ActiveCluster performs synchronous data replication between both CBS instances, and performs a full failover if there is an outage in one of the AZs.
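The behaviour described above, synchronous replication across two AZs with failover to the surviving side, can be illustrated with a small Python sketch. This is a generic toy model of synchronous replication and says nothing about how ActiveCluster works internally: `ToyArray`, `ToySyncCluster` and their methods are hypothetical names, and resynchronisation after an outage is deliberately left out.

```python
from typing import Dict, List


class ToyArray:
    """Stand-in for one CBS instance living in a given Availability Zone."""

    def __init__(self, name: str, az: str) -> None:
        self.name = name
        self.az = az
        self.online = True
        self.volumes: Dict[str, bytes] = {}


class ToySyncCluster:
    """Two arrays kept in lockstep: a write returns only after every
    online member has stored the data."""

    def __init__(self, first: ToyArray, second: ToyArray) -> None:
        self.members: List[ToyArray] = [first, second]

    def _online(self) -> List[ToyArray]:
        return [m for m in self.members if m.online]

    def write(self, key: str, data: bytes) -> List[str]:
        online = self._online()
        if not online:
            raise RuntimeError("no array available")
        for member in online:
            member.volumes[key] = data
        # Names of the arrays that acknowledged the write.
        return [m.name for m in online]

    def read(self, key: str) -> bytes:
        online = self._online()
        if not online:
            raise RuntimeError("no array available")
        # Any online member can serve the read; take the first one.
        return online[0].volumes[key]


if __name__ == "__main__":
    cluster = ToySyncCluster(ToyArray("cbs-a", "az-1"), ToyArray("cbs-b", "az-2"))
    cluster.write("vol1/block-0", b"payload")
    cluster.members[0].online = False          # simulate an outage in az-1
    print(cluster.read("vol1/block-0"))        # still served from az-2
```

Because every acknowledged write already exists on both sides, the surviving array can take over without data loss, which is what makes a full failover possible in the first place.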

Data services include thin provisioning, deduplication and compression of the data. Snapshots are also supported, and are interoperable with on-premises physical Pure Storage infrastructure. Finally, the entire cloud-based CBS environment is encrypted at rest.

Use cases of Cloud Block Store

Clearly, the goal is to deliver the same experience as on-premises Pure Storage infrastructure, but to use the elasticity of public cloud services to provision capacity on demand.

Perhaps the primary use case is Disaster Recovery, allowing organizations to replicate data between private and public clouds without having to worry about co-locating physical hardware at a remote data center. Replication is bidirectional and works with other Pure Storage products (FlashArray, FlashBlade) as well as NFS targets or cloud-based storage.

The other big use case for this feature is related to data protection and backups. Snapshot-based backups can be made from on-premises environments directly to the cloud. And those snapshots can be restored back to on-premises physical Pure Storage hardware, or to another cloud-based CBS instance. These backups can be done at the array level, volume level or even VM level. It is also possible to free up space from existing on-premises arrays and permanently move that data to S3, and restore it when needed.

The last use case is cloud-based high availability for cloud-native applications, although in our view this use case is a bit of a stretch, since cloud-native apps may not necessarily need block storage. Cloud-to-cloud HA however makes sense for enterprise applications that are lifted & shifted to the cloud.

Consuming Cloud Block Store

CBS is a 100% software implementation delivered as Software-as-a-Service. The licensing model is a capacity-based subscription, billed either monthly or yearly. Customers can either set up CBS themselves or pay a fee for basic setup.

Some CBS pricing information is available on the AWS Marketplace. Here’s a screenshot as of 2-Nov-19, bearing in mind that this is at best indicative, and that Pure Storage sales folks would probably know better.

Cloud Block Store pricing – Source: AWS Marketplace, retrieved 2-Nov-19

Conclusion

Once again Pure Storage has come out with a remarkable implementation of block storage, running on AWS, using cloud-native building blocks, and delivering a consistent experience regardless of the platform used.

I am delighted that Pure Storage didn’t take the easy and convenient route of just packaging their solution into a virtual appliance, throwing a big EC2 instance at it, and calling it done.

Effort was put into delivering a block storage platform that is performance-optimized by using NVMe SSDs and high-IOPS instances for its cloud storage components (at least those used for read cache and write buffer). But being down-to-earth, we need to consider that the solution is cloud-based; this is not a physical FlashArray with DirectMemory Cache.

As a closing remark, what is truly laudable is the end-to-end consistency: a single management plane and a set of data services that work across storage platforms. Let’s also give a thumbs up for the high availability features.

Overall, we were surprised and positively impressed by Cloud Block Store, and are happy to finally report on it. Way to go, Pure Storage.

Disclosure

TECHunplugged analyst Max Mortillaro was invited by Pure Storage to attend Accelerate 2019. Pure Storage covered travel & accommodation expenses, and provided an analyst/media conference pass. No financial compensation was received for participation.

Max also joined the Tech Field Day Exclusive event at Pure Storage Accelerate 2019 – again, no compensation was received for participation.

Any tweets, blog articles or any other form of content produced by TECHunplugged around this event are neither commissioned nor sponsored by Pure Storage. TECHunplugged analysts have no obligation to create content about the events they are invited to, and will only create content if the information provided at an event has value for their audience.