VMware VCF Delivers More Value with Intel Optane Solutions


This post is part of a sponsored Intel blog post series on Intel Optane. To learn more about Intel, please visit intel.com.

Introduction

Intel Optane is the brand name for products built on a fascinating non-volatile memory technology called 3D XPoint. Its three-dimensional, cross-point structure makes every single byte on the media independently addressable, hence the name.

Figure 1 – A view of the 3D XPoint technology at the heart of Intel Optane products

Optane technology can be used in two different ways. One is Optane Persistent Memory (PMEM), which comes in a DIMM form factor. The other is Optane SSD, where 3D XPoint is the storage media inside an NVMe drive.

Optane PMEM opens the door to incredible possibilities because the media can be addressed much like DRAM while remaining persistent and non-volatile. This is already used by some in-memory databases, and in the long term it has the potential to disrupt how applications access and store data.

But Optane has broader applications, particularly in private and hybrid clouds. It can deliver tremendous value to the modern data center, either with hybrid cloud environments based on VMware Cloud Foundation (VCF) or more traditional on-premises SDDCs based on VMware vSphere and vSAN, both of which are at the heart of VCF.

Figure 2 – Intel Optane Persistent Memory is used to increase a system’s memory tier capacity, while Intel Optane SSD is used on the storage tier to improve performance

Extending Memory with Optane PMEM

The Memory to CPU Core Count Conundrum

Until recently, CPU core count was a bottleneck in virtualized architectures. With limited core counts, it was rare to architect virtualization hosts with more than 768 GB of RAM: there wouldn’t be enough cores to justify that much memory on a single node.

The only way to address the CPU core count challenge was to scale horizontally by adding more nodes, each with a limited memory footprint. Horizontal scaling has its own challenges: more hardware must be purchased, with the associated license and support costs, and of course environmental requirements (power consumption, physical space, cooling, etc.).

With the latest CPU architectures, we are now facing a different paradigm. CPU core counts are increasing rapidly, which allows for many more virtual machines per host. This increase in VM density is driving the need for more memory, yet the cost and capacity of DRAM are not scaling at the same rate, which puts pressure on the memory subsystem.

Enabling Reliable Vertical Scaling with Optane PMEM

What’s required is the ability to cost-effectively increase system memory as CPU core counts increase, enabling a scale-up architecture. This is exactly where Intel Optane PMEM comes into play.

In a modern vSphere host, the core count to memory ratio is ideally balanced based on the profile of the VMs. Intel Optane Persistent Memory allows for very cost-effective memory configurations of 1 TB or more, thus enabling the use of high core-count CPUs.

Large memory systems are no longer constrained by the physical limits of DRAM DIMM module capacity. The total addressable memory can be extended by using high-capacity Intel Optane PMEM DIMM modules.

The resulting memory pool consists of a combination of DRAM modules and Intel Optane modules which is controlled by the hypervisor.

The hypervisor uses DRAM modules as a fast memory cache, and stores and retrieves data on Optane PMEM modules.

This approach delivers significantly more memory capacity per node with comparable performance, and at a noticeably lower cost, because Optane PMEM is more affordable than DRAM.
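To see why a DRAM cache in front of a larger PMEM tier can keep performance comparable, a rough weighted-average model of access latency helps. This is a minimal sketch: the latency values and hit rates are illustrative assumptions, not measurements or vendor figures, and the function name is made up for this example.

```python
def effective_latency_ns(hit_rate: float, dram_ns: float, pmem_ns: float) -> float:
    """Average access latency when DRAM acts as a cache in front of PMEM.

    hit_rate is the fraction of accesses served from DRAM (0.0 to 1.0).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * dram_ns + (1.0 - hit_rate) * pmem_ns


# Illustrative values only (assumptions, not benchmark results).
DRAM_NS = 100.0
PMEM_NS = 300.0

for hit_rate in (0.80, 0.90, 0.95, 0.99):
    avg = effective_latency_ns(hit_rate, DRAM_NS, PMEM_NS)
    print(f"hit rate {hit_rate:.0%}: ~{avg:.0f} ns average access latency")
```

The higher the share of accesses served from DRAM, the closer the average gets to DRAM latency, which is why the working set of the VMs matters when sizing the DRAM portion of the pool.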

This memory and CPU combination truly realizes the benefits of vertical scaling: higher VM consolidation ratios, fewer nodes, a smaller environmental footprint, and, last but not least, reduced license costs.

vSAN Delivers More Value with Optane SSD

Enterprise-class Storage

The modern VMware vision of the hybrid cloud SDDC is built around VMware VCF. vSAN is its storage tier: proven, well-established, enterprise-class storage virtualization software that can deliver tremendous performance when well architected.

When looking at storage performance, two key metrics come to mind: throughput and latency. Latency is essential because it dictates how long an I/O operation takes to complete. Throughput tells us how much data can pass through the storage system in a given time.
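Latency and throughput are linked through the number of I/Os in flight. By Little's Law, sustained IOPS equals the number of outstanding I/Os divided by the average latency. The sketch below illustrates the relationship; the queue depth, latencies, and block size are illustrative assumptions, not figures from this article.

```python
def iops(outstanding_ios: int, latency_us: float) -> float:
    """Sustained I/O operations per second (Little's Law).

    outstanding_ios: average number of I/Os in flight.
    latency_us: average completion time per I/O, in microseconds.
    """
    if latency_us <= 0:
        raise ValueError("latency_us must be positive")
    return outstanding_ios * 1_000_000 / latency_us


def throughput_mb_s(iops_value: float, block_size_bytes: int) -> float:
    """Data throughput in MB/s for a given IOPS rate and block size."""
    return iops_value * block_size_bytes / 1_000_000


# Illustrative values only (assumptions).
QUEUE_DEPTH = 32
BLOCK_SIZE = 4096  # 4 KiB

for latency_us in (500.0, 100.0, 10.0):
    rate = iops(QUEUE_DEPTH, latency_us)
    mb_s = throughput_mb_s(rate, BLOCK_SIZE)
    print(f"{latency_us:>5.0f} us latency: {rate:>11,.0f} IOPS, {mb_s:>9,.1f} MB/s")
```

At a fixed queue depth, cutting latency raises achievable IOPS proportionally, which is why low-latency media matters even when raw bandwidth looks sufficient.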

All-NVMe vSAN deployments perform significantly better than their SATA/SAS counterparts. NVMe can process data in a highly parallel fashion (up to 64K queues), so these deployments are limited only by media speed and PCIe bus speed.

Steady Performance Under Adverse Conditions

Such systems usually deliver enough performance for most applications, but that may not be enough for I/O-intensive applications such as critical databases or real-time data processing systems. In the VMware VCF context, two use cases particularly come to mind: boot storms and HA events.

In a VDI environment, a boot storm (a peak period when hundreds of users access their virtual desktops, usually at the start of business hours) can completely bog down a storage system that would otherwise perform well, bringing large VDI environments to their knees. Similarly, HA events in regular VCF environments (when a node is lost and all of its VMs need to be restarted on other nodes) also put a heavy strain on vSAN, as all VMs are started at once and compete for resources.

You can either architect a vSAN environment utilizing all-NAND SSDs, or you can considerably enhance efficiency by utilizing Optane SSDs in the cache tier. Optane SSDs’ high throughput and very low latencies provide the necessary stability for vSAN to deliver optimal performance under heavy strain.

Figure 3 – Using Intel Optane SSD as a vSAN cache tier dramatically improves VM performance

Optane SSD also plays a significant role with SAS/SATA platforms. For example, HPE Synergy systems, which are very popular in EMEA, have two NVMe drive slots per system. There, Optane SSD can radically improve I/O performance and allow Synergy customers to get more value out of their investment.

Long-Lasting Endurance for Intensive I/O

Endurance is another key aspect of storage. The cache tier is the most exposed to intensive I/O operations. In heavy-duty environments, this tier is more vulnerable to media wear than the capacity tier.

In NAND-based SSD drives, flash memory is subject to wear: memory cells can only sustain a finite number of writes during the drive’s lifetime before failing. This endurance will vary based on the media type (TLC drives are better than QLC), but even TLC drives with high endurance can suffer from wear.

Optane SSDs rely on 3D XPoint technology, whose endurance rating is 10 to 20 times better than TLC: where the Intel P4610 SSD has an endurance of up to 3 DWPD (drive writes per day), the Intel P4800 Optane SSD is rated at 30 or 60 DWPD (depending on the model). Beyond better I/O performance, Optane SSDs greatly reduce the risk of data corruption or data loss due to media wear.
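A DWPD rating can be turned into a total write budget by multiplying it by the drive capacity and the number of days in the rating period. The sketch below compares the three DWPD ratings mentioned above at equal capacity. The 1.6 TB capacity and the 5-year rating period are assumptions chosen for illustration, not specifications of the drives named in this article.

```python
def total_writes_tb(dwpd: float, capacity_tb: float, years: float = 5.0) -> float:
    """Total data (in TB) a drive is rated to absorb over the rating period.

    dwpd: drive writes per day rating.
    capacity_tb: drive capacity in TB.
    years: rating period in years (assumed, defaults to 5).
    """
    return dwpd * capacity_tb * 365 * years


# Assumption: same capacity for every drive, to isolate the effect of DWPD.
CAPACITY_TB = 1.6

baseline = total_writes_tb(3, CAPACITY_TB)
for label, dwpd in (("3 DWPD", 3), ("30 DWPD", 30), ("60 DWPD", 60)):
    budget = total_writes_tb(dwpd, CAPACITY_TB)
    print(f"{label:>8}: ~{budget:,.0f} TB written over 5 years "
          f"({budget / baseline:.0f}x the 3 DWPD baseline)")
```

At equal capacity the write budget scales linearly with DWPD, which is where the 10 to 20 times endurance advantage comes from. Real drives differ in capacity, so the absolute budgets should be checked against each product's published rating.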

Doing More with Intel Optane SSD & PMEM

In the example below, we can see the requirements for a VCF environment meant to support 280+ VMs with a 4:1 vCPU to core ratio and an average of 3 vCPUs per VM.

On the left side (without Intel Optane), 6 nodes would be required to support such an environment, with a total of 4608 GB of RAM and 216 CPU cores. This environment would also require larger cache SSDs (2x 1.6 TB) and would support only a smaller usable storage capacity pool (8x 4 TB drives).

With Intel Optane, optimizations at the CPU, memory and storage layers make it possible to meet (and slightly exceed) the same requirements with only 4 nodes, delivering similar performance with a much more efficient footprint. The storage cache tier is smaller and more efficient (2x 375 GB), and also supports a larger capacity pool (12x 4 TB drives).

Figure 4 – A typical VMware VCF consolidation use case with Intel Optane PMEM / SSD
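The compute side of this sizing can be checked with simple arithmetic. The sketch below uses the figures quoted above (280 VMs, 3 vCPUs per VM, a 4:1 vCPU to core ratio, 216 cores and 4608 GB across 6 nodes). The function names are hypothetical, and the per-node figures for the 4-node design are derived here under the assumption that it must provide the same cores and memory; the article does not state them.

```python
def required_cores(vm_count: int, vcpu_per_vm: int, vcpu_per_core: int) -> int:
    """Physical cores needed for the VM population, rounded up."""
    total_vcpus = vm_count * vcpu_per_vm
    return -(-total_vcpus // vcpu_per_core)  # integer ceiling division


def max_vms(total_cores: int, vcpu_per_vm: int, vcpu_per_core: int) -> int:
    """Number of VMs a given core count can host at the target ratio."""
    return (total_cores * vcpu_per_core) // vcpu_per_vm


def per_node(total: int, nodes: int) -> int:
    """Even split of a cluster-wide resource across nodes, rounded up."""
    return -(-total // nodes)


# Figures from the example above.
VMS, VCPU_PER_VM, VCPU_PER_CORE = 280, 3, 4
TOTAL_CORES, TOTAL_RAM_GB = 216, 4608

needed = required_cores(VMS, VCPU_PER_VM, VCPU_PER_CORE)
print(f"cores required for {VMS} VMs: {needed}")
print(f"VMs supported by {TOTAL_CORES} cores: "
      f"{max_vms(TOTAL_CORES, VCPU_PER_VM, VCPU_PER_CORE)}")

# Per-node targets; the 4-node row is an assumption (same totals, fewer nodes).
for nodes in (6, 4):
    print(f"{nodes} nodes: {per_node(needed, nodes)}+ cores and "
          f"{per_node(TOTAL_RAM_GB, nodes)} GB of memory per node")
```

With these inputs, 280 VMs need 210 cores, and 216 cores leave room for 288 VMs, which matches the "280+" target. Spreading 4608 GB over 6 nodes gives the 768 GB per node mentioned earlier, while 4 nodes would each need 1152 GB, a capacity range where combining DRAM with Optane PMEM becomes attractive.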

Conclusion

Optane PMEM and Optane SSD work together to enable cost-effective VCF deployments without sacrificing performance.

At the hypervisor layer, Optane PMEM significantly increases the memory capacity of virtualization nodes by complementing DRAM with durable, ultra-low-latency Optane Persistent Memory, achieving performance similar to DRAM-only deployments at a significantly better price point. This enables higher VM densities, a smaller infrastructure footprint and reduced CAPEX and OPEX.

At the storage layer, Optane SSD dramatically boosts the efficiency of vSAN deployments. It enables organizations to run I/O intensive workloads, sustain I/O surges and build vSAN layers that do not have to compromise between performance and capacity.

With Optane, Intel has taken a solution-based approach that delivers value across the entire data center stack. This is particularly true of VMware VCF, where Optane-enabled systems allow organizations to get much more value out of their investments.