Dell ECS Version 3.5: A Focus on Performance
ECS is Dell EMC's current (Gen 3) object storage platform. In the past, ECS has been relegated to active-archive and low-performance use cases. In our lab environment, we had the opportunity to test the performance impact of moving from ECS v3.4 to v3.5, which supports adding an SSD to each node for metadata management. Dell's focus on improving ECS performance is particularly important as the industry moves toward high(er)-performance object solutions.
Why performance test?
It's clear that customers are looking to improve on the performance profile of their existing object storage solutions, whether simply to meet or improve backup target SLAs or to use object storage for new use cases such as streaming analytics and faster metadata searches. As object response times drop to single-digit milliseconds for locally accessible data, we see a growing appetite among our customers to displace traditional NAS workloads with performance-capable object storage solutions. This lets our customers move to an object solution that is significantly more cost-effective, easier to develop against and highly resilient.
Below we discuss what we saw during our performance testing of the code upgrade to v3.5. While we won't be sharing exact performance numbers, we're happy to go over our findings in more detail. Simply reach out to your local account team or to me for more information.
What we will show here are the overall improvements we measured when upgrading from version 3.4 to 3.5 and adding an SSD per node, which shows the impact of moving the metadata from spinning disk to SSD. We will also discuss where we were impressed and what we feel could still use improvement.
How and what we tested
We developed a basic methodology for testing object storage solutions using 12 bare metal servers in a test jig along with 40Gb-capable switches for the front-end network. To generate load we use the open source load generation program COSBench. Our goal is to test different OEM solutions in an apples-to-apples manner and to measure the performance gains from new product technologies and/or software releases.
In this case we tested ECS model EX500 hardware made up of 16 nodes (one full rack) in a single-site configuration. We used COSBench to run basic PUT and GET tests across three file sizes that we categorized as small (100KB), medium (10MB) and large (1GB). We then had COSBench increase the load every 15 minutes through a range of 12 to 960 workers (think of them as unique access threads) in order to find the knee of the performance curve.
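The test matrix described above can be sketched in a few lines of Python. This is a minimal illustration, not our actual COSBench configuration: the only values taken from the testing are the three file sizes, the two operations, the 15-minute stage length and the 12-to-960 worker range. The doubling of workers between stages and the names worker_ramp, build_plan and Stage are assumptions made for the example.

```python
from dataclasses import dataclass

# Values taken from the test description.
OBJECT_SIZES = {"small": "100KB", "medium": "10MB", "large": "1GB"}
OPERATIONS = ("PUT", "GET")
STAGE_SECONDS = 15 * 60  # load is increased every 15 minutes


@dataclass(frozen=True)
class Stage:
    """One load stage (hypothetical structure, for illustration only)."""
    operation: str
    size_label: str
    object_size: str
    workers: int
    runtime_seconds: int


def worker_ramp(start: int = 12, stop: int = 960, factor: int = 2) -> list[int]:
    """Return worker counts from start up to stop.

    ASSUMPTION: counts grow geometrically by `factor`; the article only
    states the endpoints (12 and 960), not the intermediate steps.
    """
    if start < 1 or stop < start or factor < 2:
        raise ValueError("need 1 <= start <= stop and factor >= 2")
    counts = []
    workers = start
    while workers < stop:
        counts.append(workers)
        workers *= factor
    counts.append(stop)  # always finish on the stated maximum
    return counts


def build_plan() -> list[Stage]:
    """Expand operations x file sizes x worker counts into a flat stage list."""
    ramp = worker_ramp()
    return [
        Stage(op, label, size, workers, STAGE_SECONDS)
        for op in OPERATIONS
        for label, size in OBJECT_SIZES.items()
        for workers in ramp
    ]


if __name__ == "__main__":
    plan = build_plan()
    hours = sum(stage.runtime_seconds for stage in plan) / 3600
    print(worker_ramp())
    print(f"{len(plan)} stages, {hours:.0f} hours of load")
```

Under these assumptions the ramp has eight stages per operation and file size, so the full matrix comes to 48 stages, or 12 hours of load at 15 minutes per stage. A different step pattern would change those totals but not the overall shape of the plan.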
Here's what we found
Going to v3.5 (with SSD for the metadata) showed significant improvements for small files on both GETs and PUTs. We even measured a larger improvement for PUTs than for GETs, which was unexpected. For larger file sizes, however, the PUT improvements fell off quickly. Our medium file size testing showed a small performance increase for PUTs (within the margin of error considering test deviation), and large file PUTs showed no benefit at all.
However, for GETs we saw good improvements for both medium and large files, with the medium file sizes benefiting the most. We saw throughput improvements for small and medium file GETs averaging in the mid-20 to mid-30 percent range, and GET throughput improvements for large files in the mid-20 percent range. This also equated to an overall improvement in response time measured at 15-20 percent (see Figure 1 below as an example of the graphed performance output of the 100KB GET testing).
Version 3.5 improvements also benefited server-side encryption performance, with an increase in the mid-30 to 40 percent range for GETs with encryption turned on vs. version 3.4 with encryption enabled.
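For readers who want to run the same kind of comparison on their own results, the percentages quoted above can be read as relative changes against the v3.4 baseline. A minimal sketch of that arithmetic follows. The function names are hypothetical, chosen for illustration, and the sample figures are made-up placeholders, not measurements from this testing.

```python
def throughput_gain_pct(baseline: float, upgraded: float) -> float:
    """Percent increase in throughput relative to the baseline run."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (upgraded - baseline) * 100.0 / baseline


def response_time_cut_pct(baseline_ms: float, upgraded_ms: float) -> float:
    """Percent reduction in response time relative to the baseline run."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return (baseline_ms - upgraded_ms) * 100.0 / baseline_ms


if __name__ == "__main__":
    # Placeholder values only (NOT results from the ECS testing).
    print(throughput_gain_pct(1000.0, 1300.0))   # 30.0
    print(response_time_cut_pct(10.0, 8.0))      # 20.0
```

Comparing runs at the same worker count and file size keeps the two versions on equal footing, which is the point of an apples-to-apples test.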
Overall we rate the performance improvements of the 3.5 upgrade as significant. While we were excited to see the substantial improvement in small file PUTs (~40 percent throughput increase), we were disappointed not to see the same gain at larger file sizes. We rank 3.5 GET performance as very good to excellent, and overall PUT performance as fair to good.
We are excited by the new product developments coming down the pike for ECS, which should bring additional (larger scale) performance improvements. It's a little too soon to talk about those just yet, but stand by for upcoming articles on ECS performance developments.
Get more details
We will be happy to go over additional details of the testing described here, including a review of the entire analysis as seen below.