A frequently asked question in customer sessions is, what's coming next?  It's a fair question, and one we answer based on general technology trends and OEM specifics.  We are briefed on longer-term roadmaps as our OEM partners seek guidance in prioritizing items in their development backlogs.  What those roadmaps don't necessarily show, however, are the things further out in the general technology pipeline that haven't yet made it down the funnel to production-ready.  There are groups within the technology industry working well beyond an eighteen-month horizon on a wide range of problems.  These groups tend to be part of industry organizations that create standards for future products to adhere to, with the goal of maximum interoperability.  Some of these standards bodies are familiar names like IEEE, PCI-SIG, and JEDEC.  In the storage space, we have the Storage Networking Industry Association, or SNIA for short.  SNIA hosts many specialized working groups dealing with topics like SSDs, security, cloud, and green storage, among others, and is open to anyone with an interest in moving data technologies forward.

A couple of us from WWT recently attended the SNIA Developer Conference (SDC) in Santa Clara to learn more about what's happening in the broader storage industry.  It was three days of sessions delivered by people deep in the storage world: folks creating the next big thing, writing the standards for it, improving what exists today, or using existing tools in interesting ways.  While the sessions covered a wide range of subjects, I want to share a few specific topics and themes I found particularly interesting.

Archival Storage

Tape has been around for a long time and is a very stable medium in a durable case, just don't lose the leader pin.  However, like those Guns N' Roses tapes in your high school car, you have to spool to the right spot before you can listen to November Rain, or restore somebody's deleted MP3 of it.  Additionally, tape is considered unsuitable for data that needs to be kept for hundreds of years; LTO claims 30 years of archival life, which is insufficient for things like medical records that must be retained for the life of the patient, or schematics for products with long service lives like aircraft engines.  Today's offsite tape copies also involve a lot of movement and periodic rewriting to keep those copies fresh.  To create more permanent copies, a research project at Microsoft and a similar effort at Cerabyte replace tapes with glass platters.  Both use femtosecond lasers to etch bits into the glass, creating a permanent record.  Cerabyte writes something akin to QR codes onto a thin coating, all in a single layer.  Microsoft's Project Silica etches voxels in multiple layers inside the glass by polarizing those laser pulses.  Aside from permanence, a key principle of both projects is to improve time-to-first-byte over tape cartridges.  Once the glass is loaded into the reader, lasers quickly access the exact data needed, like the optical discs we use today.  Both products are aimed at the fifteen-hour-SLO storage tier, which is adequate for their intended market.

For these extremely long-term storage technologies, a major question lurks in the minds of these folks.  It's one thing to have and keep data for extended periods, but what about the applications from whence it came?  Can your 2345 version of SAP read data from SAP 2015?  And as we continue to advance our storage media, will we even be able to read the glass cartridges down the road?  LTO drives offer some backwards compatibility, but not necessarily 200 years' worth.  Does the glass platter need some sort of Rosetta Stone on it to make it readable?  Data is useless without knowing what it is or how to read it, like handing me a book printed in Braille.  If you know anything about the golden records launched with the two Voyager probes, they were etched with pictorial instructions showing where Earth is, as well as how to get the audio and photo information off the record.

Efficiency and Circularity

In the storage world, humanity builds and ships a lot of drives each year.  This article points out that 1,320 exabytes of spinning-disk capacity and 277 exabytes of SSD storage shipped in 2022.  On the other side of the coin, we also scrap a lot of retired drives each year, and those drives, particularly flash drives, contain rare and finite resources.  This becomes relevant when considering Scope 3 emissions reporting requirements.  Though drives have calculated failure times, those are statistical averages, meaning many drives can safely run well past them; if we extend the in-service time of the drives, the overall footprint of that media drops significantly.  Unfortunately for Scope 3 emissions reporting right now, the initial purchaser of the drive bears the entire lifetime cost of the embodied emissions; work is underway to allow amortization of those emissions across a drive's full life.  The net of it is, they estimate that getting 70% more life out of a drive reduces its total emissions by 40%, which only works out if the bulk of a drive's footprint is embodied in its manufacture rather than in the energy to run it.
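To make that arithmetic concrete, here is a minimal Python sketch.  The embodied and operating numbers below are illustrative assumptions, not figures from the talk; the point is how amortizing a large embodied cost over a 70% longer service life approaches that 40% reduction.

```python
# Toy model of amortized drive emissions (illustrative numbers only).
# Assumption: embodied (manufacturing) emissions dominate a drive's
# lifetime footprint.

EMBODIED_KG = 300.0        # hypothetical embodied CO2e for one drive, kg
OPERATING_KG_PER_YR = 4.0  # hypothetical operational CO2e per year, kg

def annual_footprint(service_years: float) -> float:
    """Embodied emissions amortized over service life, plus yearly energy."""
    return EMBODIED_KG / service_years + OPERATING_KG_PER_YR

baseline = annual_footprint(5.0)        # retire the drive at 5 years
extended = annual_footprint(5.0 * 1.7)  # keep it in service 70% longer

print(f"baseline:  {baseline:.1f} kg CO2e/yr")      # 64.0
print(f"extended:  {extended:.1f} kg CO2e/yr")      # 39.3
print(f"reduction: {1 - extended / baseline:.0%}")  # ~39%
```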

Reducing Data Movement

Throughout the show, speakers discussed efforts to minimize overall data movement, which reduces cost in time, in transport bandwidth, and in the energy needed to accomplish the movement.  Some memory manufacturers are investigating placing compute resources on memory chips or on the DIMMs, allowing compute to get extremely close to the low-latency, high-speed storage in the host itself.  On a different front, there is work being done on what's called computational storage.  This would be an extensible way to have your persistent storage devices do some processing and feed the results back to the host application.  That frees up the data pipe and the host processor, lets the small ASICs on the drives do the work at a lower power cost, and incurs the transit cost only for data that truly needs to move to core computing resources.
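As a rough illustration, here's a toy Python sketch of the computational storage idea.  The Drive class and its query() method are hypothetical stand-ins, not SNIA's actual computational storage API; the point is how much less data crosses the bus when the filter runs on the device.

```python
# Toy sketch of computational storage: push a filter down to the
# device so only matching records cross the data pipe.  The Drive
# class below is a hypothetical stand-in, not a real device API.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Record:
    key: str
    temperature_c: float

class Drive:
    """Stands in for a storage device with a small on-board ASIC."""

    def __init__(self, records: list[Record]):
        self._records = records

    def read_all(self) -> list[Record]:
        # Traditional path: every record moves to the host for filtering.
        return list(self._records)

    def query(self, predicate: Callable[[Record], bool]) -> list[Record]:
        # Computational-storage path: the device evaluates the predicate
        # locally and ships back only the matches.
        return [r for r in self._records if predicate(r)]

drive = Drive([Record(f"sensor-{i}", 20.0 + i % 90) for i in range(100_000)])

moved_traditional = len(drive.read_all())                            # 100,000 records
moved_offloaded = len(drive.query(lambda r: r.temperature_c > 100))  # 10,000 records
print(moved_traditional, "vs", moved_offloaded, "records over the bus")
```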

Parallelism

After minimizing data movement, for the bits that do need to be moved, parallelism is a key component of keeping ever-faster processors fed.  Certainly, things like SMB multichannel and parallel NFS (pNFS) are not new.  The difference now is that we have applications that can take advantage of those wide pipes, hence the renewed interest.  Companies present at the conference, like SerNet (Samba), Microsoft, and others, commented on updates for SMB, including the rarely discussed multichannel feature.  SMB multichannel, in many cases, happens by default when multiple interfaces are in use or when receive-side scaling is enabled on the NIC.  Similarly, attention was paid to the different ways of achieving the same parallelism with NFS, via techniques like:

  • pNFS, which separates data and metadata
  • nconnect, which opens additional TCP streams between the client and server
  • Client-side drivers for scale-out systems like the VAST Data Platform and Dell's PowerScale that allow the client to connect to multiple (or all) nodes in the storage system

Whether NFS or SMB, the net goal is to utilize more of the available bandwidth by opening additional concurrent connections.
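Here's a toy Python sketch of that shared idea: splitting one large transfer across several concurrent connections.  The URL is a placeholder, and real SMB multichannel and nconnect operate at the protocol layer rather than via HTTP range requests, but the bandwidth principle is the same.

```python
# Toy illustration of the idea behind SMB multichannel, pNFS, and
# nconnect: split one large transfer across concurrent connections.
# The URL below is a placeholder for any server honoring Range requests.

import concurrent.futures
import urllib.request

URL = "https://example.com/big-file.bin"  # hypothetical file server
CHUNK = 8 * 1024 * 1024                   # 8 MiB per request

def fetch_range(start: int, end: int) -> bytes:
    """Fetch one byte range on its own connection."""
    req = urllib.request.Request(URL, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def parallel_fetch(total_size: int, streams: int = 4) -> bytes:
    """Pull the file down over several connections at once."""
    ranges = [(offset, min(offset + CHUNK, total_size) - 1)
              for offset in range(0, total_size, CHUNK)]
    with concurrent.futures.ThreadPoolExecutor(max_workers=streams) as pool:
        parts = pool.map(lambda r: fetch_range(*r), ranges)
    return b"".join(parts)
```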

Security

As the thing that keeps people up and down customer IT orgs awake at night, data security featured strongly at SDC.  In particular, the US Commercial National Security Algorithm Suite (CNSA) 2.0 update triggers numerous changes in devices across the spectrum.  The US government has selected several cryptographic algorithms for the post-quantum computing world.  Cryptographically relevant quantum computers, once they arrive, are predicted to break current encryption algorithms far faster than traditional computing techniques can, so these new algorithms were selected because they should be more resistant to quantum attacks.  That's just quantum; there is also a need to secure the systems we have today with things like OS and firmware signing and device attestation.  If we assume a seven-year lifespan for certain devices in the data center, anything purchased today will have to support at least the updated firmware signing algorithms by 2030.

[Figure: CNSA 2.0 transition timeline, 2022 through 2029, showing phase-in periods for software/firmware signing; web browsers, servers, and cloud services; traditional networking equipment; operating systems; niche equipment; and custom applications and legacy equipment.]

Source: https://media.defense.gov/2022/Sep/07/2003071834/-1/-1/0/CSA_CNSA_2.0_ALGORITHMS_.PDF
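For firmware signing specifically, CNSA 2.0 points to stateful hash-based signature schemes (LMS and XMSS), whose security rests on hash functions rather than the math quantum computers threaten.  The toy Lamport one-time signature below, a pedagogical ancestor of those schemes, shows the core idea in Python; it is strictly an illustration, not the real LMS or XMSS.

```python
# Toy Lamport one-time signature: the conceptual ancestor of the
# stateful hash-based schemes (LMS/XMSS) CNSA 2.0 names for firmware
# signing.  Illustration only; never use this for real signing.

import hashlib
import secrets

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def keygen():
    # 256 pairs of random secrets; the public key is their hashes.
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(H(a), H(b)) for a, b in sk]
    return sk, pk

def bits_of(message: bytes):
    digest = H(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]

def sign(sk, message: bytes):
    # Reveal one secret per digest bit; a key pair must never be reused.
    return [sk[i][bit] for i, bit in enumerate(bits_of(message))]

def verify(pk, message: bytes, sig) -> bool:
    return all(H(sig[i]) == pk[i][bit] for i, bit in enumerate(bits_of(message)))

sk, pk = keygen()
firmware = b"firmware image v1.2.3"
sig = sign(sk, firmware)
assert verify(pk, firmware, sig)
assert not verify(pk, b"tampered image", sig)
```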

Conclusion

The 'developer' in the name SNIA Developer Conference is less about developing software and more about developing the future of data technologies, whether software or hardware.  Coming out of SDC '24, we were fascinated by the things afoot in the storage world.  Even more than that, it was wonderful to see representatives from competing companies, like the various drive manufacturers, come together to discuss a topic, and to witness the camaraderie they share.  These were not competitors but peers with common interests working to advance data technologies together.  They were also all really smart people with deep focus on, and great passion for, their corners of the storage world.

We heard about the experimentation happening at places like HPE, Amazon, and Microsoft and the things they're feeding back into the community. It's hard to say when or how many of these efforts will make it into shipping products, but the general data market is multifaceted with a lot of different needs, particularly as you get into embedded, trusted, cannot-be-wrong systems like automobiles that carry more compute power than my 286 PC clone.  Most importantly, even if it doesn't become a shipping product itself, the work is not throwaway; it becomes intellectual capital for something else down the road. 

My peer Bryan and I agreed that while the OEM conferences are good for hearing about released or soon-to-be-released products, SDC '24 sparked genuine intrigue for both of us.  A huge thank you to SNIA for putting it on and for the continued work they do for the world of data.