Storage – Who knew it could be so cool?

I’ll be the first to admit that when I began investigating the possibility of deploying a large Citrix XenDesktop / virtual desktop environment, I knew very little about storage, and especially SAN storage.  My experience was limited to configuring local disks in various RAID levels and leveraging DFS on Windows to keep about a terabyte of applications and data available across campus even if one of our data centers went dark.

So when my virtual desktop project started dovetailing with the “integrated compute/storage/network stack” project, I spent most of my time drooling over Cisco UCS and very little time thinking about storage.  That made sense to me because we’ve always had a dedicated team to handle storage, and aside from knowing I’d have to provide them with IOPS and spindle counts for my project, I figured they’d just carve out some volumes for me and that would be that.

Turns out I got much more involved in the conversation about storage over time, especially once we decided to go with the FlexPod, and I’m glad I did.  I can say now that, as cool as the Cisco UCS stuff is, and it is the most geek-tastic gear I’ve ever played with, the NetApp storage in our FlexPod comes in a very close second.  And one of the coolest features of our NetApp storage is de-duplication.

I’ve tweeted a couple of screenshots showing the outrageous deduplication rates I’ve been getting on some of my volumes, but I figured I’d go into more detail in this post.  Maybe folks who have worked with NetApp storage or other dedupe mechanisms won’t be as surprised by this as I’ve been, but it sure has impressed me.

My virtual desktop environment is laid out across a handful of storage volumes.  They are:

nfsxen_os01       560 GB
nfsxen_data01    2048 GB
nfsxen_data02    2048 GB
nfsxen_pfile01   1350 GB
nfsxen_vswap01   1350 GB

I’ll be focusing on the first three volumes.

20 Persistent Windows 2008R2 Servers

Volume os01 contains the virtual machine config/log files as well as the operating system vmdk for each of the 20 infrastructure servers in my XenDesktop/XenApp environment – web servers, database servers, zone data collectors, desktop delivery controllers, license servers, and a server for EdgeSight.  All of these servers are running 2008R2 and were deployed from the same template.  Even so, each one serves one of several completely different roles, some with just the OS and a bit of Citrix code dropped in, others running the OS + SQL Server, etc.  I’m seeing a 79% dedupe rate on that volume, and I’m quite happy with it.

[Screenshot: volume os01 at a 79% dedupe rate]

XenApp PVS Images

Volume data01 contains the images we’re using with Citrix Provisioning Services for the XenApp component of my project.  I’d say this volume began its life in an almost ideal configuration for de-duplication.  Our base image contained only 2008R2, a XenApp install, Office, and a handful of plugins.  Since we maintain a number of really large applications (tens of gigabytes each), we decided to fork our base image three ways for now – a “core image” that will run on about 70% of our servers, a “stats/math image” that will run on about 20% of our servers, and an “intense image” that will run on the remaining 10% of our servers.  We’re also storing multiple copies of each image on that volume, one for each of our four PVS servers, and we’re maintaining older versions of each image as well.

88% dedupe rate is pretty awesome.

[Screenshot: volume data01 at an 88% dedupe rate]
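For anyone wondering why four copies of the same image dedupe so well, here is a toy sketch of block-level deduplication in Python.  It is not NetApp’s implementation, just an illustration of the general idea: split the data into fixed-size blocks, fingerprint each block, and store each distinct block only once.  The 4 KB block size, the SHA-256 fingerprint, and the names `dedupe_savings` and `make_image` are assumptions made for this example, and the “image” is synthetic data rather than a real vDisk.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this illustration


def dedupe_savings(files):
    """Return the percentage of blocks that are duplicates across all files."""
    total_blocks = 0
    unique_fingerprints = set()
    for data in files:
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            total_blocks += 1
            unique_fingerprints.add(hashlib.sha256(block).digest())
    if total_blocks == 0:
        return 0.0
    duplicates = total_blocks - len(unique_fingerprints)
    return 100.0 * duplicates / total_blocks


def make_image(num_blocks):
    """Build a synthetic 'image' in which every block is different."""
    return b"".join(
        i.to_bytes(4, "big") * (BLOCK_SIZE // 4) for i in range(num_blocks)
    )


image = make_image(256)  # stand-in for one PVS image (1 MB here)
four_copies = [image] * 4  # one copy per PVS server

print(f"One copy:    {dedupe_savings([image]):.1f}% saved")
print(f"Four copies: {dedupe_savings(four_copies):.1f}% saved")
```

In this toy case, four identical copies alone account for 75% savings, since three of every four blocks are duplicates.  Shared OS and Office blocks across forked images and older image versions would add to that, which is roughly the situation on data01.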

XenDesktop PVS Images

Volume data02 contains the images we’re using with Citrix Provisioning Services for the XenDesktop component of my project.  Like the XenApp PVS volume, this is probably one of the best candidates for de-duplication imaginable.  Our base image contains Windows 7 x64 SP1, Office, and a handful of plugins.  Aside from configuring a few variants of that to stream to different models of Dell computers, we haven’t really had cause to consider forking or expanding our base image.  I know we likely will, and I know at some point someone is going to make a good argument for giving some users persistent desktops, but for now I’m sticking with pooled desktops.

89% dedupe rates – wow.

[Screenshot: volume data02 at an 89% dedupe rate]
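To put those three percentages in perspective, here is a quick bit of arithmetic in Python that converts a “percent saved” figure into a logical-to-physical ratio.  It assumes the percentage means saved / (used + saved), which is how I understand the filer to report it; if your tool defines the number differently, the conversion changes.  The function name `dedupe_ratio` is made up for this sketch.

```python
def dedupe_ratio(percent_saved):
    """Convert a percent-saved figure into a logical:physical ratio.

    Assumes percent_saved = saved / (used + saved) * 100.
    """
    if not 0 <= percent_saved < 100:
        raise ValueError("percent_saved must be in the range [0, 100)")
    return 100.0 / (100.0 - percent_saved)


# Percentages quoted in this post for the first three volumes.
volumes = [
    ("nfsxen_os01", 79),
    ("nfsxen_data01", 88),
    ("nfsxen_data02", 89),
]

for name, pct in volumes:
    print(f"{name}: {pct}% saved is about {dedupe_ratio(pct):.1f}x")
```

Under that assumption, 79% works out to roughly 4.8x, 88% to roughly 8.3x, and 89% to roughly 9.1x, so a one-point gain near the top of the range is worth more than it looks.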

So is storage cool?  It definitely can be.  I’m going to keep an eye on these savings and percentages as we grow and especially as we run into more apps for which the promise of application virtualization doesn’t fully deliver and we end up baking more and varied applications into our images.

*Update* Out of curiosity, and because others have asked when I’ve talked about how well my environment is de-duplicating, I took a peek at one of the “general virtualization” volumes to see how it compares.  Only 40%, but that doesn’t surprise me, as that volume has 2003, 2008, and 2008R2 servers on it, as well as various development SharePoint & SQL Server instances.  Good news is that’s 40% we weren’t seeing on those environments before.  Even better news, for me, is that I get to keep all that varied and sundry stuff out of my much more standardized environment.


3 Responses to Storage – Who knew it could be so cool?

  1. friea says:

    AWE. SOME post, Mike. Really appreciate the detailed breakdown and summary.

  2. Mike, great post. It shows how much brains and engineering effort is required to take a reference architecture like FlexPod from PDF to reality – job well done!

  3. Mike Stanley says:

    Thanks, Stevie. I’ve seen the “FlexPod is only a reference architecture” phrase thrown around, but it didn’t play out that way for us. We solicited competing proposals from NetApp and EMC and purchased a complete product, delivered as such to our datacenters. It didn’t take any brains or engineering effort on my part to make that happen, just a P.O.

    My brains and engineering came into play post-delivery, just as they would have if we’d bought the FlexPod competitor.
