Of all the PC upgrades I’ve ever done, the one that has most notably improved the performance of my rig is, by a wide margin, installing an SSD. Whilst good old fashioned spinning rust disks have come a long way in recent years in terms of performance, they’re still far and away the slowest component in any modern system. The disk is a huge bottleneck that chokes most PCs’ performance, slowing everything down to its pace. The problem can be mitigated somewhat by using several disks in a RAID 0 or RAID 10 set, but all of those pale in comparison to even a single SSD.
The problem doesn’t go away for the server environment either, in fact most of the server performance problems I’ve diagnosed have had their roots in poor disk performance. Over the years I’ve discovered quite a few tricks to get around the problems presented by traditional disk drives but there are just some limitations you can’t overcome. Recently at work the issue of disk performance came to a head again as we investigated the possibility of using blade servers in our environment. I casually made mention of a company that I had heard of a while back, Fusion-IO, who specialised in making enterprise class SSDs. The possibility of using one of the Fusion-IO cards as a massive cache for the slower SAN disk was a tantalizing prospect and to my surprise I was able to snag an evaluation unit in order to put it through its paces.
The card we were sent was one of the 640GB ioDrives. It’s surprisingly heavy for its size, sporting gobs of NAND flash and a massive heat sink that hides the proprietary controller. What intrigued me about the card initially was that the NAND didn’t sport any branding I recognised (usually it’s something recognisable like Samsung) but as it turns out each chip is a 128GB Micron NAND flash chip. If all that storage was presented raw it would total some 3.1 TB, and this is telling of the underlying infrastructure of the Fusion-IO devices.
The total storage available to the operating system once this card is installed is around 640GB (600GB usable). To get that kind of storage out of the Micron NAND chips you’d only need 5 of them, but the ioDrive comes with a grand total of 25 dotting the board. No traditional RAID scheme can account for the amount of storage presented. So based on the fact that there are 25 chips and only 5 chips’ worth of capacity available, it follows that the Fusion-IO card uses quintuplet sets of chips to provide the high level of performance that they claim. That’s an incredible amount of parallelism, and if I’m honest I expected these chips to all be 256MB chips that were all RAID 1 to make one big drive.
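For anyone who wants to check my working, here is a minimal Python sketch of the chip arithmetic. The figures are the ones quoted above; the variable names are my own, and the 3.1 TB figure only works out if you divide by 1024 rather than 1000, which is an assumption on my part about how that number was rounded.

```python
# Figures quoted in the post; variable names are illustrative only.
CHIP_COUNT = 25          # NAND chips counted on the board
CHIP_CAPACITY_GB = 128   # capacity of each Micron chip
PRESENTED_GB = 640       # capacity presented to the operating system

# Raw capacity if every chip were exposed directly.
raw_gb = CHIP_COUNT * CHIP_CAPACITY_GB

# Assumption: the "3.1 TB" figure uses 1024 GB per TB.
raw_tb = raw_gb / 1024

# Chips needed to reach the presented capacity with no redundancy.
chips_needed = PRESENTED_GB // CHIP_CAPACITY_GB

# How many chips' worth of raw flash sit behind each chip's worth
# of presented capacity.
raw_to_presented = raw_gb / PRESENTED_GB

print(f"Raw capacity: {raw_gb} GB ({raw_tb:.3f} TB)")
print(f"Chips needed for {PRESENTED_GB} GB: {chips_needed}")
print(f"Raw to presented ratio: {raw_to_presented:.1f}")
```

The ratio of 5 is what leads me to the sets-of-five guess; it says nothing about how the controller actually lays data out across the chips.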
Funnily enough I did actually find some Samsung chips on this card, two 1GB DDR2 chips. These are most likely used for the CPU on the ioDrive which has a front side bus of either 333 or 400MHz based on the RAM speed.
But enough of the techno geekery; what’s really important is how well this thing performs in comparison to traditional disks and whether or not it’s worth the $16,000 price tag that comes along with it. I had done some extensive testing of various systems in the past in order to ascertain whether the new Dell servers we were looking at were going to perform as well as their HP counterparts. All of this testing was purely disk based using IOMeter, a disk load simulator that tests and reports on nearly every statistic you want to know about your disk subsystem. If you’re interested in replicating the results I’ve got then I’ve uploaded a copy of my configuration file here. The servers included in the test are the Dell M610x, Dell M710HD, Dell M910, Dell R710 and an HP DL380G7. All the tests (bar the two labelled local install) ran on a base install of ESXi 5 with a Windows 2008R2 virtual machine installed on top of it. The specs of the virtual machine are 4 vCPUs, 4GB RAM and a 40GB disk.
As you can see the ioDrive really is in a class all of its own. The only server that comes close in terms of IOPS is the M910 and that’s because it’s sporting 2 Samsung SSDs in RAID 0. What impresses me most about the ioDrive though is its random performance, which manages to stay quite high even as the block size starts to get bigger. Although it’s not shown in these tests, the one area where the traditional disks actually equal the Fusion-IO is throughput at really large write sizes, on the order of 1MB or so. I put this down to the fact that the servers in question, the R710s and DL380G7s, have 8 disks in them that can pump out some serious bandwidth when they need to. If I had 2 Fusion-IO cards though I’m sure I could easily double that performance figure.
What interested me next was to see how close I could get to the spec sheet performance. The numbers I just showed you are pretty incredible, but Fusion-IO claims that this particular drive is capable of something on the order of 140,000 IOPS if I play my cards correctly. Using the local install of Windows 2008 I had on there, I fired up IOMeter again and set up some 512B tests to see if I could get close to those numbers. The results, as shown in the Dell IO controller software, are shown below:
Ignoring the small blip in the centre where I had to restart the test, you can see that whilst the ioDrive is capable of some pretty incredible IO, the advertised maximums are more likely theoretical than practical. I tried several different tests and while a few averaged higher than this (approximately 80K IOPS was my best) it was still a far cry from the figures they have quoted. Had they gotten within 10 to 20% I would’ve given it to them, but whilst the ioDrive’s performance is incredible it’s not quite as incredible as the marketing department would have you believe.
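To put a number on that gap, here is a rough Python sketch using only the figures above: the 140,000 IOPS claim, my best run of roughly 80K IOPS and the 512B block size. The function names are hypothetical, and I have assumed decimal megabytes for the throughput conversion.

```python
# Figures from the post; function and variable names are illustrative.
ADVERTISED_IOPS = 140_000
BEST_MEASURED_IOPS = 80_000   # approximate best average observed
BLOCK_SIZE_BYTES = 512
TOLERANCE = 0.20              # the generous end of "within 10 to 20%"


def shortfall(advertised: float, measured: float) -> float:
    """Fraction by which the measured figure falls short of the claim."""
    return (advertised - measured) / advertised


def throughput_mb_per_s(iops: float, block_size_bytes: int) -> float:
    """Throughput implied by an IOPS figure at a fixed block size.

    Assumes decimal megabytes (1 MB = 1,000,000 bytes).
    """
    return iops * block_size_bytes / 1_000_000


gap = shortfall(ADVERTISED_IOPS, BEST_MEASURED_IOPS)
within_tolerance = gap <= TOLERANCE

print(f"Shortfall against the spec sheet: {gap:.1%}")
print(f"Within {TOLERANCE:.0%} of the claim: {within_tolerance}")
print(f"Throughput at 512B blocks: "
      f"{throughput_mb_per_s(BEST_MEASURED_IOPS, BLOCK_SIZE_BYTES):.2f} MB/s")
```

The throughput line is a reminder that tiny block sizes produce big IOPS numbers without moving much data, which is why the large block results matter just as much.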
As a piece of hardware the Fusion-IO ioDrive is really the next step up in terms of performance. The virtual machines I had running directly on the card were considerably faster than their spinning rust counterparts, and if you were in need of some really crazy performance you really couldn’t go past one of these cards. For the purpose we had in mind for it however (putting it inside an M610x blade) I can’t really recommend it, as it’s a full height blade that only has the power of a half height. The M910 represents much better value with its crazy CPU and RAM count, and the SSDs, whilst being far from Fusion-IO level, do a pretty good job of bridging the disk performance gap. I didn’t have enough time to see how it would improve some real world applications (it takes me longer than 10 days to get something like this into our production environment) but based on these figures I have no doubt it would considerably improve the performance of whatever I put it into.
The government department I’m currently working for recently embarked on buying a new HP Blade environment to upgrade their VMware cluster, something which I had a big hand in getting done. It was great to see after 5 months of planning, talking and schmoozing management that the hardware had arrived and was ready to be installed. My boss insisted that we buy services from HP to get it set up and installed, something which I felt went against my skills as an IT professional. I mean, it’s just a big server, how hard could it be to set up?
The whole kit arrived in around 27 boxes, 2 of them requiring a pallet jack to get them up to our build area. This was clearly our fault for not ordering them pre-assembled and was an extraordinary tease for an engineer like myself. I begrudgingly called up HP to arrange for the technician to come out and get the whole lot set up and installed. This is where the fun began.
After chasing our reseller and our account executive I finally got put onto the technician who would be coming out. At first I thought I was just going to get someone who knew how to build and install these things in a rack, something I was a bit miffed about spending $14,000 on. Upon his arrival I discovered he was not only a blade technician but one of the lead solution architects for HP in Canberra, and had extensive experience in core switches (the stuff that forms the backbone of the Internet). Needless to say this guy was not your run of the mill technician, something I’d discover more of over the coming days.
The next week was spent elbow deep in building, installing and configuring the blade system. Whilst this was a mentally exhausting time for myself I’m glad he was there. When we were configuring any part of the system he’d take us a step back to consider the strategic implications of the technology we were installing. I made no secret that I barely knew anything about networks apart from the rudimentary stuff and he did his best to educate me whilst he was here. After spending a week talking about VLANs, trunks and LACP I firmly understood where this technology was taking us, and how we could leverage it to our advantage.
Initially I felt very uncomfortable having someone constantly question and probe me about all the principles and practices of our network. I’m not one to like being out of control, and having someone who is leaps and bounds smarter than you doing your work makes you seem redundant. However this all changed after I got up to speed and started asking the right questions. It began to feel less like I was being lectured and more like I was being led down the right path. Overall I’m extremely happy with my boss’ decision to bring this guy in, as the setup I would have done without his help would have been nowhere near the level that it is today.
In any workplace it’s always hard to work with someone who’s a lot smarter than you, especially if they’re your subordinate. Whilst I can’t find the original source for this quote (paraphrased) I’ll attribute it to my good friend, Nick:
A bad manager will surround themselves with people who either agree with everything they say or aren’t as smart as them. A good manager will have a team of people who are much smarter in their respective fields than them and use their advice to influence their business decisions.
So whilst I felt inferior because my boss didn’t believe I was capable and the architect was leaps and bounds above my skill level in the end it turned out to be a great benefit to everyone involved. From now on I’ll be looking at decisions like this in a new light, and hope this is a lesson that all the managers out there can take to heart.