Well, ok, not counting the MSDN licence I had to play with, but the point is that thanks to the iSCSI support available for Windows 2008 R2 (the built-in initiator plus the Microsoft iSCSI Software Target), you can now build test clusters without having a ‘real’ (as in expensive) shared disk array, so you too can amaze your friends by live-migrating a virtual machine in front of their very eyes, or dispel your own lingering doubts that this stuff is all smoke and mirrors.
I used:
- 3x old Dell Optiplex 745s that we got for a song
- A 100Mb hub I borrowed from IT
- Er… that’s it
Using the Microsoft iSCSI Software Target on Windows Server 2008 R2, one box pretends to be a SAN. You could also use Windows Storage Server, or a higher-end NAS that supports iSCSI.
On the other two boxes I put Windows 2008 R2 with the Hyper-V role (I could have used Hyper-V Server instead). Using the out-of-the-box iSCSI initiator, I bound both of them to virtual drives I fronted up from the storage server, and after a few goes made a cluster (roughly as sketched below).
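For what it’s worth, the command-line version of that step looks roughly like the sketch below. Treat it as an illustration rather than a transcript: the portal IP, IQN, node names and cluster name are all made up, and I drove a fair bit of it through the GUI anyway.

```powershell
# Point the built-in iSCSI initiator at the box playing SAN
# (192.168.0.10 and the IQN are placeholders - iscsicli ListTargets
# shows what your target actually exposes).
iscsicli QAddTargetPortal 192.168.0.10
iscsicli ListTargets
iscsicli QLoginTarget iqn.1991-05.com.microsoft:storage-quorum-target

# Once both nodes can see the disk, validate and build the cluster
# (HV1, HV2, TESTCLUSTER and the address are placeholders too).
Import-Module FailoverClusters
Test-Cluster -Node HV1, HV2
New-Cluster -Name TESTCLUSTER -Node HV1, HV2 -StaticAddress 192.168.0.20
```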
I’m not going to do the blow-by-blow, because there are actually a couple of really good posts on doing this:
…though you will have to wade through them a bit, because the landscape has been changing, but before you know it (well, it took maybe a few days, on and off) you have a VM flitting from box to box like a sprite[1].
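If you’d rather kick the migration off from PowerShell than from Failover Cluster Manager, something like this should do it (the VM and node names are placeholders; I mostly just right-clicked in the GUI):

```powershell
Import-Module FailoverClusters

# Live-migrate the clustered VM "TestVM" onto the other node
# ("TestVM" and "HV2" are placeholders for your VM group and target node).
Move-ClusterVirtualMachineRole -Name "TestVM" -Node HV2
```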
Couple of things I will mention:
- The doco says you can’t do this with only one NIC per box, but you can. Wouldn’t want to in production, sure, but you can
- Through experience I suggest that the safest course is to have only the quorum disk target attached when creating the cluster (and add more disks later). That’ll prevent the wrong disk being used as the quorum disk, which I couldn’t work out how to prevent otherwise (though see the sketch after this list for fixing it after the fact)
- If you destroy the cluster (as I did, several times, when it kept getting the disks round the wrong way) and find your machines don’t talk to each other any more, try removing them from the domain and re-adding. Worked for me
- If something doesn’t work, don’t be an idiot like me and later try exactly the same thing again and waste a whole day rebuilding everything. Try it a different way :-/
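On that quorum point: once the cluster exists you can at least check which disk it picked as the witness, and re-point it if it grabbed the wrong one. A rough sketch (the disk resource name is a placeholder):

```powershell
Import-Module FailoverClusters

# See which disk (if any) the cluster is using as the quorum witness...
Get-ClusterQuorum

# ...and, if it grabbed the wrong one, switch to the disk you intended.
# "Cluster Disk 1" is a placeholder for your actual quorum disk resource.
Set-ClusterQuorum -NodeAndDiskMajority "Cluster Disk 1"
```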
[1] The actual moment of cut-over took the VM out for about 4 seconds, which isn’t terrible considering the appallingly low-spec setup I was running: disk, heartbeat and client access all hitting one NIC through a 100Mb hub. It was only getting about 7Mb/s on the disk too.
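(If you want to eyeball the gap yourself, the low-tech way is a continuous ping against the VM from another box while the migration runs, and watching how long the replies stop for:)

```powershell
# "testvm" is a placeholder for the VM's hostname or IP.
# Replies pause during the cut-over; the length of that pause gives a
# rough feel for the outage.
ping -t testvm
```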