The Major Lab Migration

Hyper-V   /   Feb 3rd, 2019
Recently, I decided that I needed to upgrade my lab from a single node to a cluster.  I was running out of resources on the single node, and it was a pain in the ass to have all of your stuff on a single node and still patch it.  I had previously used Cluster-Aware Updating in a cluster and thought it was time to give it a go.  I also had a few other requirements:

- No performance degradation
- Additional capacity (Primarily Memory)
- Highly available (obviously)

Let me start off by saying that the original box was the following:
System Manufacturer    Dell Inc.
System Model    Precision WorkStation T7500
Processor    Intel(R) Xeon(R) CPU           X5675  @ 3.07GHz, 3059 Mhz, 6 Core(s), 12 Logical Processor(s)
Processor    Intel(R) Xeon(R) CPU           X5675  @ 3.07GHz, 3059 Mhz, 6 Core(s), 12 Logical Processor(s)
Installed Physical Memory (RAM)    192 GB

It had an 8-port RAID controller with 8 SSDs connected to it, where I stored all of the VMs.  This volume was capable of sustaining a shit-ton of I/O bandwidth.  I never ran any formal throughput tests, but I do know it was wicked fast and plenty for what I needed it to do.  I had about 35 or so VMs running on it.

So, the migration.  How might one go about migrating these VMs while reusing the same storage and going from a single node to a cluster?  Well, here's what I did.  It worked, but it wasn't pretty.

First thing I did was shut down all of the VMs and export them to another file system (just in case).  Remember, this is a LAB for me, not production, and I treat it as such.  If I were to lose ALL of it, it would only hurt for a few hours while I spun up another.  However, I should probably point out that I'm not the only person who uses my lab.  There are other folks who use it, and, well, that pain might be a little harder to swallow for them.  Let's just say I did all this grunt work so they wouldn't have to rebuild the lab they've spent countless hours in testing things.  I suppose I also did it because I knew I would learn a little bit along the way.  I like a good challenge every now and then.

The game plan:
1. Move the RAID Controller / SSD drives to an already existing NAS appliance
2. Present the hardware RAID as an iSCSI device to the cluster
3. Make it a CSV disk so all nodes can access it in the cluster
4. Import the VMs without losing any configuration

The first step was easy.  I physically removed the RAID controller and drives from the system and installed them in my FreeNAS appliance.  The important part here is that when I connect to the iSCSI target (the FreeNAS appliance), I "see" both volumes: the one that has always been there, and this "new" one that I just added.  This could be a problem if the other server that connects to the target accidentally brings the disk online while the cluster is using it.  There is only one other system that accesses this target, but just to be on the safe side I decided it wouldn't be a bad idea to disable the disk in Device Manager on that system.
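If you'd rather not click through Device Manager, the same thing can be done with the PnpDevice cmdlets (available on Windows Server 2016 and later).  This is just a sketch — the "FreeNAS" friendly-name filter is an assumption; check what the disk is actually called on your system first:

```powershell
# Find the iSCSI disk by its friendly name (the filter shown is hypothetical)
$disk = Get-PnpDevice -Class DiskDrive |
    Where-Object FriendlyName -like '*FreeNAS*'

# Disable it so this system can't accidentally bring it online
$disk | Disable-PnpDevice -Confirm:$false
```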

Here is what that looks like in Device Manager on the other system:

The really interesting part about this is that this system is actually a VM running on the cluster.  I'll let that sink in for ya.

Neat.  Now from this "other" system I can't accidentally bring that disk online while the cluster is using it, potentially causing a catastrophe.  With it disabled, it doesn't even show up in Disk Management.

Presenting the physical disk as an iSCSI volume from FreeNAS was relatively simple.  Here's what that looks like:

Notice the 2nd extent for the device /dev/mfid0

Now on to accessing the storage from the cluster and importing the VMs.  Now is also a good time to point out that I originally added the extent as read-only (far right side of the screenshot above).  This was to protect the integrity of the data while treading waters that were theoretical and, for me, untrodden.  Yeah, it was a personal first that in theory should work, but I'd never done it before.  I've jacked around with storage enough to know that you take all the precautions you can.  That comes from the experience I picked up in the world of Business Continuity.  Also, remember that if you connect to the target while it is read-only and later change it to writable, you will have to disconnect the initiator from the target and reconnect.  The read-only flag is "decided" at mount time and stays in effect until the target is dismounted / reconnected.
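The cluster-side plumbing (steps 2 and 3 of the game plan) can be sketched with the built-in iSCSI and Failover Clustering cmdlets.  The portal address and CSV disk name below are hypothetical — substitute your own:

```powershell
# Point the initiator at the FreeNAS box (address is hypothetical)
New-IscsiTargetPortal -TargetPortalAddress '192.168.1.50'

# Connect to the target.  If you later flip the extent from read-only to
# writable on FreeNAS, disconnect and reconnect to pick up the change.
Get-IscsiTarget | Connect-IscsiTarget

# Bring the disk online, add it to the cluster, then promote it to a CSV
Get-Disk | Where-Object BusType -eq 'iSCSI' | Set-Disk -IsOffline $false
Get-ClusterAvailableDisk | Add-ClusterDisk
Add-ClusterSharedVolume -Name 'Cluster Disk 1'
```

Once it's a CSV, the storage shows up under C:\ClusterStorage\ on every node in the cluster.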

Moving on...  The most difficult part here was running the proper PowerShell commands.  I wouldn't consider myself a PowerShell junkie, as I'd much rather just write something in C# to get the job done.  Nonetheless, here goes.

The storage is referenced by each node in the following location:

I have configured the Hyper-V defaults on each node to use this new location as you can see here:

I'm not sure if this really matters, but it bugs the hell out of me if it isn't configured.  These two paths are probably the first thing I change immediately after installing the Hyper-V role.
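Those defaults can also be set from PowerShell on each node.  The node name is a placeholder; the paths match the CSV layout used later in this post:

```powershell
# Point the Hyper-V default paths on each node at the CSV
Set-VMHost -ComputerName 'NODE1' `
    -VirtualHardDiskPath 'C:\ClusterStorage\SSD01\Hyper-V\VHDs' `
    -VirtualMachinePath  'C:\ClusterStorage\SSD01\Hyper-V'
```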

Just to make sure the cluster and Hyper-V were operating normally, I made sure that I passed Cluster Validation and that I was able to live-migrate VMs to each of the nodes without a problem.  I also configured Cluster-Aware Updating, which is kinda nice.  It can sometimes be a pain in the ass, but configured properly it works pretty well.
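For reference, that sanity check looks something like this from PowerShell — the node and VM names are hypothetical:

```powershell
# Run full cluster validation (produces an HTML report)
Test-Cluster -Node 'NODE1','NODE2'

# Live-migrate a test VM to the other node and back again
Move-ClusterVirtualMachineRole -Name 'TestVM' -Node 'NODE2' -MigrationType Live
Move-ClusterVirtualMachineRole -Name 'TestVM' -Node 'NODE1' -MigrationType Live
```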

Ok, so now for the import.  The first thing you have to do is generate a compatibility report.  To do that, run the following PowerShell command:
$vmReport = Compare-VM -Path 'C:\ClusterStorage\SSD01\Hyper-V\Virtual Machines\0C42486A-184E-4031-BB59-FF500F0E5884.vmcx'
This puts the report in a variable so you can manipulate it later.

To view the report, just echo the variable:
$vmReport
That should spit out something similar to this:

The important part to note of the above screenshot is the two Incompatibilities: {40010, 33012}

Let's look at them to see what they are.  To do that, we inspect each element of the Incompatibilities array:
$vmReport.Incompatibilities[0] | Format-List
$vmReport.Incompatibilities[1] | Format-List
Those two commands should return something similar to the following:

The first one is telling us that it can't find the VHDX file for the VM and the second one is telling us that the network switch that the network adapter is connected to doesn't exist.

To fix those two incompatibilities, you simply modify the objects the report points at.  Run the following command to change the VHDX path:
Set-VMHardDiskDrive $vmReport.Incompatibilities[0].Source -Path 'C:\ClusterStorage\SSD01\Hyper-V\VHDs\NAME.vhdx'
Replace "NAME" with the actual name of the VHDX file.  Oh, yeah, you should ALWAYS name your VHDX files with a naming convention that indicates the drive location.  For instance, I use the following naming convention:
[VM NAME] - [DRIVE0].vhdx

I've also been known to include the adapter in the naming convention when using multiple or different controllers.  Something like this will suffice:
[VM NAME] - [IDE0] - [DRIVE0].vhdx
[VM NAME] - [SCSI0] - [DRIVE0].vhdx

Another consideration is to write a script that creates a text file in the root of the volume inside the VHDX to indicate what drive letter it is.  That way you can just mount the VHDX file and see what drive letter it was, to make sure it is correct in the VM.  If you don't use such a naming convention, then I wish you the best of luck in your migration.

Now to change the second incompatibility, I decided to just disconnect the adapter from the switch since I will add it to another switch later.  To do that, run the following command:
$vmReport.Incompatibilities[1].Source | Disconnect-VMNetworkAdapter
It should be pointed out that I'm using the elements of the Incompatibilities array, so make sure you use the correct elements.  Those elements are referenced as [0] and [1] in the commands above.  Also keep in mind that you may have more incompatibilities than I did; to view any additional elements, just increment the index [N].  You have to view each incompatibility to know what it is in order to fix it.
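Rather than indexing by hand, you can loop over the whole array — a quick sketch:

```powershell
# Dump every incompatibility with its message ID, text, and source object
foreach ($inc in $vmReport.Incompatibilities) {
    $inc | Format-List MessageId, Message, Source
}
```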

Now that we've modified the configuration, we have to run it through Compare-VM again to see whether we still have incompatibilities.  Run the following command:
Compare-VM -CompatibilityReport $vmReport
The output should look similar to the following:

Notice we no longer have any Incompatibilities listed.

Sweet, now let's import it.
To do that, run the following command:
Import-VM -CompatibilityReport $vmReport

That should show us something like this:

Done.  Now to write something that automates all of this to the Nth degree.
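The automation I have in mind would look something like this.  It's only a sketch: it assumes every VM needs the same two fixups (message IDs 40010 and 33012) and that the VHDX files keep their file names on the CSV — both things you'd want to verify per VM before trusting it:

```powershell
# Rough automation sketch: import every VM config found on the CSV
$configs = Get-ChildItem 'C:\ClusterStorage\SSD01\Hyper-V\Virtual Machines' -Filter *.vmcx

foreach ($config in $configs) {
    $report = Compare-VM -Path $config.FullName

    foreach ($inc in $report.Incompatibilities) {
        switch ($inc.MessageId) {
            40010 {
                # Missing VHDX: re-point at the same file name on the CSV
                $newPath = Join-Path 'C:\ClusterStorage\SSD01\Hyper-V\VHDs' `
                                     (Split-Path $inc.Source.Path -Leaf)
                $inc.Source | Set-VMHardDiskDrive -Path $newPath
            }
            33012 {
                # Missing switch: disconnect now, reconnect to a new switch later
                $inc.Source | Disconnect-VMNetworkAdapter
            }
        }
    }

    Import-VM -CompatibilityReport $report
}
```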

Update #1:

Dammit, I failed to mention the most important part of my plan.  In step 4 I said, "without losing any configuration".  What I mean by that is that when I boot the VM, it boots exactly the way it was originally booted.  No new hardware and whatnot to configure, such as NICs and the like.  Just turn it on and *Fred Savage*

The important step there is to make sure the virtual hardware looks exactly like it previously looked.  For the majority of my VMs that meant simply setting the MAC on the NIC to be static and carried over from the previous Hyper-V host.  That's easy enough.  Here's what that looks like (Obviously with a static MAC):
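Pinning the MAC is a one-liner; the VM name and MAC shown here are placeholders — use the MAC the adapter had on the old host:

```powershell
# Pin the adapter to the MAC it had on the previous Hyper-V host
Set-VMNetworkAdapter -VMName 'MyVM' -StaticMacAddress '00155D0A1B2C'
```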

Update # 2:

In my haste to get the blog published, I also failed to fulfill one of my requirements.  Well, I suppose I failed to discuss it in this post.  Basically, I forgot to configure the new VM role as highly available in the cluster.  At this point the VM is only "visible" to the node we imported it on.  To fix that, you select "Configure Role..." in the top right-hand corner of Failover Cluster Manager.

After you close the "Before You Begin" window without reading it, you select Virtual Machine from the list of roles as shown here:

After moving on, Cluster will look for VMs that are currently not highly available and present you with a list that you can choose those in which you want to configure.  That screen looks a little like this one:

Afterward, you are asked to confirm the VMs, and then the cluster does its thing.

In the summary of the High Availability Wizard, you will have an option to view the report of the change to understand if there are any additional things you need to do to ensure they're highly available.  Most of the time, I see warnings indicating that an ISO was left in a drive and I just need to remove it.
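If you'd rather skip the wizard entirely, the same thing can be done in one line ('MyVM' is a placeholder):

```powershell
# Make an existing VM highly available without the High Availability Wizard
Add-ClusterVirtualMachineRole -VMName 'MyVM'
```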