Virtualization Pro

Nov 3 2008   7:27PM GMT

Replacing a VMware ESX SAN

Edward Haletky (Texiwill)

Replacing or upgrading a SAN is no trivial task. In this blog post I outline the tried-and-true steps for replacing a SAN, including one key step that will ensure a successful switch.

I recently upgraded from an HP MSA 1000 to an IBM DS3400 because I wanted to improve performance and lower my overall energy costs. A single 2U device is much cheaper to run than the three devices that made up the old SAN. In addition, I dropped from 42 drives to 12 drives while gaining storage: capacity went from 2.5 TB to 3 TB, which is not a huge increase. At a minimum, my SAN power costs should drop to about one-third of the original.
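The arithmetic behind those numbers can be sketched in a few lines of shell. The one-third power figure assumes that each device draws roughly the same power, which is a simplification; the variable names are only illustrative.

```shell
# Figures taken from the post.
old_devices=3
new_devices=1
old_drives=42
new_drives=12
old_tb_tenths=25   # 2.5 TB, in tenths of a TB to keep integer math
new_tb_tenths=30   # 3 TB

# Assumption: every device draws about the same power.
power_pct_of_old=$(( new_devices * 100 / old_devices ))
drive_cut_pct=$(( (old_drives - new_drives) * 100 / old_drives ))
capacity_gain_pct=$(( (new_tb_tenths - old_tb_tenths) * 100 / old_tb_tenths ))

echo "power vs. old SAN: about ${power_pct_of_old}%"
echo "drive count reduction: about ${drive_cut_pct}%"
echo "capacity gain: ${capacity_gain_pct}%"
```

Shell integer division truncates, so the percentages are rounded down.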

I will know next month if my energy cost reductions have been realized and will report back then.

The steps for replacing a SAN are not all that tricky, but I hit a single gotcha that careful planning would have avoided. To replace a SAN, follow these steps:

1. Plug the new SAN into your existing fabric. Luckily I had a pair of unused fibre connections and GBICs available; otherwise this would have meant another expense and a delay until the cables and GBICs arrived.

2. Find a system on which to install the management console. For the IBM DS3400, I chose my VirtualCenter and VMware Consolidated Backup (VCB) server. There are two ways to manage the IBM DS3400: in-band, over the fibre channel fabric, or out-of-band, using Ethernet. Even a VM would suffice for out-of-band management, provided it has network connectivity to the SAN. Management software exists for both 64-bit Linux and Microsoft Windows.

3. Create the LUNs on the new SAN. This is a good chance to correct any problems you may have with the LUN configuration on the old SAN. I did a one-to-one mapping, except I slightly increased the size of the LUNs.

4. Present the LUNs to your VMware ESX host(s) and VCB server(s).

5. Rescan the storage adapters for new LUNs using the VMware Infrastructure Client (VI Client) for the first VMware ESX host. Once this is completed, you can then add as many Virtual Machine File Systems (VMFSs) as required.

6. Rescan the storage adapters for new LUNs and VMFSs using the VI Client on all the other ESX hosts.

7. Use Storage VMotion via the VI Client to migrate VMs from the old LUNs to the new LUNs. Storage VMotion requires no VM downtime, but you need the patience to move the VMs one by one. If you lack that patience, you can copy the files by other means, but that option requires you to power off all the VMs and then edit the VMX file of each migrated VM to change the location of its virtual disk files. There are scripts that do this editing for you. Whichever method you choose, be sure to move all files off the LUNs in use.

8. For a LUN with an RDM (mine was a Linux file server), use Storage VMotion to move any VMDKs related to the VM. Then map the new RDM to the VM; you will have to reboot the VM to complete this. Next, create a new filesystem on the new RDM and mount it. Finally, copy all the files from the old RDM to the new RDM. A command like the following copies everything from /files to /files2, hidden files included:

  • rsync -av /files/ /files2/

9. I then changed the /etc/fstab entry for /files to point at the new RDM. Finally, I powered off the VM, deleted the old RDM from the VM, and powered the VM back on so that it picked up the new data.
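Before deleting the old RDM in step 9, it is worth confirming that the copy from step 8 is complete. The following is a minimal sketch that uses diff -r rather than rsync for the comparison; verify_copy is a hypothetical helper name, and the check covers file names and contents only, not ownership or permissions.

```shell
# verify_copy SRC DST
# Succeeds only if both directory trees hold the same files with the same
# contents. Ownership and permissions are NOT compared.
verify_copy() {
    src=$1
    dst=$2
    if diff -r "$src" "$dst" >/dev/null 2>&1; then
        echo "OK: $dst matches $src"
    else
        echo "MISMATCH: $dst differs from $src"
        return 1
    fi
}

# Example, using the mount points from step 8:
#   verify_copy /files /files2
```

Run it while nothing is writing to the old file system, otherwise a late change will show up as a mismatch.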

Here is the gotcha. I missed it, but knowing it will be extremely useful for you (and me) going forward: remove the old SAN’s LUNs from each VMware ESX host before you disconnect the old SAN. If you miss this step, then when you finally disconnect the old SAN the ESX hosts will constantly attempt to fail over the old LUNs, spewing massive numbers of failures into the log files. If this happens, there is no recourse but to reboot the VMware ESX hosts.
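One way to make this step hard to forget is to write the per-host work down before touching the old SAN. The sketch below only prints a plan and runs nothing. It assumes the old LUNs are first unpresented from the hosts on the old array's management side, and that each host then rescans its adapters with esxcfg-rescan from the ESX service console. The host and adapter names are hypothetical.

```shell
# Hypothetical host and adapter names; substitute your own.
ESX_HOSTS="esx01 esx02"
HBAS="vmhba1 vmhba2"

# Print (do not run) the rescan each host needs once the old LUNs
# are no longer presented to it.
print_rescan_plan() {
    for host in $ESX_HOSTS; do
        for hba in $HBAS; do
            echo "$host: esxcfg-rescan $hba"
        done
    done
}

print_rescan_plan
```

Only once every host has rescanned and no longer lists the old LUNs should the old SAN be disconnected.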

Now the SAN has been replaced. With the exception of dealing with any RDMs, it is possible to migrate to a new SAN without any downtime.
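Because the VMs are moved one at a time, it helps to plan the moves up front from the one-to-one LUN mapping. The following is a rough sketch of such a plan; the VM names, datastore names, and the inventory, target_for, and plan_moves helpers are all made up for illustration, and the script only prints what should be moved where.

```shell
# Hypothetical inventory: one "vm-name current-datastore" pair per line.
inventory() {
    cat <<'EOF'
web01 msa_lun1
db01 msa_lun2
file01 msa_lun1
EOF
}

# One-to-one mapping from old datastore to new datastore (made-up names).
target_for() {
    case $1 in
        msa_lun1) echo ds3400_lun1 ;;
        msa_lun2) echo ds3400_lun2 ;;
        *) return 1 ;;
    esac
}

# Print one line per VM: where it lives now and where it should go.
plan_moves() {
    inventory | while read -r vm old_ds; do
        if new_ds=$(target_for "$old_ds"); then
            echo "move $vm: $old_ds -> $new_ds"
        else
            echo "skip $vm: no target for $old_ds"
        fi
    done
}

plan_moves
```

Ticking off each printed line as its Storage VMotion completes also gives a simple check that no VM is left on an old LUN.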

3  Comments on this Post

  • Tdimaggio
    Enjoy the decrease in performance going from 42 spindles to 12. THAT'S a gotcha.
  • Texiwill
    Hello, Actually not, I now have more spindles per LUN than before. The old HP MSA did not support meta-luns so there were only 3 spindles per LUN on the primary drive tray. There was also a major loss of performance going to the secondary and tertiary disk trays. Overall we have an increase in performance with the new SAN and better disk technology. Everything depends on the # of spindles per LUN and where or how they are connected. Best regards, Edward L. Haletky AstroArch Consulting, Inc.
  • Tdimaggio
    Gotcha....Actually, you can stripe a LUN on an MSA1000 over more than 3 disks but because of the lackluster redundancy on that particular SAN, it's too risky to do. I went from an MSA1000 to a NetApp 3020C and was in the same boat. 3 spindles per LUN. Check out running ESX on NFS as well....performance is as good as fiber channel *IF* you have multiple ESX servers accessing the same LUN - the sweet spot seems to be SCSI reservations, and you save a ton by not needing HBAs and fiber switches.
