The Real (and Virtual) Adventures of Nathan the IT Guy:

Storage

Oct 7 2009   1:55AM GMT

Understanding VMware Snapshots.



Posted by: Nathan Simon
vmware, ESXi, ESX, Basic System Administration, commiting snapshots, how much space is needed to commit a snapshot, insufficient space to commit snapshot, Storage, vmware backup, delta file, snapshot manager

I wanted to clear the air on snapshots. Everyone is always asking how much space it takes to remove a snapshot, but no one has a real answer… Whatever the answer may be, the information below, taken from the Basic System Administration PDF (readily available from VMware; you can download it from this link here), will help explain how snapshots work and what can be done if you run out of space and cannot commit a snapshot. Please take your time reading through it and you will get a much better understanding of snapshots. The information below is copyright VMware. I have in no way altered its content.

“The Understanding Snapshots section does not include information on delta disks. The section should contain the following content:

To take a snapshot, the state of the virtual disk at the time of taking the snapshot must be preserved. When this occurs, the guest operating system cannot write to the VMDK file. The delta disk is an additional VMDK file where the guest is given write access. The delta disk represents the difference between the current state of the virtual disk and the state at the time of the previous snapshot. If more than one snapshots exist, delta disks might represent the difference (or delta) between each snapshot. Also, the guest can write to every single block of the virtual disk causing the delta disk to grow as large as the base VMDK of the virtual machine.

NOTE To consolidate all snapshots into the base virtual machine, you might need extra disk space, as large as the base VMDK.

When a snapshot is deleted, if a user chooses to merge the changes between the snapshots to the previous disk‐state, all the data from the delta disk that contains the information about the deleted snapshot is written to the parent disk. This might involve a large amount of disk I/O and might reduce the virtual machine performance until consolidation is complete.

If the user chooses to ignore the delta disks, delta consolidation is not required.

See VMware Knowledge Base system for more information on the iterative snapshot deletion behaviour. I’ve included the details of KB article 1003302.

Details

If you try to initiate a Delete All snapshot for a virtual machine using Snapshot Manager, and if that virtual machine is on a datastore that does not have sufficient space for the snapshot, the following message displays in VMware Infrastructure (VI) Client:

msg.hbacommon.outofspace: there is no more space for the redo log of <VMname>-0000xx.vmdk.

You are given the option to abort or retry.

  • If you choose Abort, the virtual machine is powered off, the snapshot is aborted, and a Consolidate Helper snapshot is created. The Snapshot Manager UI displays that Consolidate Helper snapshot. You can delete the Consolidate Helper snapshot after you have made space available.
  • If you click Retry, the Snapshot Manager returns to Consolidate Helper snapshot mode unless you have made more disk space available.

Solution

Free up disk space if possible, or extend the VMFS volume using VI Client.

To extend the VMFS volume:

  1. Select the host on which the virtual machine resides and click the Configuration tab.
  2. Select the datastore on which the virtual machine resides and click Properties.
     Note: If there is no available storage, a new LUN must be presented to every ESX host that can see the LUN.
  3. In the dialog that appears, click Add Extent and follow the prompts in the Add Extent wizard to add an extent.
  4. Perform a rescan on every ESX host that is being presented the new LUN so that the addition of the extent is detected.
  5. After you have extended the VMFS volume, you can check the Retry option of the Redo log pop-up.

Caution: When using Delete All in the Snapshot Manager, the snapshot furthest from the base disk is committed to its parent, causing that parent snapshot to grow. When that commit is complete, that snapshot is removed and the process starts over on the newly updated snapshot to its parent. This continues until every snapshot has been committed. This can lead to an aggressive use of additional disk space if the snapshots are large. Use care when exercising this option if there is not much space available on the datastore.”
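To put rough numbers on the Delete All behaviour described in the caution above, here is a minimal sketch of the iterative commit. It is my own simplified model, not anything from VMware: it assumes no blocks overlap between delta disks (so a parent grows by the full size of the child committed into it), that a delta never grows past the size of the base VMDK, and that the base disk is preallocated and does not grow. The function name and the sizes are made up for illustration.

```python
def delete_all_peak_extra_space(base_gb, delta_gbs):
    """Estimate the worst-case extra datastore space (GB) used during Delete All.

    base_gb   -- size of the base VMDK
    delta_gbs -- delta disk sizes ordered from the base outward
                 (the last entry is the snapshot furthest from the base)

    Assumptions (a simplified model, not VMware's algorithm): no block is
    shared between deltas, a delta never exceeds the base VMDK, and the
    preallocated base disk does not grow.
    """
    deltas = list(delta_gbs)
    start_total = sum(deltas)
    peak_extra = 0.0
    while len(deltas) > 1:
        child = deltas[-1]
        parent = deltas[-2]
        # Worst case: the parent grows by the whole child, capped at the base size.
        grown_parent = min(parent + child, base_gb)
        # While the commit runs, the grown parent and the child both exist on disk.
        in_use = sum(deltas[:-2]) + grown_parent + child
        peak_extra = max(peak_extra, in_use - start_total)
        # The child is removed and the process repeats on the updated parent.
        deltas = deltas[:-2] + [grown_parent]
    return peak_extra


# Hypothetical example: 100GB base disk with three snapshots of 10, 20 and 30GB.
print(delete_all_peak_extra_space(100, [10, 20, 30]))  # 50
```

In this made-up case the datastore would need about 50GB free on top of what the snapshots already use, which is why large snapshot chains can hurt on a nearly full datastore.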

Feb 25 2009   2:55PM GMT

WinDirStat



Posted by: Nathan Simon
Windows Directory Statistics, Space audit, portable app, Hard drive space

Windows Directory Statistics is an app made freely available from Bernhard Seifert and Oliver Schneider.

I use this program a lot, mainly since it became available on PortableApps.com.

This useful little app lets you audit hard disk space and sorts the directories by the amount of space they take up.

You can then right-click a folder to browse the options for that selection, like “Explore Here”, “Command Prompt Here”, “Delete (to Recycle Bin)”, or “Delete (no way to undelete)”.
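If you just want the basic idea without the GUI, the core of what the app does (add up file sizes per folder and sort them) can be sketched in a few lines of Python. This is only an illustration of the concept and has nothing to do with how WinDirStat is actually written; the function names are my own.

```python
import os


def tree_size(path):
    """Return the total size in bytes of all files under path."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # Skip files that vanish or cannot be read.
                pass
    return total


def directory_sizes(root):
    """Return (path, total_bytes) for each immediate subdirectory of root, largest first."""
    results = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                results.append((entry.path, tree_size(entry.path)))
    results.sort(key=lambda item: item[1], reverse=True)
    return results
```

Calling directory_sizes on a folder of your choice and printing the first few entries gives you the biggest space hogs directly under it.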

If you’d like to try this app, download it here.

NS


Feb 6 2009   9:24PM GMT

ESX 3.5, IDE, and Me



Posted by: Nathan Simon
ESX, ESX 3.5, sata, ide, NCQ, ESX and IDE, upgrade path, vmfs

What the heck am I talking about, you say? Well, I decided to load ESX 3.5 on a workstation of mine so I could do a test upgrade from 3.5 to 3.5 Patch 3, mainly because I want to make sure VMFS and the VMs are retained after upgrading to said patch level.

So I commenced installing ESX 3.5 on my 40GB IDE drive. Yeah, I said IDE. Who uses those anymore anyway? Apparently I do. The install went fine for the most part, until it said that I wouldn’t be able to use the current drive as a datastore (without advanced configuration). Anyway, on I went; I finished the install and all was fine. I was able to connect using Virtual Infrastructure Client, but then I saw the message that ViC could not find any static storage, click here to configure a datastore. I clicked on the link and of course nothing was there… I did some searching and it turns out that ESX 3.5 does not support IDE drives as a datastore, reportedly because NCQ (Native Command Queuing) is missing from IDE drives; SATA drives, however, will work fine. So what I ended up doing was installing an 80GB SATA drive (yes, small, I know, but it’s only for testing purposes) and rebooting the ESX machine from ViC. Upon rebooting it found the new hardware and I was able to use ViC to add the new datastore… all 80GB of it, all right!

Moral of the story… SATA/SCSI okay, IDE not okay :)

Till next time,

NS


Jan 28 2009   3:56AM GMT

Symantec Backup Exec Fun



Posted by: Nathan Simon
Symantec Backup Exec 12.5d, SGMON.EXE, backup exec compression, compression not enabled, how to check for compression, backup failing, Symantec Backup Exec, Symantec Backup Exec Speed issues, tape backup

Backup Exec is a great application, my personal favorite. Recently I had issues with a backup, as you can see in one of my previous posts… the backup works fine, although compression will not turn on. Most Backup Exec experts know that Symantec Backup Exec enables compression when the job starts (if compression is configurable), but how do YOU know whether Backup Exec is doing its job and the drive is possibly to blame?

Well, here is the answer: it’s called SGMON, a debug tool used to view debugging information for Backup Exec components and services. SGMON is in “C:\Program Files\Symantec\Backup Exec\SGMON.exe” for Backup Exec 12.5d installations. In earlier revisions it’s buried in the Symantec Backup Exec folder; your best bet is to just search for sgmon.exe and you’ll find it easily enough.

Run SGMON and check off Job Engine, as below…

Click “Capture to File” as well. Leave SGMON running while you start a job and wait for the job to finish. The next thing you need to do is locate the log file. For Backup Exec 12d it’s located here: “C:\Program Files\Symantec\Backup Exec\Logs\servername-SGMON.log”

Do a search in the log file for compression, as below…
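If the log is large, the search can also be scripted. Here is a minimal sketch that assumes the SGMON capture is a plain text file; the UTF-8 encoding is a guess on my part, so if the file turns out to be UTF-16 you would change the encoding argument. The function name is my own.

```python
def find_lines(log_path, keyword="compression"):
    """Return (line_number, text) for every log line containing keyword (case-insensitive)."""
    needle = keyword.lower()
    matches = []
    # errors="replace" keeps the scan going if the log contains odd bytes;
    # the real encoding of the log file is an assumption here.
    with open(log_path, "r", encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if needle in line.lower():
                matches.append((number, line.rstrip("\r\n")))
    return matches
```

Point it at the servername-SGMON.log file from the Logs folder mentioned above and print the returned lines to see every place the job engine mentions compression.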

Once you are sure that Backup Exec is doing what it is supposed to do, and compression still isn’t working, as you can tell under the Media tab for the appropriate Media Set as shown below… the drive is the more likely culprit. Mind you, when compression just isn’t working you will see 1:1 instead of 2.3:1.

SGMON can be used for many things; you can find the official documentation here.

Anyways, I’m off for now.

Any questions, you know what to do!

NS


Jan 6 2009   3:54AM GMT

Slow system…?



Posted by: Nathan Simon
Nforce, antech, 550watt, power supply, slow system, pentium dual core, pentium 4, sata, RAID stripe size, Windows XP x64

Well, have you ever checked the hardware monitor section in your BIOS? It will show you what your power supply is running at, along with fan speeds and CPU temperatures. If left unchecked, most BIOSes will not warn you when there is an issue with the power supply, depending on the severity of course. For instance, I had this Pentium D system, Pentium 4 at 3.6GHz, running a SATA stripe. The chipset was nForce4. The 550 watt power supply installed was made by Antec. A pretty good system considering it was one year old. We decided to install Windows XP x64 Edition and immediately noticed slowdowns, and programs wouldn’t install properly. Well, we tinkered with some settings for a while and finally decided to pull the plug. I wish I had done it earlier, but hey, hindsight is 20/20, right? No one’s perfect.

Anyways, the moral of the story is that the power supply wasn’t supplying the mainboard with the proper voltage on the 5V rail; it was out by more than 5%, and that is bad for the motherboard, CPU, and I/O.
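The 5% check is simple arithmetic: on a 5V rail, anything outside roughly 4.75V to 5.25V is out of range. Here is a small sketch of that calculation; the function names are my own and any readings you feed it are whatever your BIOS hardware monitor reports, not values from my system.

```python
def rail_deviation_percent(nominal_volts, measured_volts):
    """Return how far a reading is from nominal, as a signed percentage."""
    return (measured_volts - nominal_volts) / nominal_volts * 100.0


def within_tolerance(nominal_volts, measured_volts, tolerance_percent=5.0):
    """Return True if the reading is within +/- tolerance_percent of nominal."""
    deviation = rail_deviation_percent(nominal_volts, measured_volts)
    return abs(deviation) <= tolerance_percent


# Hypothetical readings for a 5V rail, for illustration only.
print(within_tolerance(5.0, 4.9))  # True
print(within_tolerance(5.0, 4.6))  # False
```

A reading that fails this check, or that bounces around between reboots, is a good reason to suspect the power supply before blaming the OS.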

We ended up replacing the power supply, and the 5V rail is now stable.

Unfortunately we’ll have to reinstall the OS, as there is some residual corruption. Can someone say FORMAT!!! :)

NS


Jan 5 2009   1:06AM GMT

IBM ServeRaid



Posted by: Nathan Simon
ServeRaid, Windows Server 2003, tape backup, backup to disk, dumplog, ibm, xSeries, ServeRaid Manager

Here’s an interesting issue that I came across. I have a few clients who are still using IBM servers; the server referred to here is an x226 with a ServeRaid 7 controller.

The client said that when he came in to work in the morning to check the backups, he would have to enter a reason for why the server had shut down. He mentioned that the backups had also not been working for the last week or so. Just to let you know, this wasn’t my fault; the client is not on managed services and it is up to him to monitor things and let me know if they are having issues. Anyways, back to the blog :). We came in to troubleshoot the backup and found that we could run a test backup. Just in case, a backup to disk was also set up. The next morning it was reported that the backup had failed again.

I went onsite and used IBM UpdateXpress; this CD provides an all-in-one firmware update for all supported components contained within the server. You can download it here. I updated the server’s BIOS, the ServeRaid BIOS/firmware, and also the firmware of the drives themselves.

I also ran an app from IBM’s site called Dumplog; this will “dump” the configuration and event logs from the ServeRaid controller. Don’t try to decipher any of the info in the txt file; you need to send it to IBM and they will tell you what your next step is based on the info contained within that file. Download it here.

Well, to make a long story short… the server would crash when stressed with I/O. I figured it had to be the controller, so I ran the onboard diagnostics, and sure enough the ServeRaid test failed. I exported the test log to a text file so I could send it to an IBM tech. Once the IBM tech saw the dumplog files, he was able to tell me that a specific drive was failing, although the drive was not reporting it to the controller properly, and thus the global hot-spare wasn’t kicking in. I ended up running ServeRaid Manager and marking the bad drive defunct, then I pulled the drive out of the server. The global hot-spare then kicked in and the rebuild started.

All seems well. It would have been nice if the drive had just marked itself bad in the beginning; the issue would have been resolved much faster.

IBM Tech Support requires firmware and drivers to be up to date before they will really help you, so everything I did needed to be done. IBM is now sending a tech onsite to replace the drive and also the tape drive, as it still didn’t work in the end. A backup-to-disk job was configured before going off site.

Till next time!

NS


Dec 3 2008   11:00PM GMT

Symantec Backup Exec - Backup to Disk Issue



Posted by: Nathan Simon
Storage, Microsoft Windows, IT professional

By default, a backup-to-disk folder is set up with a maximum size of 1GB for backup-to-disk files and a maximum of 100 of these. Well, what happens if you have a backup of, say, 265GB? It will eventually have written 100 1GB files, prompting Backup Exec to ask for another drive (which may confuse many).

What you do is double-check your Backup-to-Disk Folder settings under the Devices tab.

You will notice the points below

- Maximum size for backup-to-disk files was set to 1GB

- Maximum number of backup sets per backup-to-disk file was 100

Change them to these settings

- set Max size to 100GB (if NTFS; on FAT32 the max file size will only be 4GB, at which point I would set the max number of backup-to-disk files much higher)

- set Max number to 4

These settings will allow backups of up to 400GB after compression; that’s about 850GB of data. Set the two settings above to whatever your system can handle, just make sure you account for enough space!
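The arithmetic behind those numbers is just maximum file size times maximum number of files. Here is a quick sketch for checking your own settings before a job runs; it follows my reading of the two settings above (file size and file count), the function names are my own, and the sizes are the ones from this post.

```python
def b2d_capacity_gb(max_file_size_gb, max_files):
    """Return the most data (GB) a backup-to-disk folder can hold with these settings."""
    return max_file_size_gb * max_files


def backup_fits(backup_size_gb, max_file_size_gb, max_files):
    """Return True if a backup of this on-disk size fits within the folder settings."""
    return backup_size_gb <= b2d_capacity_gb(max_file_size_gb, max_files)


# Defaults from the post: 1GB files, 100 of them.
print(b2d_capacity_gb(1, 100))        # 100
print(backup_fits(265, 1, 100))       # False
# Suggested settings: 100GB files, 4 of them.
print(b2d_capacity_gb(100, 4))        # 400
print(backup_fits(265, 100, 4))       # True
```

With the defaults, a 265GB backup runs out of room at 100GB, which is the point where Backup Exec starts asking for another drive.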


Dec 3 2008   6:06AM GMT

Taking advantage of Roaming Profiles



Posted by: Nathan Simon
Networking, Storage, Microsoft Windows, IT professional

Well, one thing about roaming profiles is that they can be used to deploy new workstations more efficiently. If done correctly and with the proper precautions, you can harness the power of the almighty Roaming Profile!

When utilizing roaming profiles, you save the time of copying favorites, desktop icons, My Documents, and Outlook settings (usernames, account settings, and autofill settings).

So how I went about doing this on my latest desktop refresh was to enable roaming profiles for each user in Active Directory. MS Technet Article here.
Then I would go to each workstation and use a great free tool called ATF Cleaner (link here) to clean out the profile. I usually leave the cookies; who knows if they have banking sites or passwords that they want saved!

Once the profile is free of the THOUSANDS of temporary files and temporary internet files, I log off their profile, then log back on again; a subsequent log off will enable the roaming profile. During this second log off the profile gets copied up to the server. The one thing I noticed was that with POP Outlook accounts, the PST did not come over. Other than that, when logging onto the new system the desktop icons, favorites, and My Documents were all there. Everything was intact. Email was a snap: Outlook imported all the settings and created a new PST file (which I didn’t want), so all I did was rename the PST file it created and point Outlook to the old one (which was copied up to the server).

Some custom applications were configured and away I went.

I am sure I must have saved at least an hour of work per workstation. You know how some people can be when it comes to new systems: they want everything they had before…

Roaming Profiles are also a good way to keep people from losing important documents, because we all know that no matter how many times you tell people, they still like to avoid the server drive in favor of My Documents! Little do they know that we are backing up their files nightly ;)

Okay well after a 14 hour day of work I really need some sleep.

Thanks for reading!

NS


Oct 28 2008   4:04AM GMT

Racked My First C3000 Blade Enclosure



Posted by: Nathan Simon
Networking, Storage, Virtualization, IT professional

Well, it’s done. I officially racked my first HP C3000 Blade Enclosure with an MSA (Modular Smart Array). I tell you, when they say you need to follow the instructions to rack this beast, they are right… It took 4 techs to carry this 180lb server, and it wasn’t even fully populated yet! The first thing we had to do was remove all the power supplies and spacers, then we had to unscrew the actual guts of the blade enclosure. Once the enclosure was completely empty, just an outer shell, it took 2 of us to rack it. After it is racked you get to put it back together: power supplies, MSA controller, GBE2 interconnects, then the blades themselves. It was quite an exciting time for me, as I had never actually worked with this kind of technology.

The setup follows; I am going to be quick and dirty on the description!

The C3000 Blade Enclosure has 2 blades; each blade has a quad-core 2.333GHz processor, 16GB RAM, and 2x 72GB (RAID 0) HDDs. The C3000 is connected to an MSA using 2x GBE2 interconnects. There are 2 controllers on the MSA, so we were able to create redundant connections from the GBE2 switches to the MSA (just in case one port dies). The GBE2 switches were also redundantly connected to a Cisco Layer 3 switch. All of this is plugged into a nice HP R5500VA rack-mounted UPS.

We are going to run ESX Foundation on each blade, and each blade will control 600GB LUNs on the MSA. No automatic VMotion, but at least if a blade drops we can move the VMs to the other store and boot them up… pretty good redundancy for a sound price, which is withheld! This is only the beginning!

Anyways I hope I didn’t bore you guys, and I really hope I made SOME sense.

Take care and again have a great night!