There are days you just wish you had not got up early. Wednesday morning I had a site that needed Windows updates, so I was up around 2:30 and started updating all the servers. This was the only time of day we could do it. I got them all done by 5:30 or so, then checked OWA, the TS Gateway and the different programs, and it all looked good. Or so I thought. About 8 we got the call that users could not access their email or user files. I popped in and looked around, and it looked good from the server perspective anyway.
So they sent someone on-site, and they couldn’t find anything, so they opened a server-down call with MS and worked on it much of the day and into the night. I got an email from the on-site tech about 9 asking if I could relieve him at midnight if nothing had changed. I checked, nothing had changed, so I headed on-site and got there a little after midnight. The MS guy was dwelling on the networking of the server. I watched for a while and could see one of the NICs showing as failed, and when I looked closer it was actually the network team that was broken. We got the HP updates and updated the server with the latest and greatest, then rebooted. Still no go, so I suggested removing the team, which the MS guy thought was a good idea. We removed it, set the right IP on one of the adapters, disabled the other, rebooted the server, and life was good. We got the virtual servers back online and everything working. Moral of the story: make sure you have your HP server updated with the newest updates, as some of the MS updates are not compatible with older drivers and firmware.
Til later just Roger
I know I keep promising to finish this series, and I will get to it; it has just been a little crazy around here. February and March are big months for birthdays around the house, and this last weekend was finally the last one for a while. It was my wife’s youngest son’s 7th birthday, and of course it turned into a friends sleepover on Friday night. Five friends, all 6 and 7 years old, so it was a little noisy around here. Then Saturday we took those guys home and I went and picked up my oldest grandson, who is 7, and we went to the Harlem Globetrotters in Omaha. That was fun, and they are always entertaining. Yesterday I spent exploring Windows Server 2008 R2, and there will be more coming on that also. Plus I decided to load Windows Server 2008 R2 on my test Hyper-V server as the host OS. So hang tight, be patient, and I will get these out. Nice calm weekend around here it was.
Til Later just Roger
Have you checked out the Server 2008 R2 Beta yet? I downloaded it today and have been working on getting it installed. There are two things I am interested in with this build: the new features for Hyper-V, and the new features for Terminal Services, or as MS now calls it, Remote Desktop Services. It looks like they have tied some of the TS stuff in with the new Windows 7 client OS, so it will be cool to see where that goes, as I have clients now using the TS Gateway and TS RemoteApp.
Also I am interested in Live Migration in Hyper-V, as we are moving into this heavily with some of our customers. It should be interesting to see where this all goes. Back to some more playing with the public beta I downloaded, and I will let you know what I think.
Til later just Roger
I know I have been talking about the project I have going on at one customer site. We got an HP c7000, and in it we have two BL480c blades with dual quad-core CPUs, 48 GB of RAM, 4 NICs and four 146 GB hard drives. We also have an HP MSA 2012i iSCSI SAN with twelve 450 GB 10k SAS drives in it, and we had two GbE2c switches in the c7000 for the interconnects.
One of the customer’s goals was complete failover for the networking, as we will be running virtual servers on the 2008 host machines, the BL480c’s. So with help from Jason, one of our network guys and a fellow blogger on this site, we got the failover working. I have to admit I am not the greatest with networking, so his help was much appreciated on this project. He has been in the process of moving this site to new core network switching using what I will call the HP blade switches, but I think they are ProCurve 5400zl switches.
How we ended up doing it was pretty simple: we took the GbE2c switches out and put HP pass-through modules in the c7000. When we downed one of the GbE2c switches the failover would not work, and Jason thought the switches were not responding fast enough. Instead we used HP teaming on 2 of the NICs for the Hyper-V networking, and we teamed the other 2 NICs for the SAN networking that handles the iSCSI traffic to the MSA. We spread the MSA connections and the team connections across multiple switches on the HP switch blade. We had 2 virtual servers running on each blade server, and when I downed one MSA array controller the servers kept working, as the other controller picked up for the failed one on the fly. Then Jason pulled a switch and everything just kept working because of the NIC teaming. I know this is not very detailed, but it was all pretty simple and not that complicated once you get to the end result. And it is working.
Til later just Roger
I had a customer call with a server that they rebooted and it never came back. Ever been there before? I know we all have, many times, and lately it has been because of older hardware and the motherboard just giving up the ghost. Well, I got on-site and yep, it was dead, and now what to do? The servers we are using here now are SAS-drive HP servers, and this one had older SCSI drives. I looked around and found an ML350 that we had replaced but had not done anything with, and I got that server to boot with one of the drives out of the failed server running as a spare.
I then used my favorite program, Shadow Protect IT version, and backed up the disk that had the C drive and the E drive on it. It had been a mirrored set, so the disk was running in dynamic disk mode. Oh goody, I am so excited. So I looked around for something to restore onto, and the pickings were slim, but we are running Hyper-V at this site, so I restored the server into a virtual server. Once this completed I thought OK, here we go, rebooted the server, and nothing. Just a blank screen with a cursor on the left-hand side of the screen. OK, now what the heck. I scratched my head, went and got a Pepsi, and checked the Shadow Protect website. Sure enough, there was the fix I needed. What I had to do was boot from a 2003 Server CD, hit R to go into the Recovery Console, and once logged in, type fixboot C: and then fixmbr. I rebooted and away we went. Pepsi works every time. OK, most every time, but it sounds good with this one.
Til later just Roger
Friday night another one of our techs and I moved a server running SBS 2003 to new hardware. It used to be we would do some kind of swing method for this, but on this one I thought why not try the Shadow Protect method that I have been using for physical-to-virtual migrations.
This customer also had a run of bad luck with their servers, from lightning strikes to other misfortunes, and the current pain was space on the server. They are accountants, so they keep everything, and they have started to scan images. When I first proposed this to them they were very doubtful, and I know one of the partners commented that “it can’t be that easy.” Well, yes it was, and yes it is. I also had a time window of 6 PM Friday evening to 6 AM Saturday morning, because it is tax season for them. Sure, we had one hiccup, but nothing that was life-altering or affected the server any.
I would suggest checking out the product and seeing where it fits in your bag of tricks; I am liking this one more and more every day. Here is basically what we did.
I have a trial copy of Shadow Protect IT version that I used for this. Before booting, I plugged in my 500 GB USB drive that I had borrowed from my wife’s business for this purpose. I booted the old server to the disk and selected the 2003 option, and when it came up and asked about starting networking I just canceled. Once booted, I selected backup, specified the USB drive as the destination, and picked the 3 drives on the server we were running out of room on. The backup took probably an hour to an hour and a half to complete.
I then downloaded the RAID driver for the HP ML350 G5 we were rolling in and extracted the files to a folder on the disk. I then booted the new server, defined the RAID level, and booted to the Shadow Protect disk using the same options and answers as for the old server backup. I then set up my partitions for the 3 drive sets and started the restore. Because we were going to new hardware I had to select the HIR restore and specify the folder that has the RAID drivers in it. The actual restore of the C drive took about 15 minutes, but I noticed it did not inject the RAID drivers. I have been down this path before, so I did the C drive again and selected the HIR option again, but this time instead of low matches I selected Excellent, meaning it was going to prompt me for the drivers for the 46 different parts it doesn’t recognize. So I skipped my way to number 43, which is the RAID driver, it loaded the driver, and we went on to restoring the rest of the server.
This all completed in about 4 hours or so, and when the server booted from the restore I had to go in and fix the drive letters on the E and S drives, but that was it. I always boot into Safe Mode with Networking anyway, load the HP drivers for all devices, and make sure the drive letters are what they need to be.
We got that all done and had the server up and running, then called the customer at midnight and had them check the server from home before we left. They connected with RWW and it was that easy; things just worked. I called and checked on the customer Saturday morning and things were humming along like they should be, and I think they still couldn’t believe it was that easy. I know you’re thinking, why not just migrate them to SBS 2008? A lot of the reason we did not is that the programs they have running on the server right now do not support 2008 or 64-bit, or I should say their LOB app is not approved to run on it. But the customer was happy, and that was the bottom line of the whole adventure.
Til later just Roger
OK, I thought why not today, and updated my Vista laptop to IE 8. The install went pretty well, but the big thing to me was what it would do to my connections to the SBS 2003 and SBS 2008 RWW pages. Not much. I got the warning about the TS ActiveX control and went and checked add-ons, and it was enabled there. So I added the sites to the trusted sites in IE security and away we went.
Til later just Roger
Well, we had a DC die on us Friday in our office. It was also the FSMO master of the domain, plus it had the Enterprise CA on it. The DC was brought up virtually on our BDR device, but in the process the DC went into USN rollback, meaning the version of AD on that DC was different from, or older than, the version on the rest of the DCs in the domain. Things seemed to be working, but they were not right. Not good. I got those emails from our network admin saying something was amiss in the domain and asking if I could look at it.
So as I dug through the event logs on the DC in question and on the other DCs, I kept finding that the servers were not replicating. I also found that when I ran repadmin /showrepl the output showed IS_GC DISABLE_INBOUND_REPL DISABLE_OUTBOUND_REPL, and dcdiag reported the same thing along with the commands to run to try to correct the problem, which I ran. But it was not long before the error was showing up again. I also found the Netlogon service in a paused state, which is another sign of USN rollback problems. Basically all the DCs had gone into a mode of not allowing replication to happen because of the old data from the bad DC.
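If you save the repadmin output to a text file, a few lines of Python can pick out the flags and print the commands that clear them. This is only a rough sketch: the sample text, the server name DC01 and the function names are made up for illustration. The repadmin /options syntax with a minus sign in front of the flag is the usual way to clear an option, but as we found, clearing the flags does not cure a USN rollback and they come right back.

```python
import re

# Replication-blocking options that show up when a DC is quarantined.
BLOCKING_FLAGS = ("DISABLE_INBOUND_REPL", "DISABLE_OUTBOUND_REPL")


def find_blocking_flags(repadmin_output):
    """Return the blocking flags present in the text, in a fixed order."""
    words = set(re.findall(r"[A-Z_]+", repadmin_output))
    return [flag for flag in BLOCKING_FLAGS if flag in words]


def clear_commands(dc_name, flags):
    """Build the repadmin commands that would clear each flag on dc_name."""
    return [f"repadmin /options {dc_name} -{flag}" for flag in flags]


# Hypothetical sample of the options line from the repadmin output.
sample = "DSA Options: IS_GC DISABLE_INBOUND_REPL DISABLE_OUTBOUND_REPL"

# DC01 is a placeholder server name, not the real DC from this post.
for command in clear_commands("DC01", find_blocking_flags(sample)):
    print(command)
```

The script only builds and prints the commands, so you can read them over before running anything against a real DC.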
I searched and finally came to the conclusion that we were going to have to demote the server and bring it back into the domain as a member server. I called Randy and gave him the bad news, just what he wanted to hear on a Saturday evening at 10, and we came up with a plan. We would use ntdsutil to seize the roles and then clean the bad DC out of AD. I went and changed the main DNS IP on a lot of the servers in the network to the soon-to-be new FSMO master and then verified DNS was still working.
Randy had restored the server onto a physical machine and had it shut down. We paused the virtual DC with the issues, then brought up the restored server on the physical machine, plugged into a switch by itself, and ran dcpromo /forceremoval to clean AD off the server. While this was being done I seized the roles onto the new FSMO master server, cleaned the bad DC out of AD Sites and Services, and cleaned any remnants of the bad DC out of DNS. This got AD straightened out and replicating to all sites like it should be, and DNS functioning the way it should.
When restoring a DC, either into a virtual environment or onto a physical machine, there are some steps you need to do before you bring it back online in the domain. This holds true for a virtual server or a physical server, and as we move more towards virtual servers it is something that really needs to be watched or you will run into this. Here they are:
Procedure for using the recovery option:
- “Restore” the image
- !!! Boot into DSRM !!! (not connected to the network)
- Note the value of “DSA Previous Restore Count”
(HKLM\System\CurrentControlSet\Services\NTDS\Parameters) (Not visible? –> Assume value of 0)
- Add the entry “Database restored from backup” (DWORD) with a value of 1
(HKLM\System\CurrentControlSet\Services\NTDS\Parameters) (This triggers the actions needed for AD right after a system state restore!)
- Stop the “File Replication Service (NTFRS)” and assign the value “D4” (for auth. or primary restore) or “D2” (for a non-auth. restore) to the entry “BurFlags” in (HKLM\System\CurrentControlSet\Services\NtFrs\Parameters\Backup/Restore\Process at Startup)
(This triggers the actions needed for the SYSVOL right after a system state restore!) (and other replicated DFS namespaces!)
(also see: Using the BurFlags registry key to reinitialize File Replication Service replica sets – http://support.microsoft.com/?id=290762)
- Boot into normal DC mode (not connected to the network)
- Check the value of “DSA Previous Restore Count”
(HKLM\System\CurrentControlSet\Services\NTDS\Parameters) (New value = old value + 1)
- In the DS event log check for event ID 1109
- In the FRS event log check for event ID 13565 & 13520 if a non-auth. restore was performed for the SYSVOL
- In the FRS event log check for event ID 13566 if an auth. restore was performed for the SYSVOL
- Connect to the network again
- Check the health of the DC (AD & SYSVOL)
- DCDIAG /D /C /V
- NETDIAG /DEBUG /V
- GPOTOOL.EXE /CHECKACL /VERBOSE
- REPADMIN.EXE /SHOWUTDVEC <FQDN DC> <NC>
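The two judgment calls in that checklist, which BurFlags value to set and whether the restore count moved, can be written down as a couple of small Python helpers. Treat this as a sketch of the logic only: the function names are my own, it does not touch the registry, and you would still read and write the real values by hand in regedit while in DSRM.

```python
# BurFlags values from the checklist (hex D4 and D2, stored as a DWORD).
BURFLAGS = {
    "authoritative": 0xD4,      # auth. or primary restore of SYSVOL
    "non-authoritative": 0xD2,  # non-auth. restore of SYSVOL
}


def burflags_for(restore_type):
    """Return the BurFlags DWORD for the given restore type."""
    return BURFLAGS[restore_type]


def restore_was_detected(count_before, count_after):
    """True when 'DSA Previous Restore Count' went up by exactly one.

    Pass None for count_before when the value was not visible in the
    registry; the checklist says to assume 0 in that case.
    """
    before = 0 if count_before is None else count_before
    return count_after == before + 1


# Example: value was not visible before the restore and reads 1 afterwards.
print(hex(burflags_for("non-authoritative")))
print(restore_was_detected(None, 1))
```

If the count did not go up by one after the normal boot, AD did not treat the DC as restored, and it should stay off the network.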
More on the Enterprise CA next and what we had to do to bring that back.
Til later just Roger
You hear the words Network Admin and sometimes that brings visions of being locked down and limited in what you can do on the PC you work on or the places you can go on the web. There is a reason things are locked down or you are limited in where you can go. But there is also a side of the network admin you never see: the side where things go wrong on the servers, and the work they have to do to keep things running while you are home enjoying your weekend or your evening with the family.
So when you come to work and log in and click on your LOB app or whatever it is you work with, it just works. I don’t know how many times I have been on site helping a client and asked them what the problem is. Usually you get “I click here and then I click here and it works.” Plain and simple, but in reality it is not so simple when troubleshooting a problem.
Moral of the story: you might think the network admin is the guy who limits what you can do or where you can go on the web, but he is also the guy who, because of his position, gives up evenings and weekends at times to make sure that when you come to work things just work. So next time you think of the network admin, think of what they do and what they have to give up at times. I work with one such network admin who does a lot of this kind of work that maybe goes unnoticed, and when he asks for help I am more than willing to drop what I have going on and help out. Problem is I enjoy my job way too much, and new problems are what keep work interesting and always a learning process.
Til later just Roger
Well, as stated in the previous post, we had a 2003 DC die on us that was the FSMO master and also the Enterprise CA for the domain. While we had the virtual DC running I did a backup of the CA to a folder on another server. You also need to back up the registry key, but more on that later. We got the bad DC demoted, and once I had the DCs back to talking we brought the physical server that had held the FSMO roles, now demoted to a member server, back up and rejoined it to the domain. Once we were back to the desktop we installed the Enterprise CA and then restored it using the CA backup I had run earlier.
When we tried starting the CA it failed with the error “Certificate Services did not start: Could not load or verify the current CA Certificate. MyDomain Root CA Bad Key”. OK, now what? Well, I dug some more and found we should also have exported the “HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\CertSvc\Configuration\mydomain Root CA” registry key. We went back to the virtual DC, made sure it was not connected to the domain, exported this key, and moved it over to the server the CA was now on. We imported the reg key, the service then started, and away we went. Life was good and it was happy dance time.
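For next time, the export and import of that key can be scripted so it does not get forgotten. Here is a minimal Python sketch that just builds the reg.exe command lines. The CA name and the file path are placeholders, the helper names are my own, and I have left out any overwrite switch since I am not sure every version of reg.exe accepts one.

```python
import subprocess

# Parent key that holds the CA configuration, as named in the post.
CONFIG_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\CertSvc\Configuration"


def export_command(ca_name, out_file):
    """Build the reg.exe argument list that exports the CA's registry key."""
    return ["reg", "export", CONFIG_KEY + "\\" + ca_name, out_file]


def import_command(reg_file):
    """Build the reg.exe argument list that imports a saved .reg file."""
    return ["reg", "import", reg_file]


# Placeholder CA name and path, for illustration only.
export_args = export_command("mydomain Root CA", r"C:\ca-backup\ca-config.reg")
import_args = import_command(r"C:\ca-backup\ca-config.reg")

# Show the command lines; on the server itself you could hand each list
# to subprocess.run(args, check=True) instead of printing it.
print(subprocess.list2cmdline(export_args))
print(subprocess.list2cmdline(import_args))
```

Run the export on the old CA at the same time as the CA backup, and the import on the new server after the CA role is installed and before starting the service.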
Til later just Roger