Servers preventing NEW connections from remote clients

Active Directory
Microsoft Windows
Hey all, I have an interesting problem and hope that someone can help resolve it. At a minimum, I have to reboot one or more of my servers at the main site because all of my remote users, located across a direct T1/VPN tunnel, can no longer get connections to the server, with the exception of being able to ping it. They cannot establish RDP, RPC, NetBIOS, etc. So far, the only way we have been able to restore connectivity is to reboot the server, and then everything starts working once again. It has been this way ever since we established the remote office. It's also not all servers at once. For example, sometimes it's the Exchange server (2003), sometimes it's the file server or one of the DCs.

Here are the network specs: All servers are Windows 2003 with Windows XP Pro clients. The domain is currently running in Windows 2003 native mode. We are using Exchange 2003 with Office 2003 on the clients. The entire domain was established from scratch as Windows 2003 Servers (no silly upgrades). The only thing that was migrated was the data. All of the switches/routers are also new. We use Cisco 4500 switches for the core and 2950/3750Gs for the access switches. The link is run over Cisco 1841 routers. All of them are running the most up-to-date IOS available. Links are running WFQ. The link traffic fluctuates greatly, from 5% utilization up to around 90% when someone is transferring a large file or downloading something from the Internet (all Internet access is via the main site/proxy/firewall infrastructure). However, the link utilization doesn't appear to affect the problem.

While they are unable to access one of the servers, they can still get to others on this side of the link. A look through the logs on both the clients and servers shows nothing abnormal. Sniffing the network shows somewhat normal traffic; however, when the problem is occurring, the TCP (RPC) continued sessions seem more numerous than normal.

I have not yet executed an RPC ping from the clients to the server that users cannot connect to, but I am planning on it the next time it happens. The interesting part is that the server the user cannot get to reports (output from netstat) that it has established sessions from the other end of the link. I think I'm going to have to do a lot more sniffing to figure out this problem, but am hoping that someone has seen it before. I've been searching up and down on Google and MS to see if someone has had this before. However, I've come up with nothing except for an MS article discussing almost the exact opposite fix action (reboot the client, not the server). Thanks for your help, Don
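
The symptom described above (ping answers, but RDP, RPC and NetBIOS connections fail) can be confirmed from a remote client with a quick TCP probe. The following is a minimal sketch, not a tool mentioned in the thread: the helper names are hypothetical, and the port list assumes the usual default ports (135 for the RPC endpoint mapper, 139 for NetBIOS session, 445 for SMB, 3389 for RDP).

```python
import socket

# Default ports for the services named in the post (assumed defaults,
# not values taken from the thread).
DEFAULT_PORTS = {
    "RPC endpoint mapper": 135,
    "NetBIOS session": 139,
    "SMB": 445,
    "RDP": 3389,
}


def probe_port(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def probe_services(host, ports=DEFAULT_PORTS, timeout=3.0):
    """Probe every named port and return {service name: reachable?}."""
    return {name: probe_port(host, port, timeout) for name, port in ports.items()}


def summarize(results):
    """Turn probe results into a one-line summary."""
    failed = sorted(name for name, ok in results.items() if not ok)
    if not failed:
        return "all services reachable"
    if len(failed) == len(results):
        return "no TCP service reachable"
    return "unreachable: " + ", ".join(failed)
```

Running `summarize(probe_services("fileserver01"))` from a remote client (where `fileserver01` is a placeholder for the affected server) during both a working and a failing period would show whether every TCP service fails at once or only some of them.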

Answer Wiki

Well, while you are waiting for one to stop working, use your sniffer to baseline some of those sessions: capture an RDP session and so on, so that when you have the problem you can capture another set and compare. Do you have a DC at the remote end? Is the server set up with the branch info in Sites and Services? Make sure all the information for your servers is in AD, because your XP boxes go there first for everything. I love sniffers; they're a great tool. If you lose connection to a server, but only from the remote side, find out whether the traffic is getting to the server, and then make sure the server knows how to send it back. You could have a routing entry that is being lost; a reboot rebuilds that table, which would explain why rebooting fixes it. Maybe even enter persistent routes on your servers for your remote office. If you are a small operation, that is very smart and easy to manage.
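
The persistent-route suggestion above can be sketched as follows. On Windows, `route -p add <destination> mask <netmask> <gateway>` stores a route that survives reboots. The helper below only builds that command line from a CIDR prefix so it can be reviewed before being run on a server; the function name is hypothetical, and the subnet and gateway used in the example are made-up values, not addresses from the thread.

```python
import ipaddress


def persistent_route_command(remote_subnet, gateway):
    """Build a Windows 'route -p add' command for a remote subnet.

    remote_subnet is a CIDR string such as "10.20.0.0/24" (hypothetical value);
    gateway is the next-hop address on the server's local subnet.
    """
    # strict=True rejects prefixes with host bits set, e.g. "10.20.0.5/24".
    network = ipaddress.ip_network(remote_subnet, strict=True)
    next_hop = ipaddress.ip_address(gateway)
    if network.version != 4 or next_hop.version != 4:
        raise ValueError("this sketch only handles IPv4")
    return "route -p add {} mask {} {}".format(
        network.network_address, network.netmask, next_hop
    )
```

For example, `persistent_route_command("10.20.0.0/24", "10.10.0.1")` yields `route -p add 10.20.0.0 mask 255.255.255.0 10.10.0.1`. After running the command on the server, `route print` should list the entry under its persistent routes.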

Discuss This Question: 5  Replies

  • CheckSix
    The issue is a known one with SP1 on the Servers and the Cisco routers. Hotfix is available. See this link: Regards, CheckSix
  • Sonyfreek
    Boardinhawk: Thanks for the information. I have been baselining the systems during normal working as well as non-working periods. I've also used jpcap to map the connections out graphically. Unfortunately, I haven't seen anything that sticks out significantly except for the RDP sessions. I have a DC at both sites, and each is a GC in the domain. I also have it set up in Sites and Services with the appropriate subnets assigned, and set the DCs as IP bridgeheads (I think that's the terminology that MS uses). All of the routing is static on the network (since there's a small number of layer 3 switches and routers). We definitely have IP connectivity, as pinging works when nothing else does. The remote office has the default routing entries pointing at the remote end of the tunnel. I haven't tried entering the routes locally on the remote server, though.
    CheckSix: The article appears to hit the nail on the head. Obviously with the tunnel, the MTU sizes differ across the VPN. At one point, I actually disabled the VPN totally to see if it helped, but we were still getting the problems. Still, I think this might be the solution and I can't wait to try it out next week. Thanks a ton for the information! I'll let you know if it fixes it. My only problem with it is that we stay on top of patches, so we may already have MS05-019 and MS06-007 on the server. I'm still praying that it works, or possibly that reapplying them fixes the problem. Again, thanks guys, Don
  • Boardinhank
    So if you are in your remote office, at some point during the day you can no longer do what? Make a remote desktop connection to the main office? Send email? What exactly can't you do when you notice the problem that makes you reboot the remote server? If you have a domain controller on each side, your requests for access to anything should be approved by your local server and then passed to the remote side. Make sure both DCs have static routes, and check that they are not replicating during the day and causing problems. Rebooting the remote server may cause the servers to talk again and fix the problem. So if your client can talk to the remote side, make sure the remote servers are talking as well when you experience the problem.
  • Astronomer
    I didn't see whether you have more than one domain, but you said the domain controller on each site is a global catalog server. If you have more than one domain, then make sure you have a global catalog server separate from the infrastructure master. Here is what Microsoft says:
    "Infrastructure FSMO Role: When an object in one domain is referenced by another object in another domain, it represents the reference by the GUID, the SID (for references to security principals), and the DN of the object being referenced. The infrastructure FSMO role holder is the DC responsible for updating an object's SID and distinguished name in a cross-domain object reference. NOTE: The Infrastructure Master (IM) role should be held by a domain controller that is not a Global Catalog server (GC). If the Infrastructure Master runs on a Global Catalog server it will stop updating object information because it does not contain any references to objects that it does not hold. This is because a Global Catalog server holds a partial replica of every object in the forest. As a result, cross-domain object references in that domain will not be updated and a warning to that effect will be logged on that DC's event log. If all the domain controllers in a domain also host the global catalog, all the domain controllers have the current data, and it is not important which domain controller holds the infrastructure master role."
    This may not apply, but if it does, I would check it out. rt
  • Swiftd
    Boardinhawk: What you cannot access depends on the server. Sometimes it's the Intranet server, and you cannot connect to the web site. Sometimes it's the Exchange Server, and you cannot get your emails. Other times it's the file server, and you cannot access the shares on the server. Every time, whichever server it is, you are unable to use RDP to connect from either a client or a server to the Terminal server of the affected server.
    We have DCs with a GC on each end of the connection. They are configured using Sites and Services as the main office (2 DCs) and remote office (1 DC). There is an IP bridgehead server configured for each of the sites. The servers are set to replicate twice every hour (default). Since the T1 was not overburdened, I decided to keep it at the default unless I started to see that it was overwhelming the link. We have rather strict account lockout policies (3 attempts, lockout forever) that won't be replicated in case someone gets locked out. I know we can use forced replication to solve that also. One would think that the local DC would handle the server requests, but I believe the problem to be at the RPC level and not in the user authentication.
    The servers do not have static IP routes. I could add them, but let me make sure that we are talking about the same thing: Are you referring to putting a host entry in the hosts table, using something like route add, or something else? I've just installed MS06-007 on all of the affected servers today and will wait to see if they fail again this week. I had the previous update installed on all of the servers, yet was still getting the problems last week. Thanks, Don
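
Sonyfreek's remark above that MTU sizes differ across the VPN suggests one way a server could keep answering pings while larger transfers stall: the default Windows ping carries only 32 bytes of data, whereas a TCP session tries to send full-size segments. The arithmetic can be sketched as below. This illustrates general path-MTU reasoning only, not the content of the hotfix article, and it is not a confirmed diagnosis for this thread; the 1400-byte tunnel MTU and the `fits` callback are assumptions made for the example.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header
TCP_HEADER = 20   # bytes, TCP header without options


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def tcp_mss(mtu):
    """Largest TCP segment payload that fits in one unfragmented packet."""
    return mtu - IP_HEADER - TCP_HEADER


def find_path_mtu(fits, low=576, high=1500):
    """Binary-search the largest packet size for which fits(size) is True.

    fits is a hypothetical callback that reports whether a don't-fragment
    packet of the given total size gets through the path.
    """
    if not fits(low):
        raise ValueError("even the smallest probe size does not get through")
    while low < high:
        mid = (low + high + 1) // 2
        if fits(mid):
            low = mid       # mid gets through, so the answer is mid or larger
        else:
            high = mid - 1  # mid is too big
    return low


# Example with an assumed 1400-byte tunnel MTU.
tunnel_mtu = find_path_mtu(lambda size: size <= 1400)
print(tunnel_mtu, max_ping_payload(tunnel_mtu), tcp_mss(tunnel_mtu))
```

On a Windows client, `ping -f -l 1472 <server>` sends a don't-fragment echo that exactly fills a 1500-byte MTU. Lowering the size until replies come back gives the largest payload that crosses the link, and adding 28 bytes gives the path MTU.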
