I have an interesting problem and hope that someone can help resolve it. From time to time I have to reboot one or more of my servers at the main site because none of my remote users, who are located across a direct T1/VPN tunnel, can connect to the server any longer, although they can still ping it. They cannot establish RDP, RPC, NetBIOS, or other sessions. So far, the only way we have been able to restore connectivity is to reboot the server, after which everything starts working again. It has been this way ever since we established the remote office. It is also not all servers at once. For example, sometimes it's the Exchange server (2003), sometimes it's the file server or one of the DCs.
Here are the network specs:
All servers are Windows 2003 with Windows XP Pro clients. The domain is currently running in Windows 2003 native mode. We are using Exchange 2003 with Office 2003 on the clients. The entire domain was established from scratch on Windows 2003 servers (no silly upgrades). The only thing that was migrated was the data. All of the switches/routers are also new. We use Cisco 4500 switches for the core and 2950/3750Gs for the access switches. The link runs over Cisco 1841 routers. All of them are running the most up-to-date IOS available. The links are running WFQ.
The link traffic fluctuates greatly, from 5% utilization up to around 90% when someone is transferring a large file or downloading something from the Internet (all Internet access is via the main site/proxy/firewall infrastructure). However, link utilization doesn't appear to have any bearing on the problem. While the remote users are unable to access one of the servers, they can still get to the others on this side of the link...
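To pin down exactly which services stop answering while ping still works, a quick TCP probe run from a remote client might help. This is only a rough sketch: the port numbers are the standard ones for the RPC endpoint mapper, NetBIOS session service, SMB and RDP, and the host names in the usage comment are made up.

```python
import socket

# Standard TCP ports for the services that stop responding.
PORTS = {
    "RPC endpoint mapper": 135,
    "NetBIOS session": 139,
    "SMB": 445,
    "RDP": 3389,
}

def probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False

def probe_all(host, ports=PORTS, timeout=3.0):
    """Probe every port in the mapping and return {service name: reachable}."""
    return {name: probe(host, port, timeout) for name, port in ports.items()}

# Example usage (host names are hypothetical):
# for server in ("fileserver01", "exchange01"):
#     print(server, probe_all(server))
```

Running it against both a working server and the affected one at the same moment should show whether the TCP handshake itself fails or whether the failure happens higher up the stack.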
A look through the logs on both the clients and the servers shows nothing abnormal. Sniffing the network shows fairly normal traffic. However, while the problem is occurring, the number of continued TCP (RPC) sessions seems higher than normal. I have not yet run an RPC ping from the clients to the server that users cannot connect to, but I am planning to the next time it happens. The interesting part is that the server the users cannot reach reports (in its netstat output) that it has established sessions from the other end of the link.
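To compare the netstat output between a good day and a bad day, something like the following could tally the established sessions coming from the remote office by local port. It is a minimal sketch that assumes the usual four-column TCP lines from `netstat -an` on Windows; the addresses and the 10.2.0.0/16 remote subnet are invented for illustration.

```python
import ipaddress
from collections import Counter

def established_from(netstat_text, remote_net):
    """Count ESTABLISHED TCP sessions from remote_net, keyed by local port."""
    net = ipaddress.ip_network(remote_net)
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expect: Proto, Local Address, Foreign Address, State
        if len(fields) < 4 or fields[0] != "TCP" or fields[3] != "ESTABLISHED":
            continue
        try:
            local_port = int(fields[1].rsplit(":", 1)[-1])
            remote_ip = ipaddress.ip_address(fields[2].rsplit(":", 1)[0])
        except ValueError:
            continue  # skip lines that do not parse as address:port
        if remote_ip in net:
            counts[local_port] += 1
    return counts

# Hypothetical sample of `netstat -an` output saved from the server.
SAMPLE = """
  Proto  Local Address          Foreign Address        State
  TCP    10.1.1.10:135          10.2.1.25:1045         ESTABLISHED
  TCP    10.1.1.10:445          10.2.1.25:1046         ESTABLISHED
  TCP    10.1.1.10:445          10.2.1.31:2210         ESTABLISHED
  TCP    10.1.1.10:445          10.1.1.77:3301         ESTABLISHED
  TCP    10.1.1.10:3389         0.0.0.0:0              LISTENING
"""

print(established_from(SAMPLE, "10.2.0.0/16"))
```

If the counts for the remote subnet keep growing on the affected server while clients report failures, that would point toward sessions that are never being torn down on the server side.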
I think I'm going to have to do a lot more sniffing to figure this out, but I am hoping that someone has seen this problem before. I've been searching up and down on Google and the MS site to see whether anyone else has run into it. However, I've come up with nothing except an MS article discussing almost exactly the opposite fix (reboot the client, not the server).
Thanks for your help,