A small manufacturing firm specializing in small jet engines and parts (I’d love to have one of their 450 hp turbines in my car!) called to say that their network was “going up and down.” The owner was frantic and believed he had been hacked. The problem seemed to occur in the same time window every afternoon. When another engineer and I went out the next morning (thinking we would scan and clean any malware before the attackers accessed the system), the network was fine; all of the servers and PCs were up and responding. Malware scans found nothing: no viruses, trojans, rootkits, or spambots. I told the owner that I believed he was clean and his network secure.
He didn’t believe me. He made me stay until the problem surfaced.
Sure enough, later that day, the gremlins appeared. Every XP machine would get either a “Network Cable Unplugged” or a “This connection has limited connectivity” message. Same thing on the servers. A minute or so later, they’d re-establish the connection and be fine for a few minutes, only to repeat the same sequence over and over again. We watched this for an hour or so.
We figured it had to be a problem with the 3Com switch, so we put in a known-good spare and left it. Didn’t work. Same thing kept happening. It didn’t make sense that anything else could be responsible, except maybe the new manufacturing machines that had recently been installed in the shop. Power surges from that equipment could have been causing problems, so we checked the line monitors. They showed no obvious problems. We were off to the races.
I went into the system event logs on the servers and found hundreds of warning and information entries that read “link down”/“link up,” many of them in the overnight hours. This being an industrial area, I began to consider dirty power and brownouts on the power grid as the source of the problem.
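Scrolling through hundreds of entries by eye is slow, so here is a minimal sketch of how those link events could be tallied by hour of day to show when the drops cluster. It assumes the log has been exported to plain text with one entry per line in a hypothetical `YYYY-MM-DD HH:MM:SS message` format; that layout and the function name `link_down_by_hour` are illustrative assumptions, not the actual Windows event log format.

```python
from collections import Counter
from datetime import datetime


def link_down_by_hour(lines):
    """Count "link down" entries per hour of day (0-23).

    Assumes each line starts with a "YYYY-MM-DD HH:MM:SS" timestamp
    (a hypothetical export format). Lines that do not match are skipped.
    """
    counts = Counter()
    for line in lines:
        if "link down" not in line.lower():
            continue
        # The first two whitespace-separated fields are the date and time.
        stamp = " ".join(line.split()[:2])
        try:
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # not in the assumed format
        counts[when.hour] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample entries, for illustration only.
    sample = [
        "2009-03-14 02:17:45 Warning: link down",
        "2009-03-14 02:18:50 Information: link up",
        "2009-03-14 02:41:03 Warning: link down",
        "2009-03-14 15:05:12 Warning: link down",
    ]
    for hour, count in sorted(link_down_by_hour(sample).items()):
        print(f"{hour:02d}:00  {count} link-down event(s)")
```

A spread of drops across overnight hours, when nobody is at a keyboard, points away from user activity and toward something environmental such as power.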
But they had a battery backup unit in place, which should have handled brownouts and filtered any noise on the AC line. On a hunch, I went up and pulled the plug on the UPS just to make sure it was doing its job.
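The network went down. A healthy UPS would have carried the load on battery; this one dropped everything the moment it lost wall power. It turned out to be a faulty UPS that wasn’t reporting itself as faulty.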
Problem solved. Owner relieved. Network is still secure.