Posted by: Troy Tate
administration, Cisco, DataCenter, design, diagnostics, IP telephony, PSTN, risk, unified communications, VoIP, vulnerability
As you may have seen in some of my previous posts, the company I work for has implemented VoIP/IP telephony at some of our locations.
Recently we had a phone system outage at the largest of these sites, which runs a clustered Cisco CallManager solution. The outage lasted 4+ hours. We were surprised both that the two members of the cluster failed at the same time and by how long it took to recover. Since then we have been working with our support vendor to find a better way of keeping the phone system up at this site. I am also making sure my other sites are prepared in the event of a similar outage.
The solution for providing a backup to the CallManager cluster is called Survivable Remote Site Telephony (SRST). Think of this as CallManager lite. A limited number of the phones still have connectivity and can make/receive calls. I say “limited” because the SRST function is dependent on the PSTN gateway hardware: a larger gateway can support more users. The gateway we had at the time was a Cisco 2821 series router, which would support 96 users. A Cisco 3825 will support 175 users.
One thing to be aware of, as I understand it, is that you cannot necessarily specify which phones will get serviced by SRST. The phones are serviced on a first-come-first-served basis. This could be an issue if an outage occurs and critical phones arrive after the gateway is already full. Unneeded phones would then have to be disconnected from the network to free up capacity for the critical ones.
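To make the first-come-first-served concern concrete, here is a minimal Python sketch. It is only a toy model of the behavior as I understand it, not how SRST is actually implemented or configured. The function name, the phone names, the count of 150 phones and the choice of "critical" phones are all made up for illustration; the capacity figures are the ones quoted above.

```python
# Toy model (assumption): phones register with the SRST gateway in arrival
# order until the gateway's user limit is reached. Everything here is
# hypothetical except the capacity numbers quoted in this post.

# User limits as quoted in this post
GATEWAY_CAPACITY = {"Cisco 2821": 96, "Cisco 3825": 175}


def srst_registrations(phones_in_arrival_order, capacity):
    """Split phones into those that get service and those left without it."""
    registered = phones_in_arrival_order[:capacity]
    unserved = phones_in_arrival_order[capacity:]
    return registered, unserved


# 150 hypothetical phones, listed in the order they reach the gateway
phones = [f"phone-{n:03d}" for n in range(1, 151)]

# Hypothetical phones that must keep working during an outage
critical = {"phone-120", "phone-149"}

registered, unserved = srst_registrations(phones, GATEWAY_CAPACITY["Cisco 2821"])
stranded_critical = critical.intersection(unserved)

print(f"Registered: {len(registered)}, unserved: {len(unserved)}")
print(f"Critical phones without service: {sorted(stranded_critical)}")
```

In this made-up scenario the smaller gateway fills up before either critical phone arrives, which is exactly the situation where unneeded phones would have to be unplugged. Running the same list against the larger gateway's limit leaves no phone unserved.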
Hopefully this will be the last 4+ hour outage for the phone system at this site, and none will happen at my other sites. The Cisco solution has been very good for my organization and, with the exception reported here, has been very reliable so far.
Thanks for continuing to read my blog and hope you have a great day on the technology frontier wherever that may be for you!