WAN Leased line failure study?

Does anybody know of a study that lists failure probabilities for leased lines? I just had one of my offices in San Francisco dropped off line for a week because the two T-1's going into the office failed when the manhole they run through flooded. I need to come up with fault tolerance plans for my wide area network, and as part of that I need modes of failure, and the probabilities of failure for the modes. But I can't find any studies, just personal opinions and experiences.

Answer Wiki


I can’t offer any studies either, but I can tell you what organizations that are dead serious about redundant connectivity do.

1) Get multiple connections using different carriers.
2) Look into having each connection use a different technology (DSL vs. cable vs. T-1 vs. satellite vs. wireless, copper vs. fiber, etc.).
3) Make sure that the circuits in question come into your facilities via different physical routes. In some cases, this has meant trenching to go under a different side of the property or installing poles to go over a different side.

The key point here is diversity in as many dimensions as you can get (or afford). At my home for example (because I have a home-based business), I have both DSL and Cable. While each has its own drawbacks and advantages, since the vendors, technology and media are different, the chances that both will fail simultaneously or for the same reasons are kept to a bare minimum. Similarly, even the firewalls are different (SonicWall vs. Cisco PIX).
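To make the diversity argument concrete, here is a minimal Python sketch of the arithmetic. It assumes each link has its own independent chance of failing, and optionally that a shared event (such as a flooded manhole that both circuits run through) takes both links down at once. The function names and all probability values are hypothetical, chosen only for illustration; they are not measured failure rates.

```python
def p_both_fail_independent(p_a, p_b):
    """Probability that two fully independent links are both down."""
    return p_a * p_b


def p_both_fail_shared(p_a, p_b, p_shared):
    """Probability that both links are down when a common event
    (probability p_shared) kills both, and otherwise each link
    fails independently with probability p_a and p_b."""
    return p_shared + (1 - p_shared) * p_a * p_b


if __name__ == "__main__":
    # Hypothetical per-year outage probabilities, for illustration only.
    p_link_a = 0.02
    p_link_b = 0.03
    p_manhole = 0.01  # assumed chance of a shared-path event

    diverse = p_both_fail_independent(p_link_a, p_link_b)
    shared_path = p_both_fail_shared(p_link_a, p_link_b, p_manhole)
    print(f"diverse routes:  {diverse:.6f}")
    print(f"shared manhole:  {shared_path:.6f}")
```

With numbers like these, the shared-path term dominates the result, which is why route diversity tends to matter more than the quality of either individual circuit.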

Personal experience (yes, I know that’s not what you were hoping for) also suggests that some percentage of installers are always sloppy, less than diligent, or worse, incompetent. So perform (or hire someone to perform) a complete physical audit of your networks, looking for ANYTHING that is not done as correctly as possible. This includes loose terminations, equipment not on a UPS (or at least good surge suppressors), equipment not properly mounted, cabling not properly dressed (Eek! A naked cable), and cabling or equipment not properly labeled (type, vendor, purpose, etc.).

While there might well be such studies out there, my experience has been that careful planning, diligent execution, and redundancy will pay far greater rewards than learning what the overall industry failure rates might be.

In your case (as mentioned above), I’d find out whether there is any way of bringing in that second T-1 by a different physical route that avoids the manhole.

I know you wanted a study, not experiences, but I think what you might look for are documents on “Best Practices”. Otherwise there are just too many variables out there.


Discuss This Question: 7  Replies

  • Bobkberg
    I just remembered the other factor that should be included. What is the cost of failure? That is - How much money or business is lost because "Facility A" is not available? In many cases, management does not examine the cost of failure because it hasn't happened often enough or for an extended period of time. But there is a definite cost of not being able to conduct business, and that needs to be compared with the costs of maintaining a redundant capability. Bob
  • Atomas
    The previous answers are excellent. You can probably check in your risk analysis whether it is worth proceeding with those measures: compare how much they cost with how much you lose when both T1s are down for a week. But I believe that if you have two T1s, it's because you need redundancy. Have a serious talk with your service provider. Also make sure you have a DRP, and consider having your network audited. You can read ISO/IEC 17799 (section 9.2.2), which states: "Telecommunications equipment should be connected...by at least two diverse routes to prevent failure..." Dan CISSP, CISA
  • HumbleNetAdmin
    Bigshybear, great answers here, except of course no one is providing you with any info on studies. Unfortunately I can't do that either. I can, however, back the previous posters' comments and talk about my infrastructure. As Atomas suggested, an analysis of what it would cost and how it affects business to be down would be a great start. It seems you already have some data for that, since you experienced a week's outage due to both of your T1s failing.

    Redundancy is a key factor in staying up. I am the network admin of a company whose doors are open 8-5 but which is in business 24x7x365. We cannot tolerate our internet connection being down more than 30 minutes, because of the web, terminal and FTP services we provide. Therefore I have everything redundant. We have a generator and UPS backup (so that everything does not die when we switch to generator, and to protect against surges and brownouts). On the internet connection, I have a T1 over copper (comes to us in ground) and a 3 Mb Ethernet-over-fiber circuit (comes to us aerial). Each has a different ISP, and they are out of different regions of the state: the ISP over fiber is out of St. Louis and the T1 is out of Kansas City. We are right smack dab in the middle of those, and we have EBGP (external BGP; the ISPs handle the BGPing) between the two ISPs so that if one fails, traffic will route to the other. The two connections come into one Cisco router (I have a second, identical cold spare in the rack with the hot router for redundancy) and then to the firewall, which has a hot spare that will automatically fail over. Then comes the connection to our core switch (which has a cold spare); I have cold spares to cover the failure of any of my switches. We have redundancy for our web servers and terminal servers as well, accomplished by a load balancer with a hot automatic-failover spare and multiple web servers and terminal servers.

    I am now getting ready to split the two internet connections onto separate routers and firewalls and will do IBGP (my two routers will do the BGPing internally between each other). This will give us better control over which connection our incoming traffic favors (currently it favors the T1 over the fiber), and more control over which circuit our outgoing traffic uses. I am also going to implement a second fiber circuit. Although it will come in over the same fiber transport as our current fiber, the ISP has two internet feeds, one out of St. Louis and another out of a third region (Springfield, Mo.). The primary reason for the second fiber connection is to carry all intercompany internet traffic, to keep it off the circuits that our customers use.

    As a previous poster commented, there are too many variables out there that can take down your internet connection. The year before last, when we were only on the T1 and just as we were implementing the fiber, the cable that our T1 comes in on was dug up and cut, knocking out our internet for four hours. If the fiber circuit had been live at that time, our traffic would have switched over automatically. After the cut was repaired, we started having intermittent circuit drops, especially during stormy weather, and they steadily got worse over time. It took a year to finally get the T1 pairs changed, and that only happened after complaint after complaint, when it got really bad and the circuit finally went hard down while I had the telco on the phone looking at it.

    So to me, it comes down to sitting down and asking the powers that be some questions. Just how important is that connection to the internet for your employer? Can they tolerate some downtime or, like us, no downtime? How will it affect business if we are down half an hour, 1 hour, 2 hours, 4 hours, half a day, a day, more than a day?

    Things you should consider when looking at redundant internet connections:

    1) Have each connection come from a separate physical source (my T1 is copper in ground, the fiber is aerial). If possible, I would try not to have both connections coming through that manhole.
    2) Check whether the two connections have a common link somewhere that could take both down at the same time, and keep them regionally disparate. Our ISP over fiber is out of St. Louis and the T1 is out of KC. The new fiber circuit will run over the same fiber, but the ISP has internet feeds out of St. Louis and Springfield, and the fiber is a humongous SONET ring, so the likelihood of both circuits going down at the same time is very small. A major disaster in our area could do it, and we are looking at co-locating equipment in a hardened facility for just such an extreme case. Why? Because our building burning down or being taken out by a tornado could possibly put the company out of business.
    3) Make sure the ISP and transport providers are well-established, known names. Not long after I took this position, the ISP we had our circuit on (a small company) went belly up and we had to acquire another provider.

    Well, I am by far not an expert, but I hope my 2 cents' worth may be of use nonetheless. Great day to you. The Humble Netadmi
  • BlueKnight
    You may be able to find an applicable study if you Google something like "line failure probability". I tackled it a few different ways, and the better result set (1,470,000+) came back using "failure+probability" and then searching within results for "line+study." AT&T did such a study on network equipment. From what I've seen, most studies are from outside the U.S., with some hits in the UK and Canada. The posts by bobkberg and others are excellent, but let me add a note on something HumbleNetAdmin alluded to. If you decide to adopt alternate lines using carriers other than the one you currently use, make sure the new carrier does not lease its lines from your current carrier. With all the competition in the communications market, many carriers lease lines from larger carriers so they can be more competitive. The last thing you want to do is spend money on an "alternate path" and have it go down along with the primary because it is actually part of the primary carrier's network. There are some good stories out there, and it can get ugly if you don't do a little "homework" on the prospective alternate carrier. Good luck, Jim
  • Paul144hart
    bobkberg has all the right stuff. I think the bottom line isn't a study, but a contractual agreement with the provider(s). Because LECs differ from one part of the country to another, you need to do two things: 1 - ask for penalties if they exceed an agreed-upon downtime; 2 - demand to audit how they provide reliability and availability. Then the study doesn't matter, other than as education for you when you audit them.
  • Padapa
    As a regulated service, a T1 has a 99.7% uptime guarantee. That equates to ~26 hours of downtime per year, but most LECs actually measure it on a rolling 30-day window. Many others have pointed out the need for route diversity or service redundancy. Just remember, all wired services will come through the same LEC, since they own the wire. If you have line of sight, maybe consider a wireless backup link.
  • Paul144hart
    To the T1 99.7% note - good point, a microwave connection would skip over the last-mile failure point. And even if there is a guarantee of uptime, when someone destroys telephony lines with construction work, the guarantee doesn't get things working any faster.
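The uptime and cost-of-failure points raised in the replies above reduce to simple arithmetic, sketched here in Python. The 99.7% figure is the one quoted in the thread; the function names, the hourly loss, and the redundancy cost are hypothetical placeholders that you would replace with your own numbers.

```python
HOURS_PER_YEAR = 365 * 24      # 8760
HOURS_PER_30_DAYS = 30 * 24    # 720


def allowed_downtime_hours(availability_pct, window_hours):
    """Downtime permitted by an availability guarantee over a window."""
    return (1 - availability_pct / 100) * window_hours


def expected_annual_loss(outage_hours_per_year, loss_per_hour):
    """Business loss from outages, assuming a flat hourly cost."""
    return outage_hours_per_year * loss_per_hour


def redundancy_pays_off(loss_without, loss_with, annual_redundancy_cost):
    """True if the avoided loss exceeds what the redundancy costs."""
    return (loss_without - loss_with) > annual_redundancy_cost


if __name__ == "__main__":
    # 99.7% is the guarantee quoted in the thread.
    print(round(allowed_downtime_hours(99.7, HOURS_PER_YEAR), 2))     # about 26.28 hours per year
    print(round(allowed_downtime_hours(99.7, HOURS_PER_30_DAYS), 2))  # about 2.16 hours per 30 days

    # Hypothetical figures: a one-week outage (168 h) at an assumed
    # loss of 1,000 per hour, versus an assumed 20,000 per year for
    # a diverse second circuit that would have avoided the outage.
    loss = expected_annual_loss(168, 1000)
    print(loss, redundancy_pays_off(loss, 0, 20000))
```

Note that a week-long outage is far beyond what a 99.7% guarantee allows, which is one reason the contractual penalties mentioned above matter less than avoiding the shared failure point in the first place.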
