By Andre Ramos | Jan 23, 2024

The Unnoticed Threat: Network Alarms and the Overlooked Role of Timing

In a January 2024 conversation about network timing and the FCC’s Network Outage Reporting System (NORS), Rob Jodrie, Technical Support Manager at Syncworks, discussed the often underestimated subject of network alarms and the peculiar reality that they don’t receive the attention they deserve. Despite the high costs associated with NORS-reportable outages, not to mention the impact on the consumer, network alarms continue to be overlooked. A back study that Mr. Jodrie participated in showed that power and timing were the two most common causes found in the required outage reports. Alarms signify potential network failures, yet most network operators are not attuned to them, and there is no good answer as to why. First, a bit of background on the FCC Network Outage Reporting System.

FCC Network Outage Reporting System (NORS)

What is NORS?

In 2004, the FCC implemented outage reporting rules to ensure swift, comprehensive, and accurate information about significant communication service disruptions with potential impacts on homeland security, public health, safety, and the nation’s economic well-being. Communication providers, including wireline, cable, satellite, wireless, interconnected VoIP, and Signaling System 7 providers, must adhere to these rules. Reportable network outages lasting at least 30 minutes trigger mandatory reporting in the Commission’s Network Outage Reporting System (NORS). Data submitted to NORS is considered confidential. Depending on the provider type, notifications must be made within specific timeframes: preliminary information within 120 minutes, an initial outage report within three calendar days, and a final report within 30 days of outage discovery. Interconnected VoIP providers follow a similar process, with variations based on the outage’s impact. Covered 911 service providers, responsible for aggregating and delivering 911 traffic, have specific notification timelines and information-sharing requirements when an outage affecting a 911 call center occurs.
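The timeframes above can be turned into a small deadline calculation. The following is only a sketch under stated assumptions: the function names are hypothetical, all three deadlines are measured from the moment of outage discovery, and "three calendar days" is simplified to 72 hours. The actual FCC rules vary by provider type and include thresholds beyond duration, so this is not a compliance tool.

```python
from datetime import datetime, timedelta

MIN_REPORTABLE_MINUTES = 30  # duration threshold quoted in the text above


def meets_duration_threshold(duration_minutes: float) -> bool:
    """Check only the 30-minute duration criterion (other criteria are not modeled)."""
    return duration_minutes >= MIN_REPORTABLE_MINUTES


def nors_deadlines(discovered_at: datetime) -> dict:
    """Return illustrative deadlines, all measured from outage discovery.

    Assumption: "three calendar days" is approximated as exactly 72 hours.
    """
    return {
        "preliminary_notification": discovered_at + timedelta(minutes=120),
        "initial_report": discovered_at + timedelta(days=3),
        "final_report": discovered_at + timedelta(days=30),
    }


if __name__ == "__main__":
    discovered = datetime(2024, 1, 23, 8, 0)
    for name, due in nors_deadlines(discovered).items():
        print(f"{name}: {due:%Y-%m-%d %H:%M}")
```

For an outage discovered at 08:00 on January 23, 2024, the sketch places the preliminary notification at 10:00 the same day, the initial report on January 26, and the final report on February 22.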

FCC Network Outage Reporting System and The Odd Reality of Timing

Mr. Jodrie, reflecting on his time as a Tier 2 Technical Support technician for a major carrier, highlighted the historical neglect of timing in network operations. He shared insights into FCC reportable events, in which outages of a certain size, capacity, or duration had to be reported to the FCC, often resulting in fines and a required root cause analysis. Surprisingly, a back study revealed that power and timing were the most common culprits behind these expensive events.


Rob Jodrie: We used to have to do write-ups on what are called FCC reportable events. Those were if you had an outage of either a certain size (capacity, bandwidth that was down) or of a certain duration, or both; in some cases you’d have to report that to the FCC, and it was not a good thing. Fines would come about and they would ask for root cause analysis, meaning you’ve got to tell me what was the thing that caused this issue. And then you were supposed to come up with a plan to say here’s how we addressed it. But the point is, we did a back study on that at one point in time and we found that the two most common issues causing FCC reportables were power and timing. So it’s really interesting to me: these FCC reportables were not small events, they were very expensive, and it has always surprised me how little attention it gets.

Telecom Solutions, BITS Clocks and Network Outage Reporting System

The conversation turned to Telecom Solutions BITS clocks, workhorses that have been running for decades. Mr. Jodrie reminisced about the discrete alarms of the early days, simple contact closures that provided minimal information. Even today, thousands of these systems are active in the U.S., relying on basic alarming methods that require a physical inspection to determine the issue.

Rob Jodrie: The interesting thing is, in the early days, and this is still out here in today’s market, you had these Telecom Solutions BITS clocks, which are workhorses. They’ve been running for literally decades. We put our first one in Portland, ME in 1987, I think. But they have what they call discrete alarms, which means it’s a contact closure, and all it does is send a relay event, if you will, to what they call a scan point.

And it just detects: is there a short across the line or is there an open? If there’s a short across the line, the words come up and it says “sync major”, “sync minor”, “sync critical.” That’s it. That’s all you know. You don’t know if you’re in holdover or if the whole thing is down; you have zero idea. It’s just a dumb alarm, and you would see that alarm and you would have to get a pair of hands out there to stand in front of the box, look at the system, and determine based on lights what we’ve got and what’s going on. So it gets kind of interesting, and that is still the case. The next generation of gear, thankfully, is intelligent. We can log on to it and we can retrieve information from it. We can find out: is it in holdover, do I have references, is there power trouble, what’s the story, all without rolling a truck.
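The discrete alarm logic Mr. Jodrie describes can be sketched in a few lines. This is a minimal illustration, not a model of any real scan point product: the `ScanPoint` class and `active_alarms` function are hypothetical names, and the only behavior assumed is what the quote states, namely that a short across the line raises the label wired to that point and an open raises nothing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanPoint:
    """One discrete alarm input (hypothetical model)."""
    label: str      # the only text the monitoring center sees, e.g. "sync major"
    shorted: bool   # True = short across the line, False = open


def active_alarms(points):
    """Return the labels of all points currently reading as a short."""
    return [p.label for p in points if p.shorted]


if __name__ == "__main__":
    points = [
        ScanPoint("sync critical", shorted=False),
        ScanPoint("sync major", shorted=True),
        ScanPoint("sync minor", shorted=False),
    ]
    # The label is all the information available: no cause, no holdover
    # state, no reference status.
    print(active_alarms(points))
```

Because each input carries a single bit, an open circuit looks identical to "no alarm", and the label alone cannot say whether the clock is in holdover or completely down.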

Evolution of Alarm Systems

Despite the persistence of legacy systems, the next generation of gear brings intelligence to the forefront. With the ability to log in remotely, retrieve information, and diagnose issues without dispatching a truck, modern systems have come a long way. However, the conversation acknowledged the vast number of existing systems that still rely on primitive alarming methods.

Rob Jodrie: So it’s come a long way, but there are still thousands and thousands of those Telecom Solutions systems out and active in the network in the United States. It’s hard to believe, but that’s what it is. And sometimes we would actually find a problem with the alarming itself, that piece of wire that would go from the BITS clock to the scan point that would report back to the NOC. Sometimes that wire would get ripped out and there was no alarming. So we’d walk into an office, look at the BITS clock, see it’s in alarm, call the NOC and say, “Why did you not respond to the alarm?” “We don’t see any alarm.” So alarming, and the lack of love for timing, has been an interesting thing to note over these years. Over these years people are becoming more aware of it at times, and they usually do when they get bit big time with an outage. As you know, at Syncworks we do a lot of that. You know, holy cow, things are down, and then we come in and stop the bleeding, and then there’s a lot of focus on it and a lot of attention given to it.

How Alarms Manifest Themselves Today

Rob Jodrie: So today, one of the most common things that is used is something called SNMP. It’s simple network monitoring protocol, I think, is what it stands for [formally, Simple Network Management Protocol], and it’s an intelligent thing. What you do is you have an SNMP server located somewhere centrally, and you have to point this intelligent equipment, via an IP address scheme, back to it. So when you have a device that’s intelligent and some problem comes in where I’ve lost my GPS, it then sends an intelligent piece of information back to this SNMP server, and then that translates it and comes up and says this piece of equipment at this location has this issue, and now somebody needs to respond to that. So you just have more to go on rather than “sync major.” Now, issues like “I’m in holdover, I’ve lost GPS” are noted in the alarm logs, so you can go back and say, when did this happen? Has this been a week, two weeks? What events led up to it? There’s just more information you can collect, but it’s all done usually via SNMP.
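The flow described here (a device reports a condition to a central server, which translates it into "this equipment at this location has this issue" and keeps a log that can be searched later) can be sketched as follows. This is a simplified illustration that does not speak the SNMP wire protocol or use any SNMP library; the `AlarmEvent` class, the inventory table, the condition codes, and the IP address are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AlarmEvent:
    """A received notification, reduced to the fields used here (hypothetical)."""
    received_at: datetime
    source_ip: str
    condition: str  # hypothetical condition code sent by the device


# Hypothetical lookup tables the central server would maintain.
INVENTORY = {"192.0.2.10": "Example Central Office BITS clock"}
CONDITION_TEXT = {"gps_lost": "lost GPS reference", "holdover": "in holdover"}


def describe(event: AlarmEvent) -> str:
    """Translate a raw event into 'this equipment has this issue'."""
    device = INVENTORY.get(event.source_ip, "unknown device")
    text = CONDITION_TEXT.get(event.condition, event.condition)
    return f"{device} ({event.source_ip}): {text}"


def first_seen(log, source_ip: str, condition: str):
    """Answer 'when did this happen?' from the alarm log; None if never seen."""
    times = [e.received_at for e in log
             if e.source_ip == source_ip and e.condition == condition]
    return min(times) if times else None


if __name__ == "__main__":
    log = [
        AlarmEvent(datetime(2024, 1, 9, 3, 15), "192.0.2.10", "gps_lost"),
        AlarmEvent(datetime(2024, 1, 9, 3, 16), "192.0.2.10", "holdover"),
    ]
    for event in log:
        print(event.received_at, describe(event))
    since = first_seen(log, "192.0.2.10", "holdover")
    print("holdover for", datetime(2024, 1, 23, 3, 16) - since)
```

Compared with a bare "sync major" contact closure, the operator here gets the device, the specific condition, and a timestamp to work from without sending anyone to the site.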

The Lack of Love for Timing

Rob emphasized the historical lack of attention to timing issues and how it only gains awareness when a significant outage occurs. Syncworks often steps in during critical moments, addressing network failures and drawing attention to the importance of timing in those scenarios.

Conclusion

The discussion concluded with the recognition that timing, despite its critical role, has often been overlooked. As networks evolve and awareness grows, the hope is that the industry will prioritize timing to prevent costly outages and ensure the stability of critical systems.
