by Steve Douglas, Head of Market Strategy, Spirent Communications
Imagine a big storm just passed through your neighborhood, and you’re worried it might have damaged your roof. Do you:
a. Go investigate how the roof is holding up
b. Wait until the next storm to see if water starts dripping into your bedroom
Most of us would choose option A. It just makes sense to try to determine if something is broken ahead of time, before it fails and creates a bigger, more expensive problem. And yet, that’s not the approach most service providers use today when monitoring their networks.
Modern active monitoring technologies let operators poke and prod their networks in a variety of ways to spot potential problems before they affect customers. But only a fraction of service providers actually use them across the end-to-end network. Instead, most rely on the same techniques they’ve used for years: passively collecting telemetry data, analyzing it over time, and detecting many problems only when someone calls to complain.
This has never been an ideal situation, but in the near future, it will be an impossible one. As service providers progress with 5G rollouts, passive monitoring strategies fall apart. That means active testing and assurance is no longer optional. It’s becoming a mission-critical requirement.
Problems with Traditional Testing
Historically, most operators have relied on passive monitoring to assess network health, isolate faults, and ensure they live up to their service-level agreements (SLAs). That is, they deploy passive probes throughout their environment to capture network traffic data, dump that data into huge data lakes, and run analytics on it to identify anomalies. Active monitoring takes a more proactive approach. Instead of waiting for statistical analysis to reveal issues over weeks or months, it continually injects synthetic traffic into the network to measure performance in real time.
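To make the contrast concrete, here is a minimal sketch of the active approach, not any vendor's implementation. It assumes a hypothetical `simulated_transaction` function standing in for real synthetic traffic (a test call, a page fetch) and an assumed latency threshold of 50 ms; both are illustrative choices, not values from a real deployment.

```python
import statistics
import time
from typing import Callable, List


def run_active_probe(transaction: Callable[[], None], samples: int = 5) -> List[float]:
    """Run a synthetic transaction several times; return each latency in milliseconds."""
    latencies_ms = []
    for _ in range(samples):
        start = time.perf_counter()
        transaction()  # inject the synthetic traffic
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


def breaches_threshold(latencies_ms: List[float], threshold_ms: float) -> bool:
    """Flag a problem when the median latency exceeds the assumed threshold."""
    return statistics.median(latencies_ms) > threshold_ms


def simulated_transaction() -> None:
    # Hypothetical stand-in for real synthetic traffic: sleeps about 10 ms
    # to mimic network delay.
    time.sleep(0.01)


if __name__ == "__main__":
    results = run_active_probe(simulated_transaction)
    print("median latency (ms):", round(statistics.median(results), 1))
    print("threshold breached:", breaches_threshold(results, threshold_ms=50.0))
```

The point of the sketch is the timing: the measurement is available as soon as the probe finishes, rather than after enough organic traffic has accumulated for analytics to flag an anomaly.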
Active monitoring is not a new concept. Many operators use it today in transport networks, where they’ve been seeking to introduce self-healing and automation capabilities. In the heart of most networks, though, the passive approach still dominates. That’s now starting to change in response to five big trends:
- Cloudification: To enable more agility and automation, operators are implementing more of the network as software, hosted in cloud environments. As a result, network elements are no longer static, rigid functions. They’re dynamic pieces of software that can be continually spun up and moved across cloud environments.
- Openness: The 5G specification mandates open interfaces. This allows operators to work with new vendors and open-source technologies in ways that weren’t possible before. But it also means that, instead of getting software updates a couple of times a year from one or two well-known suppliers, you can now expect constant updates from dozens of vendors.
- Automation: Legacy manual approaches can’t keep up with the volume and velocity of change in cloudified networks. As complexity and costs grow, operators need to automate more of their operations and enable self-healing, self-optimizing networks.
- Artificial intelligence (AI): To enable true “self-driving networks,” you can’t stop and wait for human beings to make decisions. So, AI is playing a larger role in network operations.
- Shift to work from home: Businesses were already seeing their workforces get more distributed, but COVID-19 kicked this trend into overdrive. Suddenly, operators need to deliver business-quality network experiences anywhere and everywhere.
As these trends converge, network traffic patterns become incredibly dynamic, elastic, and hard to predict. Just understanding what’s happening out there, much less isolating the source of issues, gets enormously difficult—especially if you’re relying on passive probes in static locations.
Getting Active
To navigate these issues and position themselves to succeed in the 5G marketplace, service providers are now extending active testing and assurance across more of their networks. Active monitoring involves three basic components:
- Active test agents—lightweight software probes that can run on any cloud compute platform and be spun up anywhere in the network, even on end-user devices
- Large testing libraries to cover a variety of simulations—voice calls, video sessions, web browsing, low-latency services, and more
- Intelligent automation, so the environment can not only run tests in the background continuously but can make smart decisions about which tests to run and where, without human input
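The three components above can be sketched together in a few lines. This is an illustrative toy, not a real product's design: `ProbeAgent`, `TEST_LIBRARY`, `plan_tests`, and `run_plan` are all hypothetical names, the test functions are stubs, and the scheduling policy (test the locations with the most recent failures first) is just one assumed example of an automated decision.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ProbeAgent:
    """Hypothetical lightweight test agent deployed somewhere in the network."""
    location: str
    recent_failures: int = 0


# Hypothetical test library: each stub simulates one service type and
# returns True when the synthetic session succeeds.
TEST_LIBRARY: Dict[str, Callable[[ProbeAgent], bool]] = {
    "voice_call": lambda agent: True,
    "video_session": lambda agent: agent.location != "edge-2",  # simulated fault
    "web_browsing": lambda agent: True,
}


def plan_tests(agents: List[ProbeAgent]) -> List[Tuple[str, str]]:
    """Assumed policy: agents with more recent failures are tested first."""
    ordered = sorted(agents, key=lambda a: a.recent_failures, reverse=True)
    return [(a.location, name) for a in ordered for name in TEST_LIBRARY]


def run_plan(agents: List[ProbeAgent]) -> List[Tuple[str, str]]:
    """Run every planned test and return the (location, test) pairs that failed."""
    by_location = {a.location: a for a in agents}
    failures = []
    for location, test_name in plan_tests(agents):
        agent = by_location[location]
        if not TEST_LIBRARY[test_name](agent):
            agent.recent_failures += 1  # feeds the next planning round
            failures.append((location, test_name))
    return failures


if __name__ == "__main__":
    agents = [ProbeAgent("core-1"), ProbeAgent("edge-2", recent_failures=2)]
    print(run_plan(agents))
```

In a real deployment, the planning step is where the intelligence would live; here it is reduced to a single sort so the feedback loop between test results and test selection is easy to see.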
By adopting active testing and assurance, you can:
- Monitor more proactively: With active testing always working in the background, you can continually probe your environment and spot most problems before they affect customers or SLAs.
- Accelerate change management: Active testing can become a default step when provisioning new services or network functions (NFs), immediately validating their performance as soon as they’re deployed. But it’s also valuable for contending with nonstop multivendor software updates. Now, you can rapidly test and validate updates in the live network, instead of having to wait weeks or months for lab testing.
- Assure SLAs: A growing number of services use hybrid environments, where parts of the service depend on cloud providers or other third parties. How do you guarantee that enterprise customers get the performance they’re paying for when you don’t fully own the service delivery infrastructure? The only way is to continually test the end-to-end service.
- Reduce mean time to repair (MTTR): If you’re relying on passive monitoring, you have to capture enough statistical data to feel confident that an anomaly signifies a real problem. Getting to that point takes time—especially if you’re waiting for organic traffic to recreate the conditions that caused the issue. Too often, while you’re waiting, customers are already calling to complain. With active monitoring, you can recreate any network conditions synthetically. And when you identify issues, you can isolate their source more quickly.
In early active testing deployments, we’ve seen operators reduce MTTR by close to 75% through rapid fault isolation. Just as important, they’re seeing trouble tickets fall by nearly 90% through proactive monitoring—meaning they’re fixing most issues before they ever impact customers.
Preparing for the Future
Active testing can be enormously useful in today’s telecommunications networks. But if you want to achieve your 5G business objectives in the coming years, it’s absolutely essential. Whether you’re embracing DevOps software methodologies to accelerate innovation, offering low-latency enterprise services under SLAs, or driving down costs and complexity with self-driving networks, you can’t do any of it with passive monitoring alone. It’s time for active assurance.