Tuesday, October 3, 2017

100G - Challenges for Performance and Security Visibility Monitoring

by Brendan O’Flaherty, CEO, cPacket Networks

The 100Gbps Ethernet transport network is here, and the use cases for transport at 100Gbps are multiplying. The previous leap in network bandwidths was from 10Gbps to 40Gbps, and 40Gbps networks are prevalent today. However, while 40Gbps networks are meeting bandwidth and performance requirements in many enterprises, the “need for speed” to handle data growth in the enterprise simply cannot be tamed.

As companies continue to grow in scale, and as their data needs become more complex, 100Gbps (“100G”) offers the bandwidth and efficiency they desperately need. In addition, 100G better utilizes existing fiber installations and increases density, which significantly improves overall data center efficiency.

A pattern of growth in networks is emerging, and it reflects the hypergrowth in data on corporate networks over just the last five years. In fact, the now-famous Gilder's Law observes that backbone bandwidth on a single cable is now a thousand times greater than the average monthly traffic exchanged across the entire global communications infrastructure five years ago.

A look at the numbers tells the story well. IDC says that 10G Ethernet switches are set to lose share, while 100G switches are set to double. Crehan Research (see Figure 1) says that 100G port shipments will pass 40G shipments in 2017, and will pass 10G shipments in 2021.



Figure 1: 100G Port Shipments Reaching Critical Mass, as 40G and 10G Shipments Decline

100 Gigabit by the Numbers

The increase in available link speeds and utilization creates new challenges for both the architectures upon which traditional network monitoring solutions are based and for the resolution required to view network behavior accurately. Let’s look at some numbers:

100G Capture to Disk

Traditional network monitoring architectures depend on the ability to bring network traffic to a NIC and write that data to disk for post-analysis. Let’s look at the volume of data involved at 100G:



100 Gbps = 100 × 10⁹ bits/second = 10¹¹ bits/second     (1)

10¹¹ bits/second ÷ 8 bits/byte = 12.5 × 10⁹ bytes/second = 12.5 GB/second     (2)

1 TB ÷ 12.5 GB/second = 10¹² bytes ÷ (12.5 × 10⁹ bytes/second) = 80 seconds     (3)

By equation (3), at 100 Gbps on one link, in one direction, a one-terabyte disk will be filled in 80 seconds. Extending this calculation for one day, in order to store the amount of data generated on one 100 Gbps link in only one direction, 0.96 petabytes of storage is required:




12.5 × 10⁹ bytes/second × 86,400 seconds/day ≈ 1.08 × 10¹⁵ bytes ≈ 0.96 PB (taking 1 PB = 2⁵⁰ bytes)     (4)

Not only is this a lot of data (0.96 petabytes is about 1,000 terabytes, equivalent to 125 8TB desktop hard drives), but as of this writing (Aug 2017), a high-capacity network performance solution from a leading vendor can store approximately 300 terabytes, or only eight hours of network data from one highly utilized link.
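The capture-to-disk arithmetic above can be sanity-checked with a short Python sketch; the constants are the article's own figures (one link, one direction), and the variable names are illustrative:

```python
# Back-of-the-envelope check of the capture-to-disk math at 100 Gbps.
# All constants are the figures used in the text; one link, one direction.

LINK_BPS = 100e9                  # equation (1): 100 Gbps = 10^11 bits/second
bytes_per_sec = LINK_BPS / 8      # equation (2): 12.5 x 10^9 bytes/second

# Equation (3): time to fill a 1 TB (10^12-byte) disk
fill_seconds = 1e12 / bytes_per_sec
print(f"1 TB disk fills in {fill_seconds:.0f} seconds")     # 80 seconds

# Equation (4): data captured in one day
daily_bytes = bytes_per_sec * 86_400
print(f"One day of capture: {daily_bytes / 2**50:.2f} PB")  # ~0.96 PB (binary PB)
```

At 300 TB of storage, the same arithmetic (300e12 / 12.5e9 bytes per second) gives the roughly eight hours of retention cited for a leading vendor's solution.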

100G in the Network – What is a Burst, and What is Its Impact?

A microburst can be defined as a period during which traffic is transmitted over the network at line rate (the maximum capacity of the link). Microbursts in the datacenter are quite common – often by design of the applications running in the network. Three common reasons are:

  • Traffic from two (or more) sources to one destination. This scenario is sometimes dismissed as unlikely because the source links show low average utilization, but that impression is an artifact of insufficient measurement resolution, as we'll see when we look at the amount of data in a one-millisecond burst.
  • Throughput maximization. Common operating system optimizations that reduce the overhead of disk operations, and NIC offloads that coalesce interrupts, cause trains of back-to-back packets on the wire.
  • YouTube/Netflix ON/OFF buffer loading. These and other video streaming applications load buffers in chunks of 64KB to 2MB; this ON/OFF transmission of traffic inherently gives rise to bursty behavior in the network.

The equations below translate 100 gigabits per second (10¹¹ bits/second) into bytes per millisecond:



10¹¹ bits/second ÷ 8 bits/byte = 12.5 × 10⁹ bytes/second     (5)

12.5 × 10⁹ bytes/second × 10⁻³ seconds/millisecond = 12.5 × 10⁶ bytes/millisecond (12.5 MB per millisecond)     (6)

The amount of data in a one-millisecond spike exceeds the total (shared) memory resources available in a standard switch, which means a single one-millisecond spike can cause packet drops in the network. For protocols such as TCP, the data will be retransmitted; however, the exponential backoff mechanisms will degrade performance. For UDP, lost packets translate to choppy voice or video, or gaps in market data feeds for algorithmic trading platforms. In both cases, because the spikes and bursts go undetected without millisecond monitoring resolution, the packet drops cannot be predicted in advance, and the result is intermittent behavior that is difficult to troubleshoot.
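The comparison can be sketched in a few lines of Python. Note that the 12 MB shared-buffer figure below is an assumed, illustrative value for a typical data center switch, not a number from the article or any specific product:

```python
# Equations (5) and (6) in code: the data arriving during a one-millisecond
# line-rate burst, compared with a switch's shared packet buffer. The 12 MB
# buffer size is an assumed, illustrative value, not a vendor figure.

LINE_RATE_BPS = 100e9                      # 10^11 bits/second
bytes_per_ms = LINE_RATE_BPS / 8 / 1_000   # equation (6): 12.5 MB per millisecond

ASSUMED_BUFFER_BYTES = 12e6                # hypothetical shared buffer (~12 MB)

print(f"1 ms burst at line rate: {bytes_per_ms / 1e6:.1f} MB")
if bytes_per_ms > ASSUMED_BUFFER_BYTES:
    print("A single 1 ms burst overflows the buffer, so packets are dropped")
```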

Network Monitoring Architecture


Typical Monitoring Stack
The typical network monitoring stack is shown in Figure 2. At the bottom is the infrastructure: the switches, routers and firewalls that make up the network. Next, in the blue layer, are TAPs and SPAN ports; TAPs are widely deployed due to their low cost, and most infrastructure devices provide some number of SPAN ports. The traffic from these TAPs and SPANs is then fed to an aggregation device (also called a matrix switch or "packet broker"), where a large number of feeds, typically 96 10G feeds (48 links tapped in both directions), is funneled into a small number of tool ports, usually four 10G ports in a standard high-performance configuration. At the top are the network tools, which take the traffic fed to them from the aggregation layer and provide the graphs, analytics and drilldowns that form dashboards/visualization.


Figure 2: Typical Network Monitoring Stack

Scalability of Network Monitoring Stack

Let’s now evaluate how this typical monitoring stack scales in high-speed environments.

·         Infrastructure: As evidenced by the transition to 100G, the infrastructure layer appears to be scaling well.

·         TAP/SPAN: TAPs are readily available and match the speeds found in the infrastructure layer. SPANs can be oversubscribed or alter timing, leading to loss of visibility and inaccurate assumptions about production traffic behavior.

·         Aggregation: The aggregation layer is where the scaling issues become problematic. As in the previous example, if 48 links are monitored by four 10G tool ports, the ratio of “traffic in” to monitoring capability is 96:4 (96 is the result of 48 links in two directions) or, reducing, an oversubscription ratio of 24:1. Packet drops due to oversubscription mean that network traffic is not reaching the tools – there are many links or large volumes of traffic that are not being monitored.

·         Tools: The tools layer depends on data acquisition and data storage, which translates to the dual technical hurdles of capturing all the data at the NIC and writing that data to disk for analysis. Continuing the example, with 96x10G feeds into 4x10G tool ports, the percentage of traffic measured (assuming fully utilized links) is 4x10G/96x10G, or 4.2%. As the network moves to 100G (but the performance of monitoring tools does not), the percentage of traffic monitored drops further, to 4x10G/96x100G, or 0.42%.

It is difficult to provide actionable insights into network behavior when only 0.42% of network traffic is monitored, especially during levels of high activity or security attacks.

Figure 3: Scalability of Network Monitoring Stack
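The aggregation-layer arithmetic above can be captured in a small helper; the function name and structure are illustrative:

```python
# The aggregation-layer arithmetic from the text: 48 tapped links produce
# 96 one-way feeds, funneled into four 10G tool ports.

def monitored_fraction(n_feeds, feed_gbps, n_tool_ports, tool_gbps):
    """Fraction of offered traffic the tools see, assuming fully utilized links."""
    return (n_tool_ports * tool_gbps) / (n_feeds * feed_gbps)

# 96 x 10G feeds into 4 x 10G tool ports: 24:1 oversubscription, 4.2% monitored
print(f"{monitored_fraction(96, 10, 4, 10):.1%}")

# Same tool capacity once the feeds move to 100G: only 0.42% monitored
print(f"{monitored_fraction(96, 100, 4, 10):.2%}")
```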

Current Challenges with Traditional Monitoring Environments

Monitoring Requirements in the Datacenter

Modern datacenter monitoring has a number of requirements if it is to be comprehensive:  
  • Monitoring Must Be Always-On. Always-on network performance monitoring means seeing all of the traffic and being able to drill down to packets of interest without the delay of activating and connecting a tool only after an issue has been reported. That delay leads to reactive customer support rather than the proactive awareness needed to address issues before customers are affected. Always-on KPIs at high resolution provide a constant stream of information for efficient network operations.
  • Monitoring Must Inspect All Packets. To be comprehensive, network performance monitoring (NPM) must inspect every packet and every bit at all speeds, without being overwhelmed by high traffic rates or minimum-sized packets. NPM solutions that drop packets (or monitor only 0.42% of them) as data rates increase cannot, by definition, provide the accuracy needed to understand network behavior when it matters most: when the network is about to fail under high load or a security attack.
  • High Resolution is Critical. Resolution down to 1ms was not mandatory in the days when 10Gbps networks prevailed. But there’s no alternative today: 1ms resolution is required for detecting problems such as transients, bursts and spikes at 100Gbps.
  • Convergence of Security and Performance Monitoring (NOC/SOC Integration). Security teams and network performance teams are often looking at the same data, interpreting it according to their area of focus. Spikes and microbursts might represent a capacity issue to a performance engineer but early signs of probing by an attacker to a security engineer. Growing response times may reflect server load to a performance engineer or indicate a reflection attack to the infosec team. Given the same data, tools that allow correlation of these events are essential to efficient security and performance engineering.

A Look Ahead

100G is just the latest leap in Ethernet-based transport in the enterprise. With 100G port shipments growing at the expense of 40G and 10G, the technology is on a trajectory to become the dominant data center speed by 2021. According to Light Reading, “We are seeing huge demand for 100G in the data center and elsewhere and expect the 100G optical module market to become very competitive through 2018, as the cost of modules is reduced and production volumes grow to meet the demand. The first solutions for 200G and 400G are already available. The industry is now working on cost-reduced 100G, higher-density 400G, and possible solutions for 800G and 1.6 Tbit/s.”


Broadcom's acquisition of Brocade faces delay

Broadcom's pending acquisition of Brocade is facing a regulatory delay. The companies withdrew and re-filed their joint voluntary notice to the Committee on Foreign Investment in the United States (CFIUS), triggering a new 45-day investigation period. Brocade and Broadcom now anticipate the acquisition to be completed by November 30, 2017, subject to clearance from CFIUS.

Brocade now plans to sell its data center switching, routing and analytics business directly to Extreme Networks, instead of waiting for the Broadcom deal to close first and then making the sale, as had been previously agreed. Brocade and Extreme Networks expect to close this transaction prior to the closing of Broadcom's acquisition of Brocade.

"We are actively engaged with CFIUS and remain committed to Broadcom's proposed acquisition of Brocade," said Lloyd Carney, CEO of Brocade. "We continue to work diligently and cooperatively with Broadcom to close the transaction as soon as possible in a challenging and dynamic policy and regulatory environment. In the meantime, we are pleased to announce an agreement to divest our data center networking business to Extreme Networks, which we believe is in the best interest of our shareholders, customers, partners and the employees aligned with the business."

Broadcom to Acquire Brocade for Fibre Channel Business

Broadcom agreed to acquire Brocade Communications Systems for $12.75 per share in an all-cash transaction valued at approximately $5.5 billion, plus $0.4 billion of net debt.

Broadcom plans to keep Brocade's Fibre Channel storage area network (FC SAN) switching business and divest Brocade’s IP Networking business, consisting of wireless and campus networking, data center switching and routing, and software networking solutions.

Broadcom expects to fund the transaction with new debt financing and cash available on its balance sheet.

The companies said the deal is not subject to any financing conditions, nor is it conditioned on the divestiture of Brocade’s IP Networking business.

Broadcom said key reasons for the acquisition include the profit margin of Brocade's FC SAN business, which currently comprises the vast majority of Brocade's non-GAAP operating profit.

Extreme to Acquire Brocade's Switching Business for $55 Million

Extreme Networks agreed to acquire Brocade Communications Systems' data center switching, routing, and analytics business from Broadcom following Broadcom's acquisition of Brocade. The deal is valued at $55 million in cash, consisting of $35 million at closing and $20 million in deferred payments, as well as additional potential performance based payments to Broadcom, to be paid over a five-year term. The sale is contingent on Broadcom closing its acquisition of Brocade, previously announced on November 2, 2016 and approved by Brocade shareholders on January 26, 2017. Broadcom presently expects to close the Brocade acquisition in its third fiscal quarter ending July 30, 2017.

Extreme expects the acquisition to be accretive to cash flow and earnings for its fiscal year 2018 and expects to generate over $230 million in annualized revenue from the acquired assets. The acquisition is expected to close within 60 days following the closing of Broadcom's acquisition of Brocade.


US DoJ approves Centurylink + Level 3 merger with conditions

The U.S. Department of Justice cleared CenturyLink's pending acquisition of Level 3 Communications with certain conditions, including the divestiture of certain Level 3 metro network assets and certain dark fiber assets.

Specifically, the combined company is required to divest Level 3 metro network assets in Albuquerque, N.M.; Boise, Idaho; and Tucson, Arizona. In addition, the combined company is required to divest 24 strands of dark fiber connecting 30 specified city-pairs across the country in the form of an Indefeasible Right of Use (IRU). CenturyLink said that because these fibers are not currently in commercial use, this divestiture will not affect any current customers or services.

The acquisition still requires regulatory approval from the Federal Communications Commission and the California Public Utilities Commission.

CenturyLink to Acquire Level 3 for $34 Billion

CenturyLink agreed to acquire Level 3 Communications in a cash and stock transaction valued at approximately $34 billion, including the assumption of debt.

The deal combines CenturyLink's larger enterprise customer base with Level 3's global network footprint. The companies said this scale will enable further investment in the reach and speeds of its broadband infrastructure for small businesses and consumers. After close, CenturyLink's Glen Post will continue to serve as Chief Executive Officer and President of the combined company.  Sunit Patel, Executive Vice President and Chief Financial Officer of Level 3, will serve as Chief Financial Officer of the combined company. The combined company will be headquartered in Monroe, Louisiana and will maintain a significant presence in Colorado and the Denver metropolitan area.

Under terms of the agreement, Level 3 shareholders will receive $26.50 per share in cash and a fixed exchange ratio of 1.4286 shares of CenturyLink stock for each Level 3 share they own, which implies a purchase price of $66.50 per Level 3 share (based on a CenturyLink $28.00 per share reference price) and a premium of approximately 42 percent based on Level 3's unaffected closing share price of $46.92 on October 26, 2016. Upon closing, CenturyLink shareholders will own approximately 51 percent and Level 3 shareholders approximately 49 percent of the combined company.

EdgeX Foundry announces its first release for Edge IoT

EdgeX Foundry, the open source project hosted by The Linux Foundation and focused on Internet of Things (IoT) edge computing, announced its first major code release.

The "Barcelona" software release, which will be available later this month, reflects the collaborative effort by more than 60 member organizations to build out and support an ecosystem for Industrial IoT (IIoT) solutions. The software release includes important work on “north side” Export Service interfaces that provide connectors to Azure IoT Suite and Google IoT Core as well as support for connections via MQTTS and HTTPS.

The EdgeX Foundry project was launched in April 2017.

“We believe that EdgeX will radically change how businesses develop and deploy IIoT solutions, and we are excited to see the community rally together to support it,” said Philip DesAutels, senior director of IoT at The Linux Foundation. “Barcelona is a significant milestone that showcases the commercial viability of EdgeX and the impact that it will have on the global Industrial IoT landscape.”

Zain Saudi Arabia tests NB-IoT

Zain Saudi Arabia is testing NB-IoT (Narrowband Internet of Things) technology at a live site in Mina area of Makkah Province. Nokia is the technology partner.

The trial focuses on smart metering. NB-IoT is used to communicate temperature, humidity and air pressure from a remote location via a Nokia Flexi Multiradio 10 LTE base station at 900 MHz.

NB-IoT is a 3GPP Release 13 radio access technology designed to enable connectivity to IoT devices.

IEEE publishes 5G and Beyond Technology Roadmap White Paper

IEEE published a white paper summarizing the challenges and opportunities in building and sustaining a 5G and beyond ecosystem.

The whitepaper, which is titled "5G and Beyond Technology Roadmap", describes key technology trends that will impact design drivers and challenges for technologies to provide simultaneous wireless communication, massive connectivity, tactile internet, quality of service and network slicing. Some topics addressed include: applications and services, hardware, MIMO, mm-wave, edge automation platform, security, standardization building blocks and testbed. The white paper is available for download at no cost on the IEEE 5G web portal’s Roadmap page. The IEEE 5G and Beyond Technology Roadmap will be periodically updated with forecasts for three-, five- and 10-year horizons.

“5G consolidates the trend in convergence of technologies and underlying standards with emerging solutions offering exciting possibilities not only to consumers but also to industries,” said Mischa Dohler, co-chair, IEEE 5G and Beyond Technology Roadmap Working Group. “Disruption will happen at many levels, most importantly through changes in the value chain, adoption of flexible systems management and orchestration, as well as emergence of innovative mobile connectivity technologies.”

The white paper can be downloaded here: https://5g.ieee.org/roadmap

Yahoo now believes all 3 billion user accounts hit by breach

Yahoo, which is now part of Verizon's Oath division, provided notice that all of its 3 billion user accounts were impacted by the 2013 data breach.  Previously, Yahoo had disclosed that more than one billion of the approximately three billion accounts existing in 2013 had likely been affected. The company believes that the user account information that was stolen did not include passwords in clear text, payment card data, or bank account information.

In Memoriam: Paul S. Otellini, 1950 – 2017

Paul Otellini, who served as the fifth CEO of Intel from 2005 to 2012, passed away in his sleep Monday, Oct. 2, 2017, at the age of 66.

Otellini, who joined Intel in 1974 and rose through the ranks, is remembered for many accomplishments at the company. He successfully guided Intel through many technology and market transitions, including the financial turmoil of 2008. Intel noted that in the last full year before Otellini was named CEO, its revenue was $34 billion; by 2012, the number had grown to $53 billion. During his tenure, Intel won the Apple PC business and expanded its presence in security, software and mobile communications. As Intel CEO, Otellini was preceded by Craig Barrett and succeeded by Brian Krzanich.

During his retirement, Otellini was active in several philanthropic and charitable organizations, including the San Francisco Symphony and San Francisco General Hospital Foundation. He is survived by his wife, Sandy; his son, Patrick; and his daughter, Alexis.