Sunday, August 25, 2019

Blueprint: Kubernetes is the End Game for NFVI

by Martin Taylor, Chief Technical Officer, Metaswitch

In October 2012, when a group of 13 network operators launched their white paper describing Network Functions Virtualization, the world of cloud computing technology looked very different than it does today.  As cloud computing has evolved, and as telcos have developed a deeper understanding of it, so the vision for NFV has evolved and changed out of all recognition.
The early vision of NFV focused on moving away from proprietary hardware to software running on commercial off-the-shelf servers.  This was described in terms of “software appliances”.  And in describing the compute environment in which those software appliances would run, the NFV pioneers took their inspiration from enterprise IT practices of that era, which focused on consolidating servers with the aid of hypervisors that essentially virtualized the physical host environment.

Meanwhile, hyperscale Web players such as Netflix and Facebook were developing cloud-based system architectures that support massive scalability with a high degree of resilience, which can be evolved very rapidly through incremental software enhancements, and which can be operated very cost-effectively with the aid of a high degree of operations automation.  The set of practices developed by these players has come to be known as “cloud-native”, which can be summarized as dynamically orchestratable micro-services architectures, often based on stateless processing elements working with separate state storage micro-services, all deployed in Linux containers.

It’s been clear to most network operators for at least a couple of years that cloud-native is the right way to do NFV, for the following reasons:

  • Microservices-based architectures promote rapid evolution of software capabilities to enable enhancement of services and operations, unlike legacy monolithic software architectures with their 9-18 month upgrade cycles and their costly and complicated roll-out procedures.
  • Microservices-based architectures enable independent and dynamic scaling of different functional elements of the system with active-active N+k redundancy, which minimizes the hardware resources required to deliver any given service.
  • Software packaged in containers is inherently more portable than VMs and does much to eliminate the problem of complex dependencies between VMs and the underlying infrastructure, which has been a major issue for NFV deployments to date.
  • The cloud-native ecosystem includes some outstandingly useful open source projects, foremost among which is Kubernetes – of which more later.  Other key open source projects in the cloud-native ecosystem include Helm, a Kubernetes application deployment manager, service meshes such as Istio and Linkerd, and telemetry/logging solutions including Prometheus, Fluentd and Grafana.  All of these combine to simplify, accelerate and lower the cost of developing, deploying and operating cloud-native network functions.

5G is the first new generation of mobile technology since the advent of the NFV era, and as such it represents a great opportunity to do NFV right – that is, the cloud-native way.  The 3GPP standards for 5G are designed to promote a cloud-native approach to the 5G core – but they don’t actually guarantee that 5G core products will be recognisably cloud-native.  It’s perfectly possible to build a standards-compliant 5G core that is resolutely legacy in its software architecture, and we believe that some vendors will go down that path.  But some, at least, are stepping up to the plate and building genuinely cloud-native solutions for the 5G core.

Cloud-native today is almost synonymous with containers orchestrated by Kubernetes.  It wasn’t always thus: when we started developing our cloud-native IMS solution in 2012, these technologies were not around.  It’s perfectly possible to build something that is cloud-native in all respects other than running in containers – i.e. dynamically orchestratable stateless microservices running in VMs – and production deployments of our cloud-native IMS have demonstrated many of the benefits that cloud-native brings, particularly with regard to simple, rapid scaling of the system and the automation of lifecycle management operations such as software upgrade.  But there’s no question that building cloud-native systems with containers is far better, not least because you can then take advantage of Kubernetes, and the rich orchestration and management ecosystem around it.

The rise to prominence of Kubernetes is almost unprecedented among open source projects.  Open-sourced by Google in 2014 and reaching version 1.0 as recently as July 2015, Kubernetes became the seed project of the Cloud Native Computing Foundation (CNCF) and rapidly eclipsed all the other container orchestration solutions that were out there at the time.  It is now available in multiple mature distros including Red Hat OpenShift and Pivotal Container Service, and is also offered as a service by all the major public cloud operators.  It’s the only game in town when it comes to deploying and managing cloud-native applications.  And, for the first time, we have a genuinely common platform for running cloud applications across both private and public clouds.  This is hugely helpful to telcos who are starting to explore the possibility of hybrid clouds for NFV.

So what exactly is Kubernetes?  It’s a container orchestration system for automating application deployment, scaling and management.   For those who are familiar with the ETSI NFV architecture, it essentially covers the Virtual Infrastructure Manager (VIM) and VNF Manager (VNFM) roles.

In its VIM role, Kubernetes schedules container-based workloads and manages their network connectivity.  In OpenStack terms, those are covered by Nova and Neutron respectively.  Kubernetes includes a kind of Load Balancer as a Service, making it easy to deploy scale-out microservices.
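
To make the VIM analogy concrete, here is a minimal sketch of what a scale-out microservice looks like when described to Kubernetes: a Deployment that schedules a set of container replicas (the Nova-like role) and a LoadBalancer Service that connects and spreads traffic across them (the Neutron/LBaaS-like role).  The service name, image and port below are hypothetical, chosen purely for illustration.

# Hypothetical "sip-proxy" microservice: Kubernetes schedules the
# replicas across the cluster and load-balances traffic to them.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sip-proxy
spec:
  replicas: 3                  # Kubernetes places these across the cluster
  selector:
    matchLabels:
      app: sip-proxy
  template:
    metadata:
      labels:
        app: sip-proxy
    spec:
      containers:
      - name: sip-proxy
        image: registry.example.com/sip-proxy:1.0   # placeholder image
        ports:
        - containerPort: 5060
---
apiVersion: v1
kind: Service
metadata:
  name: sip-proxy
spec:
  type: LoadBalancer           # built-in load balancing across the replicas
  selector:
    app: sip-proxy
  ports:
  - port: 5060
    targetPort: 5060

In OpenStack terms, applying these two documents with kubectl apply -f does the work of booting instances through Nova and configuring a load balancer through Neutron.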

In its VNFM role, Kubernetes can monitor the health of each container instance and restart any failed instance.  It can also monitor the relative load on a set of container instances that are providing some specific micro-service, and can scale out (or scale in) by spinning up new containers or spinning down existing ones.  In this sense, Kubernetes acts as a Generic VNFM.  For some types of workloads, especially stateful ones such as databases or state stores, Kubernetes’ native lifecycle management functionality is not sufficient.  For those cases, Kubernetes has an extension called the Operator Framework, which provides a means to encapsulate any application-specific lifecycle management logic.  In NFV terms, this amounts to a standardized way of building Specific VNFMs.
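
As a sketch of what this looks like in practice – continuing the hypothetical sip-proxy example above – a liveness probe tells Kubernetes how to detect a failed instance, which it then restarts automatically, while a HorizontalPodAutoscaler scales the Deployment out or in with load.  The health endpoint, port and thresholds here are illustrative assumptions.

# Liveness probe: the kubelet restarts any container whose health
# check stops responding (Generic VNFM healing behaviour).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sip-proxy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: sip-proxy
  template:
    metadata:
      labels:
        app: sip-proxy
    spec:
      containers:
      - name: sip-proxy
        image: registry.example.com/sip-proxy:1.0   # placeholder image
        livenessProbe:
          httpGet:
            path: /healthz          # assumed health endpoint
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
---
# Autoscaler: scale the Deployment between 3 and 20 replicas, aiming
# to keep average CPU utilization at or below 60%.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: sip-proxy
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sip-proxy
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilizationPercentage: 60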

But Kubernetes goes way beyond the simple application lifecycle management envisaged by the ETSI NFV effort.  Kubernetes itself, together with a growing ecosystem of open source projects that surround it, is at the heart of a movement towards a declarative, version-controlled approach to defining both software infrastructure and applications.  The vision here is for all aspects of a complex cloud-native system, including cluster infrastructure and application configuration, to be described in a set of documents that are under version control, typically in a Git repository, which maintains a complete history of every change.  These documents describe the desired state of the system, and a set of software agents act so as to ensure that the actual state of the system is automatically aligned with the desired state.  With the aid of a service mesh such as Istio, changes to system configuration or software version can be automatically “canary” tested on a small proportion of traffic prior to being rolled out fully across the deployment.  If any issues are detected, the change can simply be rolled back.  The high degree of automation and control offered by this kind of approach has enabled Web-scale companies such as Netflix to reduce software release cycles from months to minutes.
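
As an illustration of how declarative canary testing can look, the sketch below uses Istio’s routing resources to send 5% of traffic to a new software version; because these documents live in Git, promoting or rolling back the canary is just a matter of committing a change to the weights.  The service name and version labels are hypothetical.

# Two versioned subsets of the hypothetical sip-proxy service,
# distinguished by a "version" label on their pods.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: sip-proxy
spec:
  host: sip-proxy
  subsets:
  - name: v1              # current release
    labels:
      version: v1
  - name: v2              # canary release
    labels:
      version: v2
---
# Weighted routing: 95% of traffic stays on v1, 5% exercises v2.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: sip-proxy
spec:
  hosts:
  - sip-proxy
  http:
  - route:
    - destination:
        host: sip-proxy
        subset: v1
      weight: 95
    - destination:
        host: sip-proxy
        subset: v2
      weight: 5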

Many of the network operators we talk to have a pretty good understanding of the benefits of cloud native NFV, and the technicalities of containers and Kubernetes.  But we’ve also detected a substantial level of concern about how we get there from here.  “Here” means today’s NFV infrastructure built on a hypervisor-based virtualization environment supporting VNFs deployed as virtual machines, where the VIM is either OpenStack or VMware.  The conventional wisdom seems to be that you run Kubernetes on top of your existing VIM.  And this is certainly possible: you just provision a number of VMs and treat these as hosts for the purposes of installing a Kubernetes cluster.  But then you end up with a two-tier environment in which you have to deploy and orchestrate services across some mix of cloud native network functions in containers and VM-based VNFs, where orchestration is driving some mix of Kubernetes, OpenStack or VMware APIs and where Kubernetes needs to coexist with proprietary VNFMs for life-cycle management.  It doesn’t sound very pretty, and indeed it isn’t.

In our work with cloud-native VNFs, containers and Kubernetes, we’ve seen just how much easier it is to deploy and manage large scale applications using this approach compared with traditional hypervisor-based approaches.  The difference is huge.  We firmly believe that adopting this approach is the key to unlocking the massive potential of NFV to simplify operations and accelerate the pace of innovation in services.  But at the same time, we understand why some network operators would baulk at introducing further complexity into what is already a very complex NFV infrastructure.
That’s why we think the right approach is to level everything up to Kubernetes.  And there’s an emerging open source project that makes that possible: KubeVirt.

KubeVirt provides a way to take an existing Virtual Machine and run it inside a container.  From the point of view of the VM, it thinks it’s running on a hypervisor.  From the point of view of Kubernetes, it sees just another container workload.  So with KubeVirt, you can deploy and manage applications that comprise any arbitrary mix of native container workloads and VM workloads using Kubernetes.
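
As a hedged sketch of what that looks like, the manifest below describes a KubeVirt VirtualMachine whose disk image is packaged and shipped as a container image; kubectl can then create, start and delete it like any other Kubernetes workload.  The API version reflects KubeVirt’s alpha status at the time of writing, and the names and image are placeholders.

# Hypothetical legacy VNF packaged as a KubeVirt VirtualMachine.
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: legacy-vnf
spec:
  running: true                    # KubeVirt keeps the VM powered on
  template:
    metadata:
      labels:
        kubevirt.io/vm: legacy-vnf
    spec:
      domain:
        resources:
          requests:
            memory: 4Gi            # resources requested from Kubernetes
        devices:
          disks:
          - name: rootdisk
            disk:
              bus: virtio          # the guest sees an ordinary virtio disk
      volumes:
      - name: rootdisk
        containerDisk:
          image: registry.example.com/legacy-vnf-disk:1.0   # VM disk image shipped as a container image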

In our view, KubeVirt could open the way to adopting Kubernetes as a “level playing field” and de facto standard environment across all types of cloud infrastructure, supporting highly automated deployment and management of true cloud-native VNFs and legacy VM-based VNFs alike.  The underlying infrastructure can be OpenStack, VMware, bare metal – or any of the main public clouds including Azure, AWS or Google.  This grand unified vision of NFV seems to us to be truly compelling.  We think network operators should ratchet up the pressure on their vendors to deliver genuinely cloud-native, container-based VNFs, and get serious about Kubernetes as an integral part of their NFV infrastructure.  Without any question, that is where the future lies.

VMware: 10 million VMs run on VMware Cloud Provider clouds

At the opening of its annual VMworld 2019 event in San Francisco, VMware is announcing enhancements to its VMware Cloud Provider Platform. The company counts more than 4,300 VMware Cloud Providers in more than 120 countries, operating out of more than 10,000 data centers. These providers include AWS, Azure, Google Cloud and IBM Cloud, along with strategic regional providers with specific geographic, vertical industry, or service expertise.

“VMware’s Cloud Provider strategy is to empower our partners with the flexibility to deliver the industrialized hybrid cloud, built on a VMware software-defined data center, from whatever location the customer chooses,” said Rajeev Bhardwaj, vice president of products, Cloud Provider Software Business Unit, VMware. “Today, more than 10 million VMs run on VMware Cloud Provider clouds. Through our SDDC everywhere cloud provider strategy, VMware and its Cloud Provider Partners help organizations operate more efficiently and create more value by enabling meaningful savings in costs and time spent on day-to-day technology operations.”

At the heart of the VMware Cloud Provider Platform is VMware vCloud Director, an open and extensible cloud service-delivery platform. The latest release, vCloud Director 10, will include the following innovations:

  • Unified View of Hosted Private and Multi-Tenant Clouds: Cloud providers will be able to expand their cloud offerings to include both multi-tenant and private cloud with the natively integrated Centralized Point of Management (CPOM) capability in vCloud Director. The new capability will reduce provider challenges and costs associated with building custom tooling to manage multiple types of cloud endpoints. Cloud providers benefit from a unified view of data center health and VM status across their entire global estate of cloud endpoints.
  • Intelligent Workload Placement for Greater Efficiency: Intelligent workload placement, which is powered by new vCloud Director compute profiles, will enable cloud providers to drive higher efficiency from their cloud infrastructure. Cloud Providers will be able to offer self-service consumption of tiered compute, enforcement of host-based licensing restrictions, and simplified selling based on workload sizes.
  • Advanced Automation: This release of vCloud Director will feature all-round improvements in automation capabilities, including an enhanced Terraform Provider that supports complete compute and network definition as code. VMware Cloud Providers will be able to target developers who want to use open source tooling in their cloud automation.
  • Multi-Cloud Networking: Extensive networking updates for VMware NSX-T are built into this release to prepare for greater support of multi-clouds and container environments, delivered through vCloud Director’s self-service consumption.



Splunk to acquire SignalFx for cloud monitoring

Splunk agreed to acquire SignalFx, a provider of SaaS real-time monitoring and metrics for cloud infrastructure, microservices and applications. The purchase price is approximately $1.05 billion, to be paid approximately 60% in cash and 40% in Splunk common stock.

SignalFx's analytics are built on a massively scalable streaming architecture. The company, which is based in San Mateo, California, is backed by Andreessen Horowitz, Charles River Ventures, General Catalyst, and Tiger Global Management.

Splunk said the acquisition strengthens its position as a leader in observability and APM for organizations at every stage of their cloud journey, from cloud-native apps to homegrown on-premises applications. 

“Data fuels the modern business, and the acquisition of SignalFx squarely puts Splunk in position as a leader in monitoring and observability at massive scale,” said Doug Merritt, President and CEO, Splunk. “SignalFx will support our continued commitment to giving customers one platform that can monitor the entire enterprise application lifecycle. We are also incredibly impressed by the SignalFx team and leadership, whose expertise and professionalism are a strong addition to the Splunk family.”

“By joining Splunk, we will create a powerful monitoring platform - one ready to support CIOs whether they have fully embraced cloud or have existing applications in the data center,” said Karthik Rau, Founder and CEO, SignalFx. “As the world continues to move towards complex, cloud-first architectures, Splunk and SignalFx is the new approach needed to monitor and observe cloud-native infrastructure and applications in real time, whether via logs, metrics or tracing. The SignalFx team is thrilled to join Splunk to help CIOs capitalize upon the modern application portfolio.”

https://www.splunk.com/en_us/newsroom/press-releases/2019/splunk-to-acquire-cloud-monitoring-leader-signalfx.html

Huawei advances its AI with Ascend 910 processor and MindSpore

Huawei officially launched its Ascend 910 AI processor as well as its "MindSpore" AI framework.

The Ascend 910, which is designed for AI model training, delivers 256 TeraFLOPS for half-precision floating point (FP16), and 512 TeraOPS for integer precision calculations (INT8). Its max power consumption is only 310W.  All of these are new industry benchmarks, according to the company.

Huawei claims its MindSpore AI framework is adaptable to all device, edge, and cloud environments. It helps ensure user privacy because it only deals with gradient and model information that has already been processed. It doesn't process the data itself, so private user data can be effectively protected even in cross-scenario environments.

"We have been making steady progress since we announced our AI strategy in October last year," said Eric Xu, Huawei's Rotating Chairman. "Everything is moving forward according to plan, from R&D to product launch. We promised a full-stack, all-scenario AI portfolio. And today we delivered, with the release of Ascend 910 and MindSpore. This also marks a new stage in Huawei's AI strategy."

Xu also outlined ten areas where Huawei wants to drive change for AI:

  1. Provide stronger computing power to increase the speed of complex model training from days and months to minutes – even seconds.
  2. Provide more affordable and abundant computing power. Right now, computing power is both costly and scarce, which limits AI development.
  3. Offer an all-scenario AI portfolio, meeting the different needs of businesses while ensuring that user privacy is well protected. This portfolio will allow AI to be deployed in any scenario, not just public cloud.
  4. Invest in basic AI algorithms. Algorithms of the future should be data-efficient, meaning they can deliver the same results with less data. They should also be energy-efficient, producing the same results with less computing power and less energy.
  5. Use MindSpore and ModelArts to help automate AI development, reducing reliance on human effort.
  6. Continue to improve model algorithms to produce industrial-grade AI that performs well in the real world, not just in tests.
  7. Develop a real-time, closed-loop system for model updates, making sure that enterprise AI applications continue to operate in their most optimal state.
  8. Maximize the value of AI by driving synergy with other technologies like cloud, IoT, edge computing, blockchain, big data, and databases.
  9. With a one-stop development platform of the full-stack AI portfolio, help AI become a basic skill for all application developers and ICT workers. Today only highly-skilled experts can work with AI.
  10. Invest more in an open AI ecosystem and build the next generation of AI talent to meet the growing demand for people with AI capabilities.
At a press event in Shenzhen, Xu also told reporters that the company is working to replace design tools from Cadence and Synopsys, and that being placed on the U.S. entity list will not impact Huawei's AI ambitions.

MEF 3.0: An Ambitious Step for the Communications Industry



MEF Annual Meeting – July 2019: Michael Strople, President of Allstream and Chairman of MEF, shares his thoughts on the value of MEF 3.0 -- the transformational global services framework for defining, delivering, and certifying assured services orchestrated across a global ecosystem of automated networks.

“I’m really excited about where we’ve come with MEF 3.0. Carrier Ethernet 2.0 was a very important achievement, and it launched a US$80 billion industry – the gold standard for networking. MEF 3.0 is an ambitious step beyond that. With MEF 3.0, we build on top of the work in Carrier Ethernet, but we’ve added IP and Optical Transport, and most importantly – over the course of this last year – we’ve defined the first standard for SD-WAN and the software artifacts that go with that (LSO APIs).”

“MEF 3.0 includes certification, specification, the software that goes with it, and it also includes the notion of community – where we don’t have to reinvent what already exists but we’ve taken what exists in the network industry and put it all together.”

To learn more about MEF 3.0, visit: http://www.mef.net/mef30/overview


Download the industry's first SD-WAN standard here: https://click.icptrack.com/icp/relay....

To explore the latest on MEF 3.0 innovations and engage with industry-leading service and technology experts such as Michael Strople, attend MEF19 (http://www.MEF19.com), held 18-22 November 2019 in Los Angeles, California.

MEF 3.0 LSO Sonata Certification for Inter-Provider Service Automation



MEF Annual Meeting – July/Aug 2019: Bob Mandeville, President & Founder of Iometrix, provides an overview of the new MEF 3.0 LSO Sonata certification program for inter-provider service automation.

MEF 3.0 LSO Sonata certification enables buyers and sellers of wholesale MEF 3.0 services to validate that the full suite of APIs they use for inter-provider business transactions complies with MEF standards. The certification program will help companies develop, test, and certify all LSO Sonata APIs as they are released.


In June 2019, MEF launched the pilot MEF 3.0 LSO Sonata certification program with an initial focus on automating ordering of MEF 3.0 Carrier Ethernet Access E-Line services. Simultaneously, MEF introduced LSO Sonata SDK (Software Development Kit) Release 3 with APIs for inter-provider serviceability, product inventory, quoting, and ordering. Together, these important steps will accelerate implementation of standardized LSO Sonata APIs worldwide, driving frictionless inter-provider business processes and faster service delivery across the service provider community.

To learn more about LSO Sonata APIs and the certification program, download this FAQ document from MEF.net. (https://www.mef.net/images/LSO-Sonata-FAQ-August-2019.pdf)

To explore MEF 3.0 inter-provider LSO Sonata API innovations and engage with industry-leading service and technology experts, attend MEF19 (http://www.MEF19.com), held 18-22 November 2019 in Los Angeles, California.