Deploying Network Functions on Public Cloud? Challenges Continue for Telcos!

(I have often been asked whether telcos should use public cloud for deploying Network Functions. The answer can’t be a simple Yes or No; it depends on multiple factors. This blog dives deep into the question to find an answer.)

Should telcos choose public cloud (or hyperscalers) for 5G Network Function (NF) deployments?

It’s an interesting question.

Two years back, when hyperscalers thronged to MWC Barcelona (MWC21), everyone talked about how public cloud was going to win over the complex beast named telcos. It looked like an easy win at first glance!

Two years down the line (or two and a half now), the hyperscalers’ momentum hasn’t won many telcos as customers, especially from the network perspective. By network I mean telcos choosing public cloud for deploying their 5G NSA or SA Core, RAN and other network functions. Barring Dish (AWS), AT&T (Azure) and a few more, I don’t see many adopters of public cloud for network workloads (NFV). There are just a handful of examples, and some of them are at the pilot stage or haven’t yet scaled to a nationwide network.

But why aren’t many telcos adopting public cloud for network workloads?

There are a few obvious reasons for not choosing public cloud. Let’s dive deep into them.

  1. Complexity of Network Functions (NFs): Both VNFs and CNFs are complex, stateful applications. A single application may involve hundreds of microservices. They can include databases, message queues, caches and storage in addition to stateless containers or VMs. Typically, these NFs are much larger than the enterprise applications deployed on public cloud platforms. Deploying NFs of this scale on any virtualised or containerised platform is an extremely complex process. Although public cloud offers rich tools for automated deployment, telcos haven’t yet fully committed to DevOps-based automation. They are at an early stage of automation adoption, which complicates not only the deployment of NFs but also their upgrades and the rest of their life cycle.
  2. TCO: Cost, or TCO, is another crucial factor, and its weight builds up slowly with hyperscalers. Public cloud providers use a Pay-As-You-Go model. Running complex and extremely large NFs all the time burns a hole in the pocket. Despite all the discounts and the positioning of ephemeral instances, running heavy workloads on public cloud is expensive. All the benefits of saving CAPEX and of simplicity of use are gone 3-5 years down the line. I believe there’s no TCO benefit in running NFs with hyperscalers for 5 years or beyond. Moreover, with public cloud you are operating entirely on an OPEX model. The benefits of cloud elasticity, with easy scale-out and scale-in, don’t fix the OPEX math either, because these NFs run all the time.
  3. Operations & Management: Operating complex NFs on public cloud brings another layer of complexity on top of managing existing private cloud deployments. Telcos are already running large networks on-prem, including 4G LTE VNFs. Only with 5G and the arrival of cloud native are they opting for public cloud. Proponents might believe in creating a single pane of glass for observability across multi-cloud, but handling resource operations, automation, and security and policy automation along with day-1 and day-2 ops is quite challenging and adds to ongoing OPEX budgets. Moreover, telcos typically lack the skill set to manage and operate a multi-cloud setup efficiently and cost-effectively. Moving large NFs to public cloud could add to that burden.
  4. Vendor Lock-in: It is a known fact by now that it’s easy to move in to public cloud but very hard to leave, should you ever wish to. The reason? Differences in the tech stack. Every cloud provider has its own way of building its virtualisation and containerisation layers. At a high level they may appear the same, but they are quite different underneath. They use different operating systems and libraries to build their platforms. Once your application containers sit on those platform layers, it’s extremely difficult to move them out to another platform that offers a different set of OS and libraries. We normally call this vendor lock-in. Would a telco want to lock its big, fat NFs to a single public cloud provider’s platform? The cost of moving workloads from one cloud to another is also huge, and the migration itself takes weeks or months to execute.
  5. Security/Data Sovereignty: Moving your NF workloads to public cloud could lead to a loss of control, especially over the infra and platform layers. You simply don’t own those layers, as hyperscalers have invested millions or billions to build their own state-of-the-art DCs. It’s a big deal. Moreover, as you move workloads to them, your data, which until now was well guarded within your on-premises boundaries, resides on someone else’s network (still private) and gets processed and stored outside. This can be quite challenging to deal with, even though hyperscalers offer excellent security for their own stack; securing your workloads remains your own responsibility. In addition, there are ingress and egress costs associated with data transfer, which are quite complex to factor in.
  6. Solutions Stitching: It’s important to understand that public cloud doesn’t solve your problems unless you do. Although you get a multitude of ready-to-consume services with a single click, in the end it’s the telco’s responsibility to create a complete solution blueprint out of those services. You then need to manage and stitch together multiple moving parts as part of that blueprint, even on a single public cloud provider. Telcos have always relied on their vendors to help them here, and a lack of skills could lead to the wrong blueprints being built on cloud. Managing these architecture blueprints, and version-controlling architectures over a longer period, is immensely challenging in multi-cloud environments.
  7. Workload Performance: Many telco-grade workloads, including RAN (DU), need to run at specific performance benchmarks for network and compute latency, and it’s not easy to get comparable performance on public cloud platforms. These workloads need various kinds of hardware acceleration and fast data-path techniques (SR-IOV, DPDK) to meet their latency requirements. While public cloud has evolved to a great extent, it can’t always deliver similar performance, and telcos could face challenges in delivering the requisite performance.

While telcos are quite reluctant to adopt public cloud for network workloads, they are certainly embracing it for running their IT workloads, namely BSS and OSS. Public cloud is well suited to these types of IT workloads and gives telcos faster time to market for their offerings, along with the elasticity those services require.

In the end, there’s no clear formula for whether a telco should adopt public cloud for network or IT workloads. It depends on multiple factors, including the size and scale of workloads, current and future traffic growth, and the regional spread of services, and the decision should be aligned with the long-term vision of the telco’s transformation. While some small and medium-sized telcos have quickly adopted public cloud for 5G, large telcos with a wider geographic spread and a large customer base may not have the agility to move to cloud easily. But again, there’s no rule of thumb here.

People often cite the names of early adopters when they debate the subject. But I believe many of those adopters aren’t yet able to run a nationwide network at a decent traffic size. I would tag those names as outliers, not early adopters. It remains to be seen whether telcos can run their NFs on public cloud at scale, across wider geographies and at a decent traffic size, efficiently and cost-effectively.

There was an excellent discussion on the subject a few months back from ABI Research, which debated telco cloud vs public cloud. ABI Research found that running NFs on a telco cloud on your own premises is more cost-effective than running them on public cloud.

When we talk about public cloud, many proponents believe that moving to an OPEX model (Pay-As-You-Go) wins over the CAPEX model of a DC build-out. But again, as per ABI Research, the OPEX operating model of public cloud proved to be costlier than the CAPEX-based model. In other words, building your own private telco cloud and running NFs on it over a period of ten years offers better TCO.
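The arithmetic behind this kind of comparison can be sketched in a few lines of Python. This is only a minimal illustration of the cumulative-cost logic, assuming a one-off CAPEX plus a flat yearly running cost for on-prem and a flat yearly Pay-As-You-Go bill for public cloud. All function names and figures below are hypothetical; they are not taken from the ABI Research paper or from any real telco.

```python
def on_prem_cost(capex, annual_opex, years):
    """Cumulative on-prem cost: one-off build-out plus yearly running cost."""
    return capex + annual_opex * years


def public_cloud_cost(annual_bill, years):
    """Cumulative public cloud cost under a flat Pay-As-You-Go yearly bill."""
    return annual_bill * years


def break_even_year(capex, annual_opex, annual_bill, horizon=10):
    """First year in which on-prem is no more expensive than public cloud.

    Returns None if on-prem never catches up within the horizon.
    """
    for year in range(1, horizon + 1):
        if on_prem_cost(capex, annual_opex, year) <= public_cloud_cost(annual_bill, year):
            return year
    return None


if __name__ == "__main__":
    # Hypothetical figures, in millions of any currency (assumptions only).
    capex, annual_opex, annual_bill = 10, 2, 5
    print("Break-even year:", break_even_year(capex, annual_opex, annual_bill))
    print("10-year on-prem:", on_prem_cost(capex, annual_opex, 10))
    print("10-year public cloud:", public_cloud_cost(annual_bill, 10))
```

With these made-up inputs, public cloud is cheaper for the first three years and on-prem overtakes it from year four. A real comparison would also need discounts, hardware refresh cycles, staffing and data egress charges, which this sketch deliberately leaves out.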

It remains to be seen how telcos adopt public cloud and address the challenges mentioned above. Telcos can still adopt public cloud for onboarding NFs, provided they address those challenges effectively by adopting solutions such as Red Hat OpenShift Container Platform, which offers a no-vendor-lock-in approach for hybrid multi-cloud environments, control over costs, and manageability through a single dashboard for containerised environments across on-prem, public cloud and edge.

(All views are personal, and the author doesn’t endorse or claim credit for any third-party research on the subject. Red Hat OpenShift and the public cloud offerings are owned by their respective organisations.)

The ABI Research whitepaper can be found here.
