Telco Cloud Security

10/3/2023

Cyber security risks are shifting as telecommunications providers transition to 5G networks. Most of the architectural components of 5G are virtualised across the core and the radio access networks. This includes Network Function Virtualisation (NFV) and containerisation which creates new opportunities in the market but also creates new attack vectors for hackers to exploit.

Protecting against attacks in 5G networks is increasingly complex due to the disaggregation of network functions across the cloud. Telcos are essentially becoming cloud service companies, which is a necessary technology and business capability, but this increases the attack surface and opens up a multitude of new attack vectors as a result of leveraging common web protocols and APIs.

The Telecoms Security Requirements (TSRs)

The UK government’s Telecoms Supply Chain Review Report, published in July 2019, highlighted the security risks as well as the economic opportunities associated with the next generation of telecommunications networks, particularly 5G and full fibre networks. Since the Review was published, the government has put its recommendations into action, developing a new security framework for providers of public electronic communications networks or services through the Telecommunications Security Act 2021, otherwise known as ‘the TSA’.

The new telecoms security framework was developed in collaboration with the National Cyber Security Centre (NCSC), drawing on its technical expertise in cyber security matters relating to the telecoms sector. The accompanying Code of Practice provides guidance for large and medium‑sized public telecoms providers whose security is most crucial to the effective functioning of the UK’s telecoms critical national infrastructure (CNI). The 'technical guidance measures' specified in the Telecommunications Code of Practice are broadly categorised as follows:

Overarching Security Measures
Management Plane
Signalling Plane
Third-Party Supplier Measures
Virtualisation
Monitoring & Analysis
Retaining National Resilience and Capability

Our focus in this article is primarily on the virtualisation layer in the form of Telco Cloud, but before we get into some of the strategies that will help to secure this cloud infrastructure, let's take a closer look at what it actually is.

An Overview of Telco Cloud

This is an industry term used to describe the private cloud infrastructure upon which 5G is built and significantly changes how the network is operated. Telco Cloud is not something that you can buy off the shelf – it’s not a single product or technology, but a collection of cloud technologies including compute, storage, networks, orchestration and so on. As the name suggests, the Telco Cloud is focused on the implementation, standardisation, and use of cloud technologies specifically in Telco environments. This reliance on cloud technology provides significant advantages:

Agility and flexibility
On-demand scalability
High levels of automation
Rapid innovation and go-to-market
Flexible consumption models

These are great benefits but, as mentioned, the move to cloud increases the attack surface and opens up a multitude of new attack vectors. However, this is nothing new in the world of cloud computing. Indeed, cloud security architecture has been tried and tested over many years with a multitude of security controls. These are not new ideas; they are just new to the telecoms industry. This is good news, as the heavy lifting has already been done in other industries such as finance, retail and manufacturing that have been embracing digital transformation for years.

So cloud technologies of any flavour, whether that is public cloud, private cloud, hybrid cloud or telco cloud, all have security risks. It goes with the territory and that risk has to be managed. To secure 5G networks, your most important preparation revolves around design and implementation choices for new technologies, including virtualisation, containerisation, and orchestration. As mentioned, these technology capabilities, while new to telecom networks, are well-proven in other industries. We can learn from this, and apply this to the telecoms world - specifically, to the Telco Cloud. In the next section, we’ll take a closer look at how we can mitigate many of the cyber security risks in the Telco Cloud.

Threat Mitigation Strategies

To mitigate the risks of the myriad of threats, we need security controls. These include physical, technical and administrative controls that act as safeguards or countermeasures prescribed for a system, to protect the confidentiality, integrity, and availability of systems and their information.

The aim of these controls is to mitigate risk to an acceptable level. In this article, we'll be taking a look at the following areas and controls, but this is by no means an exhaustive list:

Supply Chain Security
Secure Multi-Tenancy
Zero Trust Architecture
Identity and Access Management
Privileged Access Management
Network Micro-Segmentation
Protecting the Management Plane
Orchestration and Automation
Infrastructure Hardening
Patching
Firmware Updates
Encryption at Rest
Encryption in Transit
Secure Inter-NF Communications
Security Information and Event Management
CNF Security

Where to Start?

Prior to deploying the telco cloud, the organisational security policy needs to be updated to incorporate the specific requirements of telco cloud in terms of people, process and technology that is consistent with any regulatory or legal requirements for the organisation. In the case of the telecoms industry in the UK, we have the Telecommunications Security Act 2021 and the Telecoms Security Requirements and there are some specific milestones for compliance as we discussed in the previous article.

Overall, the security policy spells out the rules, expectations, and overall approach that an organisation uses to maintain confidentiality, integrity, availability, authenticity, and non-repudiation of its data.

Supply Chain Security

5G networks depend on a vast and distributed supply chain that includes hardware, software and services. Let's take hardware as an example, and in this scenario, a server, because all software packages, whether they are virtualised or containerised, are hosted on some type of server hardware.

Components are sourced from many vendors, that may be distributed around the world, and then brought together in a factory, where the server is assembled. From here the server is tested, then packaged and shipped. It may go through customs and then onward to distributors until it is finally delivered to the customer site, perhaps a data centre in a telecoms company. As we can see, the server hardware passes through a number of touchpoints in its journey to the customer and we need to ensure that the server has not been tampered with in any way during that journey. We need to ensure the integrity of the server hardware. This is the essence of a secure supply chain. There are many initiatives that help to secure the supply chain and indeed, vendors may well implement their own version of a secure supply chain.

An example of one feature that can be implemented to increase security in the supply chain, specific to this example of a server, is Secure Boot. This is a security standard developed by members of the computer industry to help make sure that a device boots using only software that is trusted by the Original Equipment Manufacturer (OEM). We’ll discuss this later in the article.

Secure Multi-Tenancy

A fundamental concept of telco cloud is that of the tenant. A tenant is a logically isolated construct representing a customer, department or network function or service used to deploy workloads. Multi-tenant Telcos must ensure that each tenant is fully secure from attacks, breaches, or insecure communications from other tenants.

The Telco Cloud infrastructure can be configured into a secure multi-tenant environment across network functions through the following means:

Micro-segmentation with granular access controls for provider and tenant administrators.
Identity and access management with role-based access control based on least privileged access.
Tenant-level operations management and visibility.

Many of the features of micro-segmentation, identity and access management, role-based access and so on depend on the vendor solution being implemented, but we'll take a closer look at the architecture and principles in this article.

Zero Trust Architecture

Zero trust is the term for an evolving set of cybersecurity paradigms that move defences from static, network based perimeters to focus on users, assets, and resources. Zero trust assumes there is no implicit trust granted to assets or user accounts based solely on their physical or network location or based on asset ownership.

Authentication and authorisation for both users and devices are discrete functions performed before a session to a specific resource is established. Essentially, zero trust focuses on protecting resources (assets, services, workflows, network accounts, etc.), not network segments, as the network location is no longer seen as the prime component to the security posture of the resource. Zero trust involves the following principles:

Authentication and Authorisation: All transactions are authenticated and authorised based on multiple factors including user identity, device identity, location, access pattern, and more.
Least Privileged Access: The authorisation method ensures that every session is given the least possible privilege. In other words, just enough to complete the task required.
Design for Breach: Networks are designed to contain breaches with segmentation, firewalls, and intrusion detection at every interface point.

Many of the security controls discussed below are components of the Zero Trust Architecture.
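The zero trust principles above can be illustrated with a small sketch. This is a simplified model under stated assumptions, not a reference implementation: the `AccessRequest` type, its field names and the scope strings are all hypothetical. The point it tries to show is that the decision uses identity, device state and entitlement, and that network location is deliberately not an input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical request record; the field names are illustrative only."""
    user_authenticated: bool    # e.g. the user completed MFA
    device_trusted: bool        # e.g. the device certificate was verified
    requested_scope: str        # what the session wants to do
    granted_scopes: frozenset   # what the user is entitled to do


def authorise(req: AccessRequest) -> bool:
    """Evaluate every request; source network is intentionally not considered."""
    if not (req.user_authenticated and req.device_trusted):
        return False
    # Least privilege: only the exact scope requested is checked and granted.
    return req.requested_scope in req.granted_scopes


entitled = frozenset({"vim:read"})
print(authorise(AccessRequest(True, True, "vim:read", entitled)))    # True
print(authorise(AccessRequest(True, False, "vim:read", entitled)))   # False
print(authorise(AccessRequest(True, True, "vim:write", entitled)))   # False
```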

Identity and Access Management

Identity and access management (IAM) is the practice of making sure that users or services have the right level of access to resources such as systems, applications, network resources etc. User roles and access privileges are defined and managed through an IAM system. When granting system access to individuals, we need to apply the principle of least privilege, and only grant a user the system privileges they actually need to complete their tasks.

Identity services support Role Based Access Control (RBAC), where users belong to a specific group based on their role. When a user attempts to access a service, the identity service evaluates the policy rule associated with the requested resource against the user’s groups and roles to determine whether access is allowed.
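As a rough illustration of the RBAC check described above, the sketch below maps roles to permissions and denies anything not explicitly granted. The role names, resources and the `ROLE_PERMISSIONS` table are assumptions made for the example; in practice this data would live in the IAM system of the chosen vendor.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission mapping, expressed as (resource, action) pairs.
ROLE_PERMISSIONS = {
    "vim-admin": {("vim", "read"), ("vim", "write")},
    "noc-operator": {("vim", "read"), ("alarms", "read")},
}


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)


def is_allowed(user: User, resource: str, action: str) -> bool:
    """Deny by default; allow only if one of the user's roles grants the permission."""
    return any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in user.roles
    )


operator = User("alice", {"noc-operator"})
print(is_allowed(operator, "vim", "read"))    # True
print(is_allowed(operator, "vim", "write"))   # False
```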

Privileged Access Management

Privileged credentials serve as the keys to an organization's IT kingdom, providing access to sensitive data and critical systems. However, these credentials are highly sought-after by external attackers and malicious insiders who attempt to gain direct access to the heart of the enterprise. As a result, an organization's security and protection of its sensitive data is only as strong as its privileged credentials. To authenticate users and systems to privileged accounts, most organizations utilize a mix of privileged credentials such as passwords, API keys, certificates, tokens, and SSH keys. To maintain their security, all of these credentials must be securely stored, rotated, and additionally authenticated for each use with multifactor authentication (MFA). If left unsecured, attackers can easily obtain these valuable secrets and credentials, leading to the compromise of privileged accounts, the advancement of attacks, or the exfiltration of data.

Privileged Access Management (PAM) is a set of processes and technologies used to manage and secure privileged access to sensitive resources within an organization. Privileged access refers to access to systems, applications, and data that is granted to users with elevated permissions or privileges, such as system administrators, network administrators, and database administrators.

The primary goal of PAM is to ensure that only authorized users can access sensitive resources, and that those users are using their privileged access in a controlled and monitored manner.
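To make the rotation and MFA requirements more concrete, here is a minimal sketch of a credential check-out rule. The 90-day rotation limit, the `PrivilegedCredential` type and the `may_check_out` function are hypothetical; a real PAM product would also handle secure storage, session recording and auditing.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_CREDENTIAL_AGE = timedelta(days=90)  # hypothetical rotation policy


@dataclass
class PrivilegedCredential:
    account: str
    last_rotated: datetime


def may_check_out(cred: PrivilegedCredential, mfa_passed: bool, now: datetime) -> bool:
    """Release a credential only after MFA, and only if it is not overdue for rotation."""
    if not mfa_passed:
        return False
    return now - cred.last_rotated <= MAX_CREDENTIAL_AGE


now = datetime(2024, 6, 1)
fresh = PrivilegedCredential("root@host-1", datetime(2024, 5, 1))
stale = PrivilegedCredential("root@host-2", datetime(2023, 1, 1))
print(may_check_out(fresh, True, now))   # True
print(may_check_out(fresh, False, now))  # False
print(may_check_out(stale, True, now))   # False
```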

Network Micro-Segmentation

Network micro-segmentation, especially of critical components, limits the potential reach of a hacking incident and limits the lateral movement in the event of a breach. The technology underpinning 5G networks makes segmentation both easier and harder. But why is this?

Segmentation in cloud-native networks is easier because software-defined networks allow any granularity of segmentation through configuration, and can flexibly be adapted to evolving functional requirements. Firewalls are also important as they provide another layer of technical control. It is also harder to implement because new connections between network elements arise from virtualisation and/or containerisation whereas in earlier network generations, the only connection between elements was a network cable. Today, different network functions typically reside as containers in the same cluster and the velocity and volume of CNF updates on the 5G network have increased significantly and are more complex due to the disaggregation.

For segmentation to be effective in a containerised deployment, the CNFs need the appropriate security configuration to prevent hackers from being able to escape to other containers and the underlying infrastructure. As mentioned, virtual firewalls can be combined with micro-segmentation and dynamic security policies to separate and protect all types of network traffic, virtual machines, containerised applications, and workloads. The ability to segment the network in this way restricts lateral movement in the event of a breach.
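One way to picture micro-segmentation is as a default-deny allow-list of flows between segments. The sketch below is illustrative only: the segment names, ports and the `ALLOWED_FLOWS` table are made up for the example, and a real deployment would express the same idea through the network policy or virtual firewall features of the platform in use.

```python
# Hypothetical allow-list of (source segment, destination segment, destination port).
ALLOWED_FLOWS = {
    ("cnf-signalling", "cnf-session", 443),
    ("management", "cnf-signalling", 22),
}


def flow_permitted(src_segment: str, dst_segment: str, dst_port: int) -> bool:
    """Default deny: a flow is allowed only if it is explicitly listed."""
    return (src_segment, dst_segment, dst_port) in ALLOWED_FLOWS


# A compromised workload in one segment cannot reach the others laterally.
print(flow_permitted("cnf-signalling", "cnf-session", 443))  # True
print(flow_permitted("cnf-session", "management", 22))       # False
```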

Protecting the Management Plane

Protecting the management plane of the virtualisation layer is critical and this is a big focus of the Telecoms Security Requirements (TSRs) we discussed in the previous article. We need to design the architecture so that the management elements, including the administrative network and the VIM, can be isolated from other aspects of the virtual infrastructure.

Management plane components, such as the Virtualised Infrastructure Manager (VIM) can reside in a trusted segment or security zone that is segmented from the virtualisation layer. Virtual firewalls, micro-segmentation, and security groups protect the trusted domain. The management interfaces use secure, encrypted channels to administer virtual network functions or VNFs, the VMs and the hypervisor. Advanced security policies and rules can be applied at the virtualisation layer boundary to further control access to the management plane.

Orchestration and Automation Layer

Telco cloud orchestration and automation is important for efficient NFV lifecycle management, including onboarding, instantiation, configuration, scaling, updating, healing and monitoring.

However, this also expands the attack surface, exposing vulnerabilities such as improper isolation and insecure API implementations. These lead to risks such as unauthorised user and function-level access, excessive data exposure, and broken object level authorisation, or indeed any of the top 10 API vulnerabilities identified by OWASP.

Several steps can be taken to mitigate risks in orchestration and automation, including:

A strong secure API strategy. This could include regular audits and pen testing on APIs and the apps using APIs.
Segmentation of the Orchestration and Automation layer.
Implementation of RBAC, to authorise access to the Orchestration and Automation layer.
Security monitoring of the Orchestration and Automation layer to detect any behavioural anomalies, helping to mitigate risks such as data exfiltration.
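Broken object level authorisation is easiest to see in code. The sketch below assumes a hypothetical orchestration API that looks up a service instance by ID; the `ServiceInstance` type, the `INSTANCES` table and the tenant names are invented for illustration. The key step is that the handler checks ownership of the specific object rather than trusting that an authenticated caller may read any ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceInstance:
    instance_id: str
    tenant: str


# Hypothetical inventory held by the orchestration layer.
INSTANCES = {
    "ns-1": ServiceInstance("ns-1", "tenant-a"),
    "ns-2": ServiceInstance("ns-2", "tenant-b"),
}


def get_instance(caller_tenant: str, instance_id: str) -> ServiceInstance:
    """Return the instance only if it belongs to the caller's tenant."""
    inst = INSTANCES.get(instance_id)
    # Same error for "missing" and "not yours", so IDs cannot be enumerated.
    if inst is None or inst.tenant != caller_tenant:
        raise PermissionError("instance not found or access denied")
    return inst


print(get_instance("tenant-a", "ns-1").instance_id)  # ns-1
```

Calling `get_instance("tenant-a", "ns-2")` raises `PermissionError`, even though the caller is authenticated and the object exists.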

Infrastructure Hardening

We must also consider the end-to-end security posture of the underlying physical infrastructure, including compute, storage and network components. These systems will require their own security hardening practices based on the vendor's guidance.

This typically involves disabling all unused services and ports, changing default root passwords and so on. For example, some servers have a chipset that acts as a silicon root of trust and includes a cryptographic hash embedded in the silicon hardware at the chip fabrication facility. The hardware determines whether to run the firmware and does so only if it matches the hash that is permanently stored in the chipset silicon. Some server hardware vendors embed the validation signatures in the ASIC, which means that they are burned into the ASIC and therefore the silicon root of trust protects the server from any firmware attacks all the way through production, shipping, distribution, and the entire supply chain process.

As mentioned earlier, a feature called Secure Boot ensures that a device boots using only software that is trusted by the Original Equipment Manufacturer (OEM). Secure boot leverages the chain of trust rooted in the silicon root of trust to ensure that every component in the boot chain is trusted. Secure boot is intended to prevent boot-sector malware or kernel code injection. It ensures that each component launched during the boot process is digitally signed and that the signature is validated against a set of trusted certificates loaded in the UEFI BIOS.

Unified Extensible Firmware Interface (UEFI) is a specification for a software program that connects a computer's firmware to its operating system (OS). UEFI is expected to eventually replace basic input/output system (BIOS) but is compatible with it.
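The chain-of-trust idea behind Secure Boot can be sketched as follows. This is a deliberately simplified model: it compares SHA-256 digests against a trusted list, whereas real Secure Boot validates digital signatures against certificates held in UEFI. The stage names and the `verify_boot_chain` function are hypothetical.

```python
import hashlib


def digest(blob: bytes) -> str:
    """SHA-256 fingerprint of a boot component."""
    return hashlib.sha256(blob).hexdigest()


def verify_boot_chain(stages, trusted_digests) -> bool:
    """Check each stage in boot order; stop at the first untrusted component.

    stages: list of (name, bytes) in boot order.
    trusted_digests: dict mapping stage name to its expected digest.
    """
    for name, blob in stages:
        if digest(blob) != trusted_digests.get(name):
            return False
    return True


# Illustrative components; the byte strings stand in for real firmware images.
good_chain = [("firmware", b"fw-v1"), ("bootloader", b"bl-v1"), ("kernel", b"k-v1")]
trusted = {name: digest(blob) for name, blob in good_chain}

tampered = [("firmware", b"fw-v1"), ("bootloader", b"bl-EVIL"), ("kernel", b"k-v1")]
print(verify_boot_chain(good_chain, trusted))  # True
print(verify_boot_chain(tampered, trusted))    # False
```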

Patching

Your operating procedures should incorporate a process for learning about new vulnerabilities and security updates. Hardware and software vendors typically announce the existence of vulnerabilities, and could offer workarounds and patches to address these.

Regular patching and hardening remain critical in shortening the time window during which network elements are exploitable through newly discovered security vulnerabilities. Cloud-native 5G makes patching both easier and harder compared to earlier network generations. Easier, because network elements can be rebuilt on the fly through an automated CI/CD pipeline, each time including the latest patches and configuration settings. Harder, because virtually all patching must be done through these automated processes due to the larger scale and complexity of 5G compared to earlier generations.

For cloud-native networks, it is best practice to maintain “golden” master images with the latest patches and configuration settings, and to base all containers on these images. A packer automatically builds network elements by combining a golden image with element-specific software, and stores the images in a repository. Automated deployment processes use these artifacts to deploy any number of required network elements, each including the latest security baseline.
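A simple check that could sit in such a pipeline is shown below: it flags network elements that were not built from the current golden image, so they can be rebuilt. The image reference, the element names and the `images_needing_rebuild` function are assumptions for illustration, not part of any particular tool.

```python
# Hypothetical reference to the current golden base image.
GOLDEN_IMAGE = "registry.example.internal/golden/base:2024-05"


def images_needing_rebuild(deployed: dict) -> list:
    """Return the elements whose base image differs from the golden image.

    deployed maps a network element name to the base image it was built from.
    """
    return sorted(name for name, base in deployed.items() if base != GOLDEN_IMAGE)


deployed = {
    "cnf-a": "registry.example.internal/golden/base:2024-05",
    "cnf-b": "registry.example.internal/golden/base:2024-03",
}
print(images_needing_rebuild(deployed))  # ['cnf-b']
```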

Firmware Updates

Physical servers use complex firmware to enable and operate server hardware and lights-out management cards, which can have their own security vulnerabilities, potentially allowing system access and interruption. To address these, hardware vendors will issue firmware updates, which are installed separately from operating system updates. You will need an operational security process that retrieves, tests, and implements these updates on a regular schedule, noting that firmware updates often require a reboot of physical hosts to become effective.

Node and Workload Attestation

This is a means of certifying the identity of a server or workload against a central attestation authority. The unique identifier for the server could be an identity that is immutable and burned into the hardware, such as a hardware root of trust or trusted platform module. Node attestation can help in remote boot attestation for cloud workloads such as is provided by the CNCF Keylime project. Remote boot attestation can prevent a virtual machine from booting if the underlying server is not attested. In addition, the integrity of workloads can be continuously monitored using the Linux Integrity Measurement Architecture (IMA). The workload attestation can be revoked if the IMA indicates that the workload signature has changed from what has been registered in the framework.

An attested server is trusted and can be allowed to run authorised workloads. But how can workloads that are provided by different vendors, and running on different hardware platforms trust each other? Workload attestation allows software entities interacting in complex service-based architectures to each have a unique, verifiable identity; and to use this identity to be able to trust one another. The CNCF SPIFFE/SPIRE framework is one that enables this trust mechanism.

Node attestation helps in mitigating the risk of rogue elements being introduced into the network and the threats arising from running workloads obtained from many different vendors on the telco infrastructure. Node and workload attestation helps in mitigating against Day 0 threats. If all new hardware and software elements that are to be introduced into a data centre are meticulously fingerprinted before being deployed, there is little opportunity for rogue hardware or software to be introduced.

Network Attestation

The objective of network attestation is to prevent a server (or a virtual workload) from attaching to the network, sending or receiving traffic, or even being assigned an IP address until it has passed an authentication and authorisation mechanism.

The IEEE 802.1X standard is one such mechanism that provides port based network access control at Layer 2. The authenticator is a process on the switch that blocks the connected port until the server or workload (known as the supplicant) authenticates via an exchange of keys.

Network attestation helps in mitigating the risk of rogue elements being introduced into the network and the threats arising from running workloads obtained from many different vendors on the telco infrastructure.
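The port-blocking behaviour described for 802.1X can be modelled with a toy class. This is not an EAP implementation; the `SwitchPort` class, the credential store and the supplicant names are hypothetical, and the dictionary simply stands in for an authentication server. It only illustrates that no traffic is forwarded until the supplicant has authenticated.

```python
import hmac


class SwitchPort:
    """Toy model of a controlled port: closed until the supplicant authenticates."""

    def __init__(self, known_credentials: dict):
        # Stand-in for the authentication server's credential store.
        self._known = known_credentials
        self.authorised = False

    def authenticate(self, supplicant_id: str, credential: bytes) -> bool:
        expected = self._known.get(supplicant_id)
        # Constant-time comparison avoids leaking information through timing.
        self.authorised = expected is not None and hmac.compare_digest(expected, credential)
        return self.authorised

    def forward(self, frame: bytes):
        """Drop frames (return None) while the port is unauthorised."""
        return frame if self.authorised else None


port = SwitchPort({"server-01": b"s3cret"})
print(port.forward(b"hello"))                      # None
print(port.authenticate("server-01", b"s3cret"))   # True
print(port.forward(b"hello"))                      # b'hello'
```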

Encryption at Rest

Cloud systems that are based on Virtual Machines or containers use volume storage. As a result, volume encryption is critical to protect the data on VMs and physical storage media from being accessed, leaked or stolen by attackers. Encrypting volume data and volume backups can help mitigate these risks and provides defence-in-depth to volume-hosting platforms.

Storage device encryption requires that all data being written to the device be encrypted with a secret key and decrypted while being read. Encryption of storage devices offers several benefits:

It mitigates the fallout of theft. Although theft may still cause monetary pain, the loss of confidential information (much greater risk) is mitigated if the storage devices are encrypted.
Data cannot be extracted without knowledge of the encryption key.

There are several types of mechanisms by which encryption can be enforced:

Software-based encryption

The encryption and decryption are performed by a software layer before writing and after reading data from a storage device.

Hardware-based encryption

In the case of hardware-based encryption, the encryption and decryption are performed by a dedicated hardware device before writing and after reading data.

Encryption in Transit

Secure methods of communication over a network are becoming increasingly important and in many cases, mandated. Encryption is typically used to provide additional confidentiality and integrity of data as it transits a network.

Traffic in the Telco Cloud is encrypted in transit using Transport Layer Security (TLS) with an industry-standard Advanced Encryption Standard (AES) cipher. TLS is a set of industry-standard cryptographic protocols used for encrypting information that is exchanged over the network. AES-256 is AES used with a 256-bit key, and is one of the ciphers that TLS can use to protect data in transmission.

The UK NCSC provides guidance on encryption of data in transit which states that you should be sufficiently confident that:

Data is protected in transit between your end user device(s) and the service.
Data is protected in transit as it flows between internal components within the service.
Data is protected in transit where exposed to other external services, such as via an API.

TLS versions 1.3 and 1.2 are the recommended versions for deployment. TLS versions 1.1 and 1.0 have been formally deprecated by the IETF. TLS 1.3 introduced some major changes in that it uses different cipher suite definitions to earlier TLS versions, and has different configuration options.
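As an example of enforcing these version recommendations in software, the sketch below uses Python's standard `ssl` module to build a client context that refuses anything older than TLS 1.2. The function name `make_client_context` is an assumption for the example; the `ssl` calls themselves are part of the standard library.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side TLS context that rejects TLS 1.1 and earlier."""
    # The default context already verifies certificates and hostnames.
    ctx = ssl.create_default_context()
    # Explicitly pin the floor to TLS 1.2; TLS 1.3 is negotiated when available.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
print(ctx.minimum_version == ssl.TLSVersion.TLSv1_2)  # True
print(ctx.verify_mode == ssl.CERT_REQUIRED)           # True
```

The resulting context can be passed to `ctx.wrap_socket(...)` or to HTTP clients that accept an `SSLContext`.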

Secure Inter-NF Communications

For secure inter-NF (Network Function) communications within the 5G core, Service Based Architecture defined by 3GPP specifies authentication, authorisation, and encryption of API calls between the 5G core Network Functions. The authentication and transport security using encryption can be provided by TLS 1.2 or 1.3 as discussed above. Token-based authorisation using OAuth 2.0 can be used for authorisation of NFs. 3GPP has also taken steps in enhancing the security for the external API communication by introducing security features and security mechanisms for the common API framework (CAPIF).
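The token-based authorisation step can be sketched as a check on the claims carried by an access token. This example assumes the token's signature has already been verified elsewhere, and the claim values, the audience and the scope name are hypothetical. It only shows the expiry, audience and scope checks that a receiving NF might apply.

```python
def token_authorises(claims: dict, audience: str, required_scope: str, now: float) -> bool:
    """Check expiry, audience and scope on already signature-verified token claims."""
    if claims.get("exp", 0) <= now:
        return False  # expired, or no expiry present
    if claims.get("aud") != audience:
        return False  # token was issued for a different NF
    # OAuth 2.0 scopes are carried as a space-delimited string.
    return required_scope in claims.get("scope", "").split()


claims = {"aud": "nf-producer-1", "scope": "example-service", "exp": 2000.0}
print(token_authorises(claims, "nf-producer-1", "example-service", now=1000.0))  # True
print(token_authorises(claims, "nf-producer-1", "example-service", now=3000.0))  # False
print(token_authorises(claims, "nf-producer-2", "example-service", now=1000.0))  # False
```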

Security Information and Event Management

Security Information and Event Management (SIEM) systems ingest data from various hardware and software touchpoints such as the server hardware logs, IDS/firewall logs, and physical security system logs. They also perform constant real-time analytics to detect ongoing or attempted attacks.

SIEM systems have the advantage of a library of ready made rules that are tuned to detect different types of attacks; they also come with features such as compliance and audit reports that can help with regulatory reporting.
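To give a flavour of what such a detection rule looks like, here is a minimal sketch that raises an alert when one source produces several failed logins within a short window. The event format, the threshold of three and the 60-second window are assumptions for the example; commercial SIEM rules are considerably richer.

```python
from collections import defaultdict, deque


def detect_bruteforce(events, threshold=3, window=60):
    """Return (timestamp, source) alerts for repeated failed logins.

    events: iterable of (timestamp_seconds, source, outcome), sorted by time.
    An alert fires when `threshold` failures from one source fall within
    `window` seconds of each other.
    """
    recent = defaultdict(deque)  # source -> timestamps of recent failures
    alerts = []
    for ts, source, outcome in events:
        if outcome != "fail":
            continue
        failures = recent[source]
        failures.append(ts)
        # Discard failures that have aged out of the sliding window.
        while failures and ts - failures[0] > window:
            failures.popleft()
        if len(failures) >= threshold:
            alerts.append((ts, source))
    return alerts


events = [
    (0, "10.0.0.5", "fail"),
    (10, "10.0.0.5", "fail"),
    (15, "10.0.0.9", "fail"),
    (20, "10.0.0.5", "fail"),
    (200, "10.0.0.5", "fail"),
]
print(detect_bruteforce(events))  # [(20, '10.0.0.5')]
```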

Consideration should be given to integrating an operations management suite for monitoring and remediation of the NFVI and VNFs. The platform should provide continuous, context-aware visibility over service provisioning, workload migrations, auto-scaling, elastic networking, and network-sliced multi-tenancy across VNFs, hosts, clusters, and sites. Alerts can flag configuration and compliance gaps and security vulnerabilities.

The management suite can profile and monitor traffic segments, types, and destinations to recommend security rules and policies for traffic. It can also identify violations of security policies or vulnerable configurations and traffic routes.

CNF Security

Whilst containerisation has many benefits, it does come with some challenges including:

Inflow of vulnerable source code
Large attack surface
Noisy neighbouring containers
Container breakout to host
Network based attacks
Bypassing isolation
Ecosystem complexity

When deploying containerised network functions (CNFs), we need to consider how to secure the container lifecycle. This involves securing the CNFs as they move through the CI/CD pipeline, which we’ll discuss in more detail in a future article on DevSecOps. In the meantime, here are a few points to consider:

Implement a trusted container image registry with role-based access control and vulnerability scanning.
Automate security patching of containers.
Isolate, protect, and monitor the communications of CNFs and microservices.
Enforce policies governing CNF connectivity.
Protect your CNF supply chain by establishing end-to-end security from code provenance to CNFs running in production.
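Some of the points above can be enforced automatically at deployment time. The sketch below is a hypothetical admission check that only admits images from a trusted registry with no critical findings from the vulnerability scanner. The registry prefix, the `admit_image` function and the scan result format are all assumptions for illustration.

```python
TRUSTED_REGISTRY = "registry.example.internal/"  # hypothetical trusted registry


def admit_image(image_ref: str, critical_vulnerabilities: int) -> bool:
    """Admit a CNF image only if it is from the trusted registry and scans clean."""
    if not image_ref.startswith(TRUSTED_REGISTRY):
        return False  # unknown provenance
    return critical_vulnerabilities == 0


print(admit_image("registry.example.internal/cnf/app:1.4", 0))  # True
print(admit_image("registry.example.internal/cnf/app:1.3", 2))  # False
print(admit_image("docker.io/someone/app:latest", 0))           # False
```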

Conclusion

We've covered a lot of ground in this article, but this isn't an exhaustive list of threat mitigation strategies and controls for Telco Cloud. There is a lot more to discuss! Indeed, we have primarily focused on administrative and technical security controls; however, there are many other considerations in the broader security architecture. What is clear is that security is intrinsic to the Telco Cloud. It should be secure by design and integrated in every layer of the architecture so that security is programmable, automated and context aware.
