
Rafay accelerates SonicWall’s adoption of containers and Amazon EKS

This post was contributed by Carmen Puccio, Principal Solutions Architect, AWS, and Haseeb Budhani, Co-Founder and CEO, Rafay Systems

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service.


Background

SonicWall, a storied technology company, delivers a broad suite of security products to protect enterprises and small and medium businesses (SMBs). Since the company’s founding, SonicWall has offered security products as hardware appliances or as downloadable software. As SonicWall’s customer base began migrating their applications to the cloud, SonicWall made the strategic decision to adopt cloud computing for its backend and management services. Further, SonicWall’s enterprise and service provider customers consume SonicWall offerings within their private cloud environments.

Key requirements

To fully leverage the cloud paradigm and deliver a highly scalable set of cloud-based offerings, SonicWall identified three key requirements:

  1. Core services must be reimplemented as microservices packaged in Docker containers, so that the applications can be easily distributed across multiple cloud regions.
  2. Given SonicWall’s global customer base, the cloud provider must deliver a mature set of managed application services across a number of global locations.
  3. A Kubernetes-based container orchestration and operations solution must automate cluster deployment and application operations in the public cloud, on premises, and in customer environments.

Microservices adoption

Modernizing SonicWall’s portfolio of cybersecurity and management offerings was a task that the company’s engineering and cloud operations teams were eminently qualified for. As part of the modernization exercise, the teams also ensured that the management solution(s) supporting each of the company’s offerings were able to scale across multiple cloud regions and across tens of thousands of customers.

Cloud provider selection

As a mature company, SonicWall intended to partner with an equally mature provider that:

  • Is highly reliable
  • Provides a global compute footprint
  • And delivers a variety of managed services, including:
    • Secure networking
    • Relational and document databases
    • Log management
    • Secure container registry
    • Serverless capabilities

With these prerequisites in mind, SonicWall chose Amazon Web Services (AWS) as its primary cloud provider and quickly began using a number of AWS services.

Kubernetes operations and management service selection

Given SonicWall’s preference for managed services, choosing Amazon Elastic Kubernetes Service (EKS) for containerized applications running in AWS Regions was an easy decision for the team.

AWS makes it easy for customers to run Kubernetes by offering EKS as a managed Kubernetes service. The key benefit of using EKS is that customers can focus on configuring and operating their applications on Kubernetes, while AWS handles the operational responsibilities of installing, operating, and maintaining the Kubernetes control plane. Because EKS is a CNCF-certified, upstream Kubernetes distribution, customers can run the tools and plugins built and maintained by the Kubernetes community.

SonicWall’s cloud operations team was quick to recognize that end-to-end automation would help them get multiple product lines to the cloud faster. The team identified a clear list of requirements for Kubernetes management and operations, and began evaluating off-the-shelf options that provide end-to-end automation for Kubernetes management and operations. SonicWall’s requirements list follows.

Support for data center and customer premises deployment in addition to EKS

With future deployments of their modernized applications expected to take place in partner data centers and possibly on customer premises, the team concluded that support for hybrid environments, along with deployments in third-party networks, was a critical requirement. The team therefore made it a hard requirement for the service to provide a single toolchain for diverse deployments. The service must provide a fully managed Kubernetes offering that can be deployed outside of AWS environments, so operations can deploy EKS clusters in AWS Regions, as well as fully managed upstream Kubernetes clusters in non-AWS environments, from a single pane of glass.

Further, when deploying and operating application infrastructure in a partner or customer network, the service must not require third parties to set up complex VPNs to enable application deployments, nor should the service require third parties to comply with special reverse proxy or load balancing requirements. With applications getting deployed to tens of target clusters over time, the operational intelligence needed to ensure all target clusters are healthy and running the right versions of software, etc., must ideally be incorporated within the service.

Delegated administration and governance

With well over a decade of running SaaS security services, the SonicWall operations team holds a nuanced perspective on governance and delegated administration. The team has learned over the years that giving developers read-only access to production environments reduces mean time to resolution (MTTR) when operational issues are identified. Further, ensuring that the cloud operations team has full visibility and control over not just production environments, but also over developer and pre-production environments ultimately leads to faster feature rollouts and smoother operations.

All delegated administration and access must be tied to an enterprise single sign-on (SSO) system, e.g. an identity provider (IdP) based on the SAML 2.0 specification.
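
As an illustration of the read-only developer access described above, the following is a minimal sketch of the Kubernetes RBAC objects such a policy could translate to. It assumes the management service maps SAML IdP groups onto Kubernetes group names; the function name, role name, resource list, and group name are hypothetical and are not taken from SonicWall's or Rafay's configuration.

```python
READ_ONLY_VERBS = ["get", "list", "watch"]  # no create/update/delete


def read_only_binding(namespace, idp_group, role_name="developer-read-only"):
    """Return a (Role, RoleBinding) pair granting read-only access to one group.

    idp_group is assumed to be the Kubernetes group name that the SSO
    integration derives from the identity provider (hypothetical mapping).
    """
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": namespace},
        "rules": [
            {
                "apiGroups": ["", "apps"],
                "resources": ["pods", "pods/log", "services", "deployments"],
                "verbs": list(READ_ONLY_VERBS),
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": role_name + "-binding", "namespace": namespace},
        "subjects": [
            {
                "kind": "Group",
                "name": idp_group,
                "apiGroup": "rbac.authorization.k8s.io",
            }
        ],
        "roleRef": {
            "kind": "Role",
            "name": role_name,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }
    return role, binding
```

Calling `read_only_binding("production", "sso:developers")` would yield two manifests that can be serialized and applied to a production namespace; write access would be granted through a separate role bound to the operations group.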

Self-service developer sandboxes

Committed to not introducing bottlenecks for developer productivity, SonicWall operations prefers to enable self-service options for developers. Depending on the scenario, developers should be able to deploy to shared or dedicated sandboxes on pre-assigned clusters. Key developers or engineering leads can be assigned cluster provisioning permissions as needed. No prior Kubernetes knowledge should be required to provision, deploy to, or tear down sandboxes or clusters.

Multi-cluster deployment and operations

SonicWall serves customers across the globe, and expects to operate its security and management offerings in multiple AWS Regions and in partner networks. The team’s expectations around operational simplicity dictated that the Kubernetes management and operations solution must be able to distribute and continuously verify operational status of deployments across all target clusters. Instead of building intelligence about available target clusters, which clusters are presently healthy/reachable, etc., into the company’s CI/CD systems, the Kubernetes management and operations service must support application distribution across multiple clusters, with capabilities to ensure that the cluster fleet is running the right versions of the software based on policy.
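
The policy-based version check described above can be sketched as a simple drift detection pass over the fleet. This is a minimal illustration, not Rafay's implementation: the `ClusterState` type, the `find_drift` function, and the shape of the desired-state mapping are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Observed state of one target cluster (hypothetical shape)."""
    name: str
    healthy: bool
    deployed: dict = field(default_factory=dict)  # app name -> running version


def find_drift(desired, clusters):
    """Compare desired app versions against every healthy cluster.

    Returns a list of (cluster, app, running_version, wanted_version) tuples.
    running_version is None when the app is missing from the cluster.
    Unhealthy or unreachable clusters are skipped; a real system would
    report them separately and retry once they recover.
    """
    drift = []
    for cluster in clusters:
        if not cluster.healthy:
            continue
        for app, wanted in desired.items():
            running = cluster.deployed.get(app)
            if running != wanted:
                drift.append((cluster.name, app, running, wanted))
    return drift
```

Keeping this logic in the management service, rather than in each CI/CD pipeline, means a pipeline only has to publish the desired version, and the service reconciles every reachable cluster against it.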

SonicWall’s cloud operations team understands and appreciates that different product teams prefer different CI systems. Whether it’s Jenkins, GitLab, or CircleCI, product teams must have the flexibility to select what they like. The Kubernetes management and operations service must provide turnkey integrations with a variety of CI systems, so DevOps teams don’t have to build and maintain brittle integrations on an ongoing basis.

With deployments taking place in AWS and non-AWS environments, target clusters may be a mix of EKS and non-EKS environments. The service must therefore be able to deploy across multi-distro environments. The service must also support Kubernetes upgrade workflows for both EKS and upstream Kubernetes based clusters, making it easy for the operations team to manage cluster lifecycle from a single pane of glass.

Registry integrations

With clusters being operated in AWS and in partner networks, container registry selection is critical. With SonicWall’s preference for managed services, Amazon Elastic Container Registry (ECR) is the right choice to house dev and production container images. EKS clusters can easily be configured to pull container images from ECR, but if ECR is also the desired registry for Kubernetes clusters operating in non-AWS environments, pull secrets need to be populated and updated periodically in these non-AWS clusters. The lifecycle management of pull secrets must be provided by the Kubernetes management and operations service, making it easy for the operations team to deploy and operate clusters anywhere without worrying about spinning up multiple container registries.
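
To make the pull-secret lifecycle concrete: ECR authorization tokens are short-lived, so a cluster outside AWS needs its image pull secret regenerated on a schedule. The sketch below shows only the step of turning a token into a Kubernetes `kubernetes.io/dockerconfigjson` Secret manifest. It assumes the token has already been fetched (for example, through ECR's GetAuthorizationToken API); the function name, secret name, and registry host are hypothetical, and this is not a description of how Rafay implements the feature.

```python
import base64
import json


def build_ecr_pull_secret(auth_token_b64, registry, name="ecr-pull-secret", namespace="default"):
    """Build an image pull Secret manifest from an ECR authorization token.

    auth_token_b64 is the base64 string returned by ECR, which decodes
    to "AWS:<password>". registry is the ECR registry host name.
    """
    decoded = base64.b64decode(auth_token_b64).decode("utf-8")
    username, password = decoded.split(":", 1)
    docker_config = {
        "auths": {
            registry: {
                "username": username,
                "password": password,
                "auth": auth_token_b64,
            }
        }
    }
    # Secret data values must themselves be base64-encoded.
    encoded = base64.b64encode(json.dumps(docker_config).encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": encoded},
    }
```

A refresh job would call this before the current token expires and apply the resulting manifest to each non-AWS cluster, so that pods referencing the secret in `imagePullSecrets` keep pulling from ECR without manual intervention.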

Secrets Manager integrations

With clusters being operated in AWS and in partner networks, secrets distribution and management is a critical issue that operations must address. Customers have a number of options for secrets storage and management, ranging from AWS Secrets Manager to third-party offerings such as HashiCorp Vault. Providing a turnkey option to distribute secrets to all clusters under management is a key requirement for the Kubernetes management and operations solution, making it easy for the operations team to deploy and operate clusters anywhere without worrying about ongoing auditing of clusters for unused secrets, etc.

Logs and metrics integrations

Multi-level visibility is key for smooth operations of modern applications across regions and clouds. Operations teams must be able to easily track node and container health metrics, whereas DevOps and development teams need easy access to application logs and metrics to ensure that end customers are enjoying the best possible experience.

It’s important to note that logs and metrics for the various components in each cluster are best aggregated in a regional or central repository that is housed outside of the cluster. This way, developers and operations engineers have easy access to relevant information to carry out root cause analysis (RCA) in case of a Kubernetes-related or application issue.

Towards that end, the Kubernetes management and operations service must pool relevant node, cluster, and container metrics into a portal that both operations and development teams can access as needed. The service must also provide easy integrations for applications to leverage managed services for log (Fluentd based) and metrics (Prometheus based) collection.

Flexibility in deployment options

Not all applications are created equal. Some security applications may consistently drive tens of gigabits of traffic per second, whereas other applications may see sporadic traffic based on need. Some applications may have unique requirements such as GPUs, while others may not.

SonicWall operations preferred to have the flexibility to mix and match applications with environments as needed:

  • Programmatically map applications to namespaces within a single cluster. This works well when all applications in question exhibit similar network and compute requirements.
  • Programmatically map applications to node groups in EKS clusters, or labeled nodes in non-EKS clusters. This works well when applications deployed to a cluster have unique hardware requirements, and a node group is set up to meet needs such as GPUs or special network cards.
  • Programmatically map applications to dedicated clusters. This works well when applications exhibit a significant imbalance in throughput needs, etc., where some applications are better off being allocated dedicated clusters.
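
The three placement options above can be sketched as a small planning function. This is an illustrative sketch under stated assumptions: the function name, its parameters, and the non-EKS label key are hypothetical. The `eks.amazonaws.com/nodegroup` label is the one EKS applies to nodes in managed node groups.

```python
EKS_NODEGROUP_LABEL = "eks.amazonaws.com/nodegroup"  # applied by EKS managed node groups
GENERIC_POOL_LABEL = "example.com/node-pool"  # hypothetical label for non-EKS clusters


def plan_placement(app_name, shared_cluster, node_group=None, dedicated_cluster=None, on_eks=True):
    """Decide where an application runs.

    - Default: its own namespace on the shared cluster.
    - node_group set: additionally pin pods to that node group via nodeSelector.
    - dedicated_cluster set: deploy to that cluster instead of the shared one.
    """
    cluster = dedicated_cluster or shared_cluster
    node_selector = {}
    if node_group is not None:
        label = EKS_NODEGROUP_LABEL if on_eks else GENERIC_POOL_LABEL
        node_selector[label] = node_group
    return {
        "cluster": cluster,
        "namespace": app_name,
        "nodeSelector": node_selector,
    }
```

The returned `nodeSelector` mapping can be copied into a pod template as-is. An application with GPU needs would pass `node_group="gpu"`, while a high-throughput application would pass `dedicated_cluster` and leave the node group unset.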

Operational visibility and auditing

As SonicWall operates a fleet of Kubernetes clusters in development, pre-production, and production environments, with multiple applications being deployed across clusters, having a single-pane-of-glass view across the fleet as well as a detailed view into application status and performance is key. Further, all activity must be audited for easy validation and attestation as needed.

Why SonicWall partnered with Rafay

Rafay Systems delivers a turnkey offering that automates Kubernetes cluster management and application operations at scale. The solution offers a deep integration with Amazon EKS so developers and IT users can easily bring up and manage the lifecycle of EKS clusters across AWS Regions. The solution also offers a fully managed Kubernetes service that customers can leverage on premises or at the Edge.

SonicWall’s broad requirements list is typical of mature enterprises running workloads in the cloud and on premises. SonicWall’s experiential expectations for developers and IT operations are also in line with the post-modernization model that many enterprises are adopting. Rafay enables SonicWall and other customers to adopt a self-service model with a high level of visibility and governance.

Operational support

In addition to a differentiated Kubernetes management and operations solution, Rafay also helps companies jumpstart their Kubernetes journey with a deep bench of Kubernetes experts. Because Kubernetes is a relatively new system that is maturing fast with new features and capabilities, companies prefer to partner with vendors such as Rafay, who not only provide a differentiated solution, but also engage directly with DevOps and operations teams to ensure customer success. With Rafay, customers can deliver modern applications to market on time, without waiting multiple quarters or years for internal teams to acquire expertise around Kubernetes and its orbit of technologies.

SonicWall Requirements    Rafay
Turnkey lifecycle mgmt (provisioning, upgrades, etc.) for Amazon EKS    ✔
Turnkey lifecycle mgmt (provisioning, upgrades, etc.) for upstream Kubernetes    ✔
No public IP requirement for Kubernetes API server endpoint    ✔
Role-based access based on enterprise identity (via SAML 2.0 SSO)    ✔
Self-service developer sandboxes    ✔
Multi-cluster (EKS + upstream Kubernetes) deployment and distribution    ✔
Integrations with CI systems, e.g. Jenkins, GitLab and CircleCI    ✔
Integrations with container registries with pull-secret auto-refresh support    ✔
Integrations with secrets managers, e.g. HashiCorp Vault    ✔
Integrations with logs and metrics aggregation systems    ✔
Deployment flexibility (per namespace, per node group, per cluster, multi-cluster)    ✔
Governance and visibility across entire fleet for cloud operations and IT    ✔
Operational support model    ✔

Conclusion

By partnering with AWS and Rafay, SonicWall successfully met its infrastructure provisioning and application roll-out targets. The company achieved a 50% speedup in its delivery timeline by rolling out its solution within three months. Furthermore, SonicWall had expected to hire five additional DevOps engineers to meet these targets. The ease of adopting and deploying the AWS + Rafay solution removed that need and increased developer productivity. Applications are now operational in up to six AWS Regions. Rafay’s operational support team continues to support SonicWall as it serves its global customer base. Together, AWS and Rafay look forward to supporting SonicWall in its transformation goals for years to come. If you’d like to learn more about Rafay, please visit the website or email the team to schedule a demo.