AWS Partner Network (APN) Blog

Customers enhance SAP availability with SUSE solutions on AWS

By Sherry Yu, Global Alliance Director – SUSE
By Soumya Sekhar, Sr. Partner Solutions Architect – AWS
By David Rocha, Sr. Partner Solutions Architect – AWS

SAP applications are the backbone of many enterprises, and SUSE solutions on AWS can enhance their availability. SAP systems run critical business processes across the functional and technical modules that organizations rely on to meet their business goals. That’s why maximizing SAP application availability is a common requirement. In this blog post, you will learn how customers increase the availability of their SAP applications with SUSE innovations on AWS.

SUSE allows customers’ SAP applications and SAP HANA databases to achieve greater availability with SUSE Linux Enterprise Live Patching (SLE Live Patching) and the Fast Dying SAP HANA indexserver HA solution.

Both SLE Live Patching and the Fast Dying SAP HANA indexserver HA solution are developed by SUSE. Each solution requires corresponding packages, patches, and security fixes provided by SUSE. Your AWS systems must be registered with one of SUSE’s supported repository services: SUSE’s Public Cloud Update Infrastructure, SUSE Manager, Repository Mirroring Tool, or SUSE Customer Center. Customers who have purchased SUSE Linux products from AWS connect to SUSE’s Public Cloud Update Infrastructure by default.

SUSE Linux Enterprise Live Patching

SLE Live Patching allows you to update SUSE Linux without restarting your systems. The patch sets are delivered as packages with modified code separate from the main kernel packages. The SLE Live Patching updates include all previous fixes, so you only need to install the latest update. 

SUSE expanded the scope of SLE Live Patching by introducing user space live patching (ULP). SLE Live Patching now offers a framework for live patching user space applications. This technology targets patching shared libraries at runtime.

SLE Live Patching is supported by SAP HANA and provides updates for high-severity security vulnerabilities and bugs that could cause system instability or data loss. SUSE uses a standard rating system, where high-severity vulnerabilities and bugs are rated 7 or higher. A SLE Live Patching patch set cannot be created for every vulnerability or bug fix. Currently, over 95% of qualifying fixes are released in the SLE Live Patching repository.

Why was SLE Live Patching developed?

SUSE, like all Linux operating system vendors, releases patches that include critical security updates and serious bug fixes. DevOps teams and system administrators apply SUSE Linux kernel patches periodically to ensure that systems are not exposed to security vulnerabilities and that they comply with organizational security policies. Often, these critical security updates and serious bug fixes require a reboot, which necessitates activities that go beyond Linux patching.

For mission-critical systems, the following activities are a common practice for DevOps, application teams, and system administrators when applying Linux kernel patches:

  1. Properly shut down the application
  2. Reboot the instances after applying Linux kernel or user space patches
  3. Restart the application
  4. Validate that the application is running correctly, as well as the systems that connect to SAP. 

If a high-severity vulnerability or bug fix applies to your mission-critical systems, you introduce additional risk by continuing to run your environment in an unpatched state. Waiting for a quarterly maintenance window only extends the time your systems run without critical security patches or bug fixes. Although Linux kernel patches are critical, the application unavailability caused by Linux kernel patching often receives pushback from application and business owners.

SLE Live Patching maximizes application availability by providing DevOps teams and system administrators with Linux kernel patch sets for SUSE CVSS level 7+ issues that do not require a reboot. Before we dive deeper, it’s important to point out that the technology behind SLE Live Patching differs between SUSE Linux 12 and SUSE Linux 15. For SUSE Linux version 12, the technology used for SLE Live Patching is kGraft. Starting with SUSE Linux Enterprise Server 15, SUSE uses a new technology called Kernel Live Patching (KLP).
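To make the "no reboot" idea concrete, the following minimal sketch lists the kernel live patches that are currently loaded. It assumes the Linux kernel's livepatch sysfs layout, where each loaded patch appears as a directory under /sys/kernel/livepatch with an `enabled` and a `transition` attribute. The names `LivePatch` and `list_live_patches` are illustrative and are not part of any SUSE tool.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List


@dataclass
class LivePatch:
    """One loaded kernel live patch as seen in sysfs (illustrative type)."""
    name: str
    enabled: bool
    in_transition: bool


def _read_flag(path: Path) -> bool:
    """Return True if the sysfs attribute file contains '1'."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        # Attribute missing or unreadable: treat it as not set.
        return False


def list_live_patches(root: str = "/sys/kernel/livepatch") -> List[LivePatch]:
    """List live patches found under the given sysfs directory.

    Assumption: each patch is a subdirectory holding 'enabled' and
    'transition' attribute files. Returns an empty list if the
    directory does not exist.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    patches = []
    for entry in sorted(base.iterdir()):
        if entry.is_dir():
            patches.append(
                LivePatch(
                    name=entry.name,
                    enabled=_read_flag(entry / "enabled"),
                    in_transition=_read_flag(entry / "transition"),
                )
            )
    return patches


if __name__ == "__main__":
    found = list_live_patches()
    if not found:
        print("No kernel live patches loaded")
    for patch in found:
        print(f"{patch.name}: enabled={patch.enabled} "
              f"in_transition={patch.in_transition}")
```

A patch that is enabled and no longer in transition has been fully applied to the running kernel. For day-to-day operations, the tooling shipped with SLE Live Patching is the better source of truth than a script like this.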

As previously mentioned, SLE Live Patching includes ULP. ULP refers to the process of applying patches to the libraries used by a running process without interrupting the process. Applying an available user space live patch security fix will secure your application services without restarting the processes. Currently, ULP supports glibc and openssl libraries, which are critical dependencies of the SAP HANA database.

SUSE Linux 12 versions will be out of general support on October 31, 2024, so we will focus on SUSE Linux version 15. We strongly encourage customers to upgrade or move to a supported version of SUSE Linux 15. If you are interested in this topic, please read the following AWS blog: The Safe SUSE Upgrade: Avoiding Pitfalls When Upgrading AWS Instances

To activate SLE Live Patching on your system, you need an active subscription for SUSE Linux and for SLE Live Patching. If you purchased SUSE Linux Enterprise Server For SAP Applications (SLES For SAP) 12 SP3 or later from the AWS Marketplace with AWS as the publisher, the launched instances will have SLE Live Patching activated, and the repository will already be configured. For more information about the images and SLE Live Patching, read the SUSE support article, Add Live Patching Repositories to Public Cloud on-demand Instances.

Fast dying SAP HANA indexserver HA solution

Now let’s discuss a recently added SUSE High Availability (HA) scenario for SAP HANA.

The Fast dying SAP HANA indexserver HA (FDSHI) solution is a new SUSE HA scenario that addresses an SAP HANA indexserver service that is overloaded or continuously crashes. An unstable SAP HANA indexserver service often delays failover, which can then take upwards of 90 minutes.

Why was Fast dying SAP HANA indexserver HA solution developed?

SUSE identified two primary root causes for the long failover. The first was a software failure that caused one or more HANA processes, including the HANA indexserver, to be restarted in place by the HANA daemon (hdbdaemon). The second was a hardware error that caused the HANA indexserver (hdbindexserver) to restart locally.

Both of these failures result in HDBDaemon reporting the status of the SAP HANA system as ‘Running but status info unavailable’, which corresponds to a ‘Yellow’ status of the service. The SUSE HA solution is designed to initiate a failover on failed services, that is, on a ‘Red’ condition, so until the new solution was introduced it could not address a ‘Yellow’ condition. SAP has published a knowledge base article outlining the condition: 2431472 – HDBDaemon status on SAP HANA system shows ‘Running but status info unavailable’. Accessing the article requires an active SAP Support Portal user ID.

Monitoring state changes of SAP HANA indexserver

FDSHI resolves the condition in which the HANA indexserver is in the ‘Running but status info unavailable’ state and drastically reduces the resulting downtime. FDSHI uses the SAP HA/DR provider hook method srServiceStateChanged to monitor service states. This hook method handles all status changes of any SAP HANA service. When a status changes, the nameserver service is responsible for updating the status through the SAP HA/DR provider API.
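HA/DR provider hooks are registered in the SAP HANA global.ini file. The sketch below parses an example registration for the hook and reads back its settings. The section name, the path /usr/share/SAPHanaSR, the execution_order value, and the action_on_lost parameter are assumptions based on how SUSE's SAPHanaSR package commonly documents hook registration. Verify them against the documentation shipped with your installed package before use. The helper `load_hook_settings` is hypothetical.

```python
import configparser

# Assumed example of a hook registration in global.ini.
# All section names, keys, and values here must be checked
# against the documentation of your installed SAPHanaSR package.
GLOBAL_INI_SNIPPET = """
[ha_dr_provider_suschksrv]
provider = susChkSrv
path = /usr/share/SAPHanaSR
execution_order = 3
action_on_lost = stop

[trace]
ha_dr_suschksrv = info
"""


def load_hook_settings(ini_text: str) -> dict:
    """Parse the hook registration and return its settings as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["ha_dr_provider_suschksrv"]
    return {
        "provider": section["provider"],
        "path": section["path"],
        "execution_order": section.getint("execution_order"),
        "action_on_lost": section["action_on_lost"],
    }


if __name__ == "__main__":
    for key, value in load_hook_settings(GLOBAL_INI_SNIPPET).items():
        print(f"{key} = {value}")
```

The setting to review most carefully is the one that controls what happens when the indexserver is judged lost, because it determines how quickly the cluster can react.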

SUSE built FDSHI on these status changes. SUSE developed the susChkSrv.py hook script to monitor the state changes reported by the nameserver service. The script is called for srServiceStateChanged() events. It checks the status of the HANA indexserver and compares the current service state to the previous state to determine whether the indexserver service is overloaded or continuously crashing. If susChkSrv.py detects the failed state, the SUSE HA cluster takes action based on the HANA system replication status and the SAPHana resource agent configuration parameters PREFER_SITE_TAKEOVER and AUTOMATED_REGISTER.
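The state comparison described above can be illustrated with a simplified sketch. This is not the susChkSrv.py implementation: the `ServiceEvent` type, its field names, and the decision rule are hypothetical, and the real hook receives its event data from SAP HANA in a different form. The sketch only shows the idea of comparing the previous and current state of the indexserver while the rest of the instance still appears to be running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEvent:
    """Hypothetical, simplified view of one service state change event."""
    service_name: str
    previous_active: bool   # service was active before the change
    current_active: bool    # service is active after the change
    daemon_running: bool    # HANA daemon still reports as running
    database_running: bool  # database still reports as running


def indexserver_lost(event: ServiceEvent) -> bool:
    """Return True if the event looks like an unexpectedly lost indexserver.

    Assumption: during a graceful stop the daemon or the database also
    reports that it is stopping, so only an indexserver that goes down
    while both still look alive is flagged.
    """
    if event.service_name != "indexserver":
        return False
    went_down = event.previous_active and not event.current_active
    return went_down and event.daemon_running and event.database_running


if __name__ == "__main__":
    crash = ServiceEvent("indexserver", True, False, True, True)
    planned = ServiceEvent("indexserver", True, False, False, True)
    print("crash flagged:", indexserver_lost(crash))
    print("planned stop flagged:", indexserver_lost(planned))
```

Separating an unexpected loss from a planned stop matters because only the first case should lead the cluster to act.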

Conclusion

In this blog, we discussed two SUSE innovations. The first innovation was SLE Live Patching. It provides your SAP environment increased availability when applying critical patches to both the SUSE Linux kernel and user space libraries. The second innovation was Fast dying SAP HANA indexserver HA solution. It decreases the failover time for a SUSE HA cluster by identifying an SAP HANA indexserver degraded state and initiating a failover.

Additional Resources

For more information about the SLE Live Patching and the Fast Dying SAP HANA indexserver HA solution, visit the following resources:

SUSE – AWS Partner Spotlight

SUSE is an AWS Advanced Technology Partner, AWS Marketplace Seller, and AWS Competency Partner. A pioneer in open source software, SUSE provides reliable, software-defined infrastructure and application delivery solutions.

Contact SUSE | Partner Overview | AWS Marketplace