AWS Cloud Operations Blog
Achieve testing success with the AWS Application Migration Service for painless and simplified cut-overs
Many customers utilize the AWS Application Migration Service to rehost their environments to AWS. A crucial aspect of testing for successful migration involves identifying dependencies on shared services. Connectivity to Microsoft Active Directory, integrations with other services and backup servers could cause unexpected behaviors during your testing and cut-over.
This blog post will focus on identifying some of the common pitfalls to avoid for a successful migration using the Application Migration Service.
The Application Migration Service uses continuous block-level replication, which makes it easy to rehost (lift and shift) servers to AWS. Block-level replication copies the disks of your source server, along with all the applications and data on them, exactly as is. If the server is exactly the same as it was on premises, why do we need to do any testing?
Even though the replication is byte for byte, the server will run in a different environment, so it is important to test your application. Testing is such an important step that the Application Migration Service prevents a final cutover until you’ve gone through a testing cycle.
AWS Landing Zone
In most of the situations we’ll cover, we assume you have either an AWS Site-to-Site VPN or AWS Direct Connect to at least one on-premises data center. This allows your AWS servers to communicate with your on-premises servers, which may be required for applications that depend on services within the on-premises data center. Please note that this connectivity is not required for the Application Migration Service to function.
What are the server changes in a Rehost / Lift and Shift when using the Application Migration Service?
The Application Migration Service runs a post-launch configuration that automatically sets the instance to use DHCP and installs AWS-specific drivers, among a few other changes. We won’t discuss this process in detail in this blog post. Instead, we’ll focus on the other changes that happen in the environment during a migration.
Connectivity to Live Systems during Testing
When connecting to test instances, it is important to note that test instances have the same name as the source production server; double check that you’re connected to the test instance before making changes or running tests.
Many applications do not run in isolation but instead rely on data flows between systems. For instance, does your solution pick up work from a queue, or process files from an SFTP server? If you put a replica of that system online without a firewall between it and those sources, the production and test instances will each process some of this data. If the replica deletes the queue entries or files once processed, the production system will miss data, creating data inconsistencies. Recovering from this might be time consuming and problematic. Additional difficulties arise if that interface is part of an external system run by a partner, where a joint effort might be required.
It is important to consider how well you know a system. Undocumented application processes expose businesses to additional risk, particularly if some of the surrounding knowledge has left the organization. We recommend using the AWS Application Discovery Service to help with the discovery of network communication patterns.
Dynamic DNS updates
Another issue, typically with Windows servers, is dynamic DNS registration for a server name. By default, when a Windows instance comes online, it registers its new IP in Windows DNS by overwriting the previous server IP address. You can see this configuration inside the Advanced TCP/IP Settings configuration on a Network Interface Card inside Windows:
Figure 1. Register this connection’s address in DNS causes Dynamic DNS updates
Before any migration, the production server has registered itself in Microsoft AD DNS, so DNS correctly routes user requests to the production server.
Figure 2. Shows the DNS configuration for clients before any migration
After starting the Test Instance, DNS updates the IP address to redirect users to the test server instead of the production server.
Figure 3. In this example, after a test migration, the live Production servers’ DNS is overwritten so users now connect to a test instance in AWS
Users and applications will make new client connections to the test instance inside AWS and not the production server. This could result in failed connections if blocked by a firewall, or they might connect, unaware they are now working on a test instance. If this server stores persistent data, you now need to migrate the data into the production version of the application. DNS caching and replication can cause additional problems where some users may connect to the production instance while others connect to the test instance. Without a merge between test and production, neither server will have all user data.
To avoid this, you can override Windows DNS by creating a static DNS entry rather than dynamic and revert to dynamic afterwards. You can also use an AD GPO to disable dynamic DNS registration for the servers in your migration wave, which can then be re-enabled after cutover.
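As a quick way to confirm that a test launch has not overwritten a production record, you could compare what a name resolves to against the address you expect. The following is a minimal sketch and not part of the Application Migration Service. The host name and IP addresses are hypothetical, and the resolver is passed in as a parameter so the logic can be exercised without touching real DNS.

```python
import socket
from typing import Callable, List, Optional


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the IPv4 addresses the local resolver gives for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def dns_points_to_production(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], Optional[List[str]]] = resolve_ipv4,
) -> bool:
    """True only if the name resolves and every address is the production IP."""
    addresses = resolver(hostname) or []
    return bool(addresses) and all(ip == expected_ip for ip in addresses)


# Illustration with a stubbed resolver (hypothetical name and addresses).
# In real use, omit the third argument to query the local DNS resolver.
stub_records = {"app01.corp.example.com": ["10.10.1.25"]}
still_production = dns_points_to_production(
    "app01.corp.example.com", "10.10.1.25", stub_records.get
)
print("record unchanged" if still_production else "record has changed")
```

Because of DNS caching, a check like this only reflects the view of the machine it runs on, so it may be worth running it from more than one client.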
This does not mean you should avoid using DNS names in your applications’ connection strings. DNS names in connection strings simplify cutovers by eliminating the need to find and update IP addresses. You only need to validate that the DNS entry is correctly updated after the cutover. We mention dynamic DNS here only so that you can avoid the problems it might create in your environment during testing.
Sandboxes
A testing sandbox can help to avoid some of the issues we’ve mentioned. A sandbox landing zone is a segregated Amazon Virtual Private Cloud (Amazon VPC) that is often a cut down deployment of your production Amazon VPC inside AWS.
Blocking communication in and out of this sandbox prevents the servers from updating live DNS or processing production data. You might also wish to block internet connectivity to avoid issues with applications that communicate with remote third parties.
There are some catches to this approach that might affect your testing. We explain them next, along with how you might work around them.
Connectivity into the sandbox
Your test instances cannot have any effect on your production systems in a fully isolated sandbox. However, you will also be unable to connect to them to perform your tests and ensure that the workloads are operational. To regain access, you have quite a few options. We’ve put them in order of what we recommend, but all are valid:
- Use AWS Systems Manager Session Manager to open a session on the servers from the AWS Management Console or the AWS CLI.
- Deploy a Client VPN into the Sandbox Amazon VPC to connect.
- Deploy a Bastion host and open this to your specific public IP addresses only. Do not publish on the wider internet.
- Deploy migrated servers into the Sandbox Amazon VPC with a non-overlapping CIDR and peer with your production Amazon VPC. Create security groups which only allow inbound RDP and SSH communication.
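For the last option in the list above, two things are easy to get wrong: the sandbox CIDR overlapping the production CIDR, and security group rules that allow more than RDP and SSH. The following is a minimal sketch of both checks using only the Python standard library. The CIDR ranges and the `(protocol, port)` rule format are assumptions made for illustration and do not reflect how any AWS API returns security group rules.

```python
import ipaddress
from typing import Iterable, List, Tuple

# SSH (22) and RDP (3389) are the only inbound ports we want to allow.
ALLOWED_ADMIN_PORTS = {22, 3389}


def cidrs_overlap(cidr_a: str, cidr_b: str) -> bool:
    """True if the two IPv4 CIDR ranges share any address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def non_admin_rules(rules: Iterable[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Return the inbound rules that are not TCP 22 or TCP 3389."""
    return [
        (protocol, port)
        for protocol, port in rules
        if protocol.lower() != "tcp" or port not in ALLOWED_ADMIN_PORTS
    ]


# Hypothetical values for illustration only.
production_cidr = "10.0.0.0/16"
sandbox_cidr = "10.50.0.0/16"
inbound_rules = [("tcp", 22), ("tcp", 3389), ("tcp", 445)]

print("CIDRs overlap:", cidrs_overlap(production_cidr, sandbox_cidr))
print("Rules to review:", non_admin_rules(inbound_rules))
```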
Active Directory with a sandbox
This is mainly a Windows issue, but Linux servers can also be joined to Active Directory. Many servers will not function correctly if cut off completely from Active Directory. You can still log in using a local account, but server services may require an Active Directory service account to start or function. Ideally, we want to avoid making too many changes to a test instance so that it represents the source environment.
This can be solved by using the Application Migration Service to replicate and provision Active Directory domain controllers inside the sandbox. You should have more than one in your environment, so bring up a pair, ideally the FSMO role holders, into the sandbox. A single DC will take a long time to boot before failing with many errors when it’s unable to communicate with other domain controllers. Domain functionalities are maintained by bringing up a pair, even in the presence of multiple domain replication errors.
Another option is to deploy additional domain controllers in the sandbox by connecting them to the production Active Directory and allowing them to synchronize.
Once the synchronization is complete, remove network connectivity and isolate the sandbox. You can then either clean these up from the production domain manually or, ideally, re-establish connectivity and cleanly demote them after all testing is complete.
The sandbox domain controllers should have completed a recent Active Directory replication to keep passwords and computer object tokens valid.
Testing application integrations inside a sandbox
Once we’ve segregated our test servers, testing integrations with servers outside the segregated Amazon VPC is not possible. This can be a concern, but keep in mind that the migrated server is a copy of your live server, byte for byte. Since the software, configuration, and identity of the migrated server won’t change, the live cutover shouldn’t be a problem.
Firewalls are important to mention because the migrated server’s IP address will change when you cut over, meaning existing rules to allow connectivity will no longer be valid. To reduce this risk, deploy a blank Amazon Elastic Compute Cloud (Amazon EC2) instance in the segregated Amazon VPC for your live cutover. Once the server is deployed, set it up to use the same IP address as your cutover server.
With this test instance, you can use nc (netcat) on Linux or Test-NetConnection in PowerShell on Windows to check TCP/IP and port connectivity.
Linux Example
nc -zv thinkwithwp.com 443
Connection to thinkwithwp.com (x.x.x.x) 443 port [tcp/https] succeeded!
Windows Example
Test-NetConnection -ComputerName thinkwithwp.com -Port 443
ComputerName : thinkwithwp.com
RemoteAddress : x.x.x.x
RemotePort : 443
InterfaceAlias : Ethernet
SourceAddress : x.x.x.x
TcpTestSucceeded : True
This lets you set up firewall rules before the migration and test them to make sure they work as expected without affecting live workloads or datasets. You can then proceed with the live cutover, knowing that there are no connectivity issues.
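If you have many endpoints to check, the same TCP test can be scripted so that it behaves the same way on Linux and Windows. The following is a minimal sketch using only the Python standard library. The endpoint list shown in the comment is hypothetical, and a successful TCP connection only shows that the port is reachable, not that the application behind it is healthy.

```python
import socket
from typing import Callable, Iterable, List, Tuple


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_endpoints(
    endpoints: Iterable[Tuple[str, int]],
    probe: Callable[[str, int], bool] = tcp_port_open,
) -> List[Tuple[str, int, bool]]:
    """Probe each (host, port) pair and return (host, port, reachable)."""
    return [(host, port, probe(host, port)) for host, port in endpoints]


# Example usage (hypothetical hosts, not executed here):
#   for host, port, ok in check_endpoints([("db01.corp.example.com", 1433),
#                                          ("sftp.partner.example.com", 22)]):
#       print(f"{host}:{port} {'reachable' if ok else 'BLOCKED'}")
```

Running this from the blank test instance before the cutover gives you a list of firewall rules that still need to be opened.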
We recommend testing compatibility with third-party services, because your network’s public IP addresses will change once you migrate to AWS. To find the new public IP addresses, think about how network packets move from your Amazon EC2 instance to the internet. Usually, they are the Elastic IP addresses attached to the NAT gateways in your egress Amazon VPC, or the public IP addresses of instances that route directly through an internet gateway.
Changing the firewalls of a third party might take longer, so this should be done before the cut-over. Having them add your new public IP addresses along with your old ones is a simple way to do this. After the cut-over completes and you are sure you don’t need to roll-back, you can ask them to remove your old public IPs.
Automated Testing / Synthetic Testing
Consider implementing basic automated testing for all workloads to validate that a server and service are running successfully. It is beneficial to check logs for keywords to report to your operations teams as well.
As systems have become more complex and interconnected, however, these basic checks catch fewer problems than they used to. For instance, an end user submitting an order on a website might cross more than one of your systems. A synthetic test that submits a test order, verifies the confirmation email, and so validates your end-to-end business process is a more effective testing method. A migration can be a great time to add some of these tests, even if they only cover part of each workflow at first.
The Application Migration Service now supports custom post-launch actions that use AWS Systems Manager documents to run automations on your migrated server at first boot. This allows you to define a set of actions to take, including automated testing. This removes human error while also saving time per server for a faster and smoother switchover process.
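As an illustration of the basic log check mentioned at the start of this section, the following is a minimal sketch that counts keyword hits in log lines. It uses only the Python standard library. The keyword list and the sample log lines are hypothetical, and how you would package a script like this into a post-launch action is not shown here.

```python
import re
from collections import Counter
from typing import Iterable, Sequence

# Hypothetical keywords; choose ones that matter for your application.
DEFAULT_KEYWORDS = ("error", "failed", "timeout")


def count_keywords(
    lines: Iterable[str], keywords: Sequence[str] = DEFAULT_KEYWORDS
) -> Counter:
    """Count case-insensitive occurrences of each keyword across all lines."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    counts: Counter = Counter()
    for line in lines:
        for match in pattern.findall(line):
            counts[match.lower()] += 1
    return counts


# Hypothetical log lines for illustration only.
sample_log = [
    "INFO service started",
    "ERROR database connection failed",
    "WARN Timeout talking to queue",
]
for keyword, hits in sorted(count_keywords(sample_log).items()):
    print(f"{keyword}: {hits}")
```

In practice you would read the lines from the application's log file and report any non-zero counts to your operations team.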
Turn off services on source servers before a cutover
The Application Migration Service will replicate any remaining data before the final cutover. This is done by copying the disks block by block. We recommend stopping all services on the source servers beforehand so that end users, automated processes, and data flows are not making changes while this happens. This ensures that nothing changes after the final replication and allows you to test everything before restarting the services and letting users reconnect to the server.
Backup, monitoring, and other agents
We often deploy software agents on servers to perform tasks like backups and monitoring. When moving to AWS, these agents may no longer be needed, as you can take advantage of cloud-native AWS services to perform the same tasks. For example, we could use Amazon CloudWatch instead of existing monitoring agents, and AWS Backup could take the place of existing backup agents.
Which solutions to use depends on many factors and your business requirements, but it’s important to think about them when planning the migration. Consider the following before deciding to keep your existing on-premise backup solution after the cutover:
- The backups leaving the AWS network will add to your AWS bill as egress networking charges that could be avoided.
- The bandwidth capacity from AWS to your on-premise environment may be less than when the server was on-premise. This could make the backup window much longer or even stop the backup from working.
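Both considerations above come down to simple arithmetic that you can run before deciding. The following is a minimal sketch. The backup size, link speed, efficiency factor, and price per GB are all placeholder assumptions, so substitute your own figures and the current data transfer pricing for your Region.

```python
def backup_window_hours(
    backup_gb: float, link_mbps: float, efficiency: float = 0.8
) -> float:
    """Estimate hours needed to move backup_gb over a link of link_mbps.

    efficiency is the assumed usable fraction of the link (0 < efficiency <= 1).
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    megabits = backup_gb * 1000 * 8  # decimal gigabytes to megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def monthly_egress_cost(
    backup_gb_per_day: float, price_per_gb: float, days: int = 30
) -> float:
    """Estimate the monthly data transfer charge for backups leaving AWS."""
    return backup_gb_per_day * days * price_per_gb


# Placeholder numbers for illustration only, not real pricing.
hours = backup_window_hours(backup_gb=2000, link_mbps=500)
cost = monthly_egress_cost(backup_gb_per_day=200, price_per_gb=0.05)
print(f"Estimated backup window: {hours:.1f} hours")
print(f"Estimated monthly egress cost: {cost:.2f}")
```

If the estimated window is longer than the time you have available each night, that is a strong sign the existing backup design will need to change after the cutover.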
Organization security and compliance
Your company may have security and compliance rules that prevent accounts from deploying Amazon EC2 instances unless they use approved Amazon Machine Images (AMIs). This could interfere with the Application Migration Service, making it difficult to successfully move servers.
During the planning phase of the migration, talk to your security and compliance teams about the migration requirements. This will allow the approvals and exceptions to be made before the migration starts.
Microsoft Active Directory on Amazon EC2 using private static IPs
AWS Directory Service for Microsoft Active Directory simplifies the way you manage and globally scale Microsoft Active Directory infrastructure. But if you prefer to use your own Amazon EC2 instances to run Microsoft Active Directory, you will need to set up static private IPs.
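When choosing static private IPs for domain controllers, keep in mind that AWS reserves the first four addresses and the last address in every subnet, so those cannot be assigned to an instance. The following is a minimal sketch that checks a candidate address against a subnet using only the Python standard library. The subnet and addresses are hypothetical, and the check does not tell you whether the address is already in use.

```python
import ipaddress


def is_assignable_in_subnet(ip: str, subnet_cidr: str) -> bool:
    """True if ip is inside subnet_cidr and not an AWS-reserved address.

    AWS reserves the first four addresses and the last address of a subnet.
    """
    subnet = ipaddress.ip_network(subnet_cidr)
    address = ipaddress.ip_address(ip)
    if address not in subnet:
        return False
    offset = int(address) - int(subnet.network_address)
    return 4 <= offset < subnet.num_addresses - 1


# Hypothetical subnet and candidate addresses for illustration only.
for candidate in ("10.0.1.3", "10.0.1.10", "10.0.1.255"):
    usable = is_assignable_in_subnet(candidate, "10.0.1.0/24")
    print(candidate, "assignable" if usable else "not assignable")
```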
Post-migration optimization
Consider using additional AWS tooling such as AWS Compute Optimizer, AWS Trusted Advisor, and the AWS Well-Architected Tool after the cutover. These tools help to ensure the migrated environment is secure and cost efficient.
Summary
In this post you learned how to reduce migration testing risks. We discussed the benefits of using a sandbox landing zone to segregate test environments from production infrastructure to avoid DNS and AD issues. We also talked about the importance of implementing automated testing for business applications and processes. Finally, we mentioned the potential high costs associated with having migrated servers use on-premise backup infrastructure.
This should get you thinking about your own environments and how to avoid problems when testing migrations.
Next Steps
Have a look at our AWS Application Migration Service Best Practices for additional advice. Good luck with your migration, and if you need additional help, the AWS Marketplace provides a broad Partner Network with migration solutions.