Migrating SD-WAN Appliances to AWS Transit Gateway Connect
Introduction
Since its launch in 2020, AWS Transit Gateway Connect has provided a native way for you to connect third-party SD-WAN appliances to an AWS Transit Gateway. Connect attachments use Generic Routing Encapsulation (GRE) tunnels and Border Gateway Protocol (BGP) to exchange routes between the Transit Gateway and an appliance.
Prior to Transit Gateway Connect, you had two options for integrating your SD-WAN solution into a transit gateway: AWS Site-to-Site VPN attachments and VPC attachments. These architectures are described in Reference Architectures for Implementing SD-WAN Solutions on AWS. While these designs are valid and useful for some scenarios, migrating to Transit Gateway Connect offers additional benefits:
Transit Gateway Connect vs. Site-to-Site VPN
- Higher bandwidth: a single Transit Gateway Connect attachment can support up to 20 Gbps (four Connect peers at up to 5 Gbps each), while a Site-to-Site VPN tunnel only supports up to 1.25 Gbps.
- Higher MTU: Connect attachments support an MTU of 8500 bytes, which is higher than the Site-to-Site VPN MTU (1500 bytes).
- Simplified connectivity: Connect attachments use GRE for encapsulation, removing the IPsec overhead associated with multiple VPN tunnels.
Transit Gateway Connect vs. VPC attachment
- Dynamic routing: Connect attachments use BGP for dynamic routing instead of static routes in VPC and transit gateway route tables, eliminating the need for manual route table updates or custom automation.
- Improved monitoring: Connect attachments are automatically included in AWS Transit Gateway Network Manager, which provides visibility into routing updates and other network events.
- Improved availability: Transit Gateway Connect supports equal-cost multipathing (ECMP) with a 5-tuple hash – protocol number, source IP address, destination IP address, source port number, and destination port number. This allows your traffic to be distributed evenly across multiple appliances, reducing the impact of a single appliance failure compared to the one-appliance-per-AZ approach with VPC attachments.
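To make the 5-tuple behavior concrete, here is a minimal Python sketch of flow-based path selection. The hash that Transit Gateway actually uses is not described in this post, so SHA-256 is only a stand-in, and the function name `pick_tunnel` and all addresses are hypothetical.

```python
import hashlib
from collections import Counter


def pick_tunnel(protocol, src_ip, dst_ip, src_port, dst_port, tunnel_count):
    """Map a flow's 5-tuple to one of tunnel_count equal-cost paths.

    SHA-256 is an illustrative stand-in, not the hash Transit Gateway uses.
    """
    key = f"{protocol}|{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % tunnel_count


# Every packet of one flow hashes to the same tunnel, so packets stay in order.
flow = (6, "10.1.0.10", "172.16.5.20", 40000, 443)
assert pick_tunnel(*flow, tunnel_count=4) == pick_tunnel(*flow, tunnel_count=4)

# Many flows (here, 1,000 different source ports) spread across the tunnels.
spread = Counter(
    pick_tunnel(6, "10.1.0.10", "172.16.5.20", port, 443, tunnel_count=4)
    for port in range(40000, 41000)
)
print(dict(spread))
```

Because the choice is made per flow, a single flow never uses more than one path, but a mix of flows is shared across all of them.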
(Note that Transit Gateway Connect attachments are different from AWS Direct Connect, which enables you to establish dedicated network connections between your network and an AWS Direct Connect location. You can, however, use both Direct Connect and Transit Gateway Connect together, as described in the Simplify SD-WAN connectivity with AWS Transit Gateway Connect blog post.)
In this post, we’ll provide the background and steps needed to migrate these SD-WAN architectures to Transit Gateway Connect, along with considerations to keep in mind during the migration. Before you start, confirm that your appliance meets the requirements for Transit Gateway Connect attachments.
Understanding Transit Gateway Route Table Behavior
Before diving in, it’s important to understand how transit gateway route tables work with Connect attachments. This will impact how you approach the migration process.
- When a Connect attachment is associated with a route table, the transit gateway will advertise the routes within that route table to the appliance.
- Routes advertised from the appliance will only appear in a route table if they are propagated into that route table.
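The two rules above can be sketched as a small in-memory model. This is a toy illustration only, not how the transit gateway is implemented; the class, the function names, and the attachment labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RouteTable:
    name: str
    associations: set = field(default_factory=set)  # attachments that use this table
    propagations: set = field(default_factory=set)  # attachments whose routes are installed
    routes: dict = field(default_factory=dict)      # prefix -> attachment it points to


def advertise(table, attachment, prefix):
    """An attachment announces a prefix; it is installed only if propagation is on."""
    if attachment in table.propagations:
        table.routes[prefix] = attachment
        return True
    return False


def routes_sent_to(table, attachment):
    """Prefixes the transit gateway advertises to an associated Connect attachment."""
    if attachment not in table.associations:
        return []
    return sorted(table.routes)


appliance_rt = RouteTable("appliance", associations={"connect"}, propagations={"spoke-vpc"})
spoke_rt = RouteTable("spoke", associations={"spoke-vpc"}, propagations={"connect"})

advertise(appliance_rt, "spoke-vpc", "10.1.0.0/16")   # installed: propagation is on
advertise(appliance_rt, "connect", "192.168.0.0/16")  # ignored: no propagation here
advertise(spoke_rt, "connect", "192.168.0.0/16")      # installed: propagation is on

print(routes_sent_to(appliance_rt, "connect"))  # ['10.1.0.0/16']
print(sorted(spoke_rt.routes))                  # ['192.168.0.0/16']
```

Association controls what the appliance learns; propagation controls where the appliance's own routes show up.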
When you create a transit gateway, you have the option to select a default route table association for new attachments. You can also choose to propagate routes from new attachments to that default route table automatically. Both options can be changed after creating the transit gateway, if needed. In the examples that follow, default route table association and default route table propagation are disabled. Your transit gateway may be configured differently, so be sure to check your environment in the Amazon VPC console before starting.
It’s also worth taking a moment to review how Transit Gateway evaluates routes and in what order. For this post, there are three takeaways to remember as we walk through the migration steps:
- The most specific route for a destination address is always preferred
- Static routes are preferred over propagated routes
- Transit Gateway Connect propagated routes are preferred over VPN propagated routes
You can look at the transit gateway route tables to see which route the transit gateway prefers.
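As a rough illustration of these three takeaways, the following Python sketch picks a route the same way: longest prefix first, then static over propagated, then Connect over VPN. It models only the rules listed above (the full evaluation order has more steps), and the `Route` type, the ranking table, and the attachment names are assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass

# Lower rank is more preferred. Only the three rules from this post are modeled.
SOURCE_RANK = {"static": 0, "connect": 1, "vpn": 2}


@dataclass(frozen=True)
class Route:
    prefix: str
    source: str       # "static", "connect", or "vpn"
    attachment: str


def select_route(routes, destination):
    """Return the preferred route for a destination address, or None."""
    dest = ipaddress.ip_address(destination)
    candidates = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen, SOURCE_RANK[r.source]),
    )


routes = [
    Route("10.0.0.0/8", "static", "vpc-attachment"),
    Route("10.20.0.0/16", "vpn", "vpn-attachment"),
    Route("10.20.0.0/16", "connect", "connect-attachment"),
]

print(select_route(routes, "10.20.1.5").attachment)  # connect-attachment
print(select_route(routes, "10.99.0.1").attachment)  # vpc-attachment
print(select_route(routes, "192.0.2.1"))             # None
```

The first lookup shows both effects at once: the /16 routes beat the static /8 because they are more specific, and the Connect route beats the VPN route for the same prefix.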
Scenario 1: Migrate a dynamic VPN to Transit Gateway Connect
In this architecture, your SD-WAN appliance sits in a dedicated VPC (“appliance VPC”) and establishes Site-to-Site VPN connections to the transit gateway. The appliance uses BGP to exchange route prefixes with the transit gateway. Depending upon your bandwidth requirements, you may be using ECMP to aggregate multiple VPN tunnels together for higher bandwidth. In the following diagram (Figure 1), we have an existing appliance that we want to migrate to using Transit Gateway Connect.
To begin, create a transit gateway attachment to the VPC with the SD-WAN appliances. If you have unallocated IP space in the VPC, it’s a best practice to create separate subnets for each transit gateway VPC attachment. Once you have attached the VPC, you can create the transit gateway Connect attachment using the previously created VPC attachment as the transport or underlay (Figure 2). Remember, you must also create a route in the appliance VPC route tables for the transit gateway CIDR block, with the transit gateway as the target.
With the Connect attachment in place, you can create the Connect peers (GRE tunnels), specifying the private IP address associated with the ENI attached to the appliance instance. Because your appliance is already using a VPN connection, you can specify the existing peer ASN (eBGP) on the appliance. If you must use the same ASN as the transit gateway (iBGP), you must tear down the existing VPN connections first so you can update the ASN on the appliance. Alternatively, you can deploy new appliances. Once you configure the Connect peers on the transit gateway, you can configure the GRE tunnels on the appliances.
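The steps so far (VPC attachment, Connect attachment, VPC route, Connect peer) could be scripted roughly as follows. This is a sketch under several assumptions: `ec2` is expected to behave like a boto3 EC2 client, every ID, CIDR, and ASN is a placeholder, and the inside CIDR block (a /29 used for BGP addressing, which the API requires but this post does not cover) is an example value. Waiting for each attachment to become available and error handling are left out. The `DryRunEC2` class is a hypothetical stand-in so the sketch runs without AWS access.

```python
def build_connect_path(ec2, tgw_id, vpc_id, subnet_ids, vpc_route_table_id,
                       tgw_cidr, appliance_ip, peer_asn, inside_cidr):
    """Create the VPC attachment, Connect attachment, VPC route, and one Connect peer."""
    vpc_att = ec2.create_transit_gateway_vpc_attachment(
        TransitGatewayId=tgw_id, VpcId=vpc_id, SubnetIds=subnet_ids,
    )["TransitGatewayVpcAttachment"]["TransitGatewayAttachmentId"]

    # The VPC attachment is the transport (underlay) for the Connect attachment.
    connect_att = ec2.create_transit_gateway_connect(
        TransportTransitGatewayAttachmentId=vpc_att,
        Options={"Protocol": "gre"},
    )["TransitGatewayConnect"]["TransitGatewayAttachmentId"]

    # Route from the appliance VPC to the transit gateway CIDR block.
    ec2.create_route(
        RouteTableId=vpc_route_table_id,
        DestinationCidrBlock=tgw_cidr,
        TransitGatewayId=tgw_id,
    )

    peer = ec2.create_transit_gateway_connect_peer(
        TransitGatewayAttachmentId=connect_att,
        PeerAddress=appliance_ip,          # private IP of the appliance ENI
        BgpOptions={"PeerAsn": peer_asn},
        InsideCidrBlocks=[inside_cidr],    # assumption: example /29 for BGP addressing
    )["TransitGatewayConnectPeer"]["TransitGatewayConnectPeerId"]
    return {"vpc_attachment": vpc_att, "connect_attachment": connect_att, "connect_peer": peer}


class DryRunEC2:
    """Hypothetical stand-in client: records each call and returns placeholder IDs."""
    RESPONSES = {
        "create_transit_gateway_vpc_attachment":
            {"TransitGatewayVpcAttachment": {"TransitGatewayAttachmentId": "tgw-attach-vpc-EXAMPLE"}},
        "create_transit_gateway_connect":
            {"TransitGatewayConnect": {"TransitGatewayAttachmentId": "tgw-attach-connect-EXAMPLE"}},
        "create_route": {"Return": True},
        "create_transit_gateway_connect_peer":
            {"TransitGatewayConnectPeer": {"TransitGatewayConnectPeerId": "tgw-connect-peer-EXAMPLE"}},
    }

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        if name not in self.RESPONSES:
            raise AttributeError(name)

        def call(**kwargs):
            self.calls.append((name, kwargs))
            return self.RESPONSES[name]
        return call


client = DryRunEC2()
ids = build_connect_path(
    client, tgw_id="tgw-EXAMPLE", vpc_id="vpc-EXAMPLE", subnet_ids=["subnet-EXAMPLE"],
    vpc_route_table_id="rtb-EXAMPLE", tgw_cidr="192.0.2.0/24",
    appliance_ip="10.0.1.10", peer_asn=65001, inside_cidr="169.254.6.0/29",
)
print([name for name, _ in client.calls])
```

To try this against a real account, you would pass `boto3.client("ec2")` instead of `DryRunEC2()` and add waits between the steps. The GRE and BGP configuration on the appliance itself is vendor-specific and is not shown.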
At this stage, you must configure the Connect peer BGP sessions on the appliance. Figure 3 shows what will happen by default once BGP is activated and you associate the Connect attachment and propagate the routes:
Remember the transit gateway route evaluation order? If you activate propagation from the Connect attachment into the spoke VPC route table, you’ll see the Connect route instead of the VPN route in the transit gateway route table, because the route table shows only the preferred route for a destination CIDR block (in this case, the route through the Connect peer). Associating the Connect attachment with the appliance route table, meanwhile, means the appliance will begin receiving the same routes from both the Connect peers and the VPN tunnels.
In other words, egress traffic from the transit gateway to the appliance will prefer the Connect peer (GRE tunnel). Ingress traffic from the appliance will use either the Connect peers or the VPN, depending upon your appliance’s configuration.
In this scenario, you can use BGP attributes such as local preference or AS_PATH on the appliance to prefer the Connect attachments. This will also reduce the risk of traffic being dropped because of asymmetric routing for appliances performing stateful inspection. Alternatively, you can shut down the VPN tunnels on the appliance. Keep in mind that because your appliance and the transit gateway have different ASNs (eBGP), you must configure ebgp-multihop with a time-to-live (TTL) value of 2 on the appliance. Figure 4 illustrates the resulting traffic flow when you turn on AS_PATH prepending on the routes received from the VPN tunnels:
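The effect of these two attributes can be sketched with a heavily simplified best-path comparison in Python. Real BGP implementations evaluate many more attributes and differ by vendor; this model only compares local preference and AS_PATH length. The `Path` type, the helper names, and the ASN 64512 are placeholders for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    via: str              # "connect" or "vpn"
    as_path: tuple        # AS numbers, nearest neighbor first
    local_pref: int = 100


def prepend(path, asn, times):
    """Return a copy of the path with asn prepended `times` times."""
    return Path(path.via, (asn,) * times + path.as_path, path.local_pref)


def best_path(paths):
    """Highest local preference wins; ties go to the shortest AS_PATH."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))


TGW_ASN = 64512  # placeholder ASN for the transit gateway
vpn = Path("vpn", (TGW_ASN,))
connect = Path("connect", (TGW_ASN,))

# Option 1: prepend on routes received over the VPN tunnels.
print(best_path([prepend(vpn, TGW_ASN, 2), connect]).via)             # connect

# Option 2: raise local preference on routes received over Connect.
print(best_path([vpn, Path("connect", (TGW_ASN,), local_pref=200)]).via)  # connect
```

Without either adjustment the two paths tie on both attributes, which is why the appliance's choice depends on its configuration.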
Once you’ve tested and validated that your traffic is flowing as expected, the migration is complete. You can then safely turn off or remove the VPN tunnels and any BGP sessions associated with them. You won’t see any changes in the transit gateway route tables at this point, because the Connect routes were already the preferred routes. Figure 5 shows the final architecture after the completed migration.
Considerations:
- Make sure to update your appliance configuration if you want to take advantage of the Connect attachment’s higher MTU of 8500 bytes.
- If you need to revert the changes, remove the propagation from the Connect attachment into the transit gateway route table. You can then re-activate the VPN tunnels or readjust your BGP metrics to prioritize the VPN tunnels.
Scenario 2: Migrate an interface/VPC attachment to Transit Gateway Connect
With the VPC attachment option, your appliance VPC is already attached to the transit gateway. Instead of BGP, this option uses the appliance VPC route tables and the transit gateway route tables to route traffic to and from your SD-WAN appliances. Refer to Figure 6 for an example of this architecture.
As before, to begin the migration you must add the transit gateway CIDR block – if you don’t have one already – and then add a route to it in the appliance subnet route tables. Create the transit gateway Connect attachment using the existing VPC attachment to the transit gateway, and configure the Connect peers on the appliance. Figure 7 illustrates the architecture at this stage of the migration process.
If you haven’t configured BGP on the appliance before, you can choose between using the same autonomous system (AS) number for iBGP, or a different AS number for eBGP. If you choose to use iBGP on your Connect peers, you must ensure that downstream routes are advertised from an eBGP peer. Once you associate the transit gateway Connect attachment to the transit gateway route table, the appliance will begin receiving routes via BGP from the Connect peer.
To avoid asymmetric routing, you must turn on propagation from the Connect attachment and remove the existing static routes to the appliance VPC in the transit gateway route table. Once you turn on propagation, the resulting routing behavior depends on the transit gateway route tables and the prefixes advertised from your Connect peers:
- For routes with the same destination prefix, the static routes to the VPC attachment are preferred. Routes propagated from the Connect peer do not appear in the transit gateway route table.
- If you’ve created a summarized static route to the VPC attachment, and the Connect peers advertise more specific prefixes, the more specific Connect prefixes are used instead.
You can look at the transit gateway route tables shown in Figure 8 to see which route the transit gateway prefers.
In our example, the appliance is advertising the same prefix (10.0.0.0/8) as the existing static route, so you must remove that static route from the transit gateway route table. Within the appliance VPC, delete the routes to the individual spoke VPCs in the public subnet route table, and the default route in the transit gateway subnet route table. Once you’ve validated that the transit gateway is receiving your routes as expected and traffic is flowing normally, the migration is complete (Figure 9).
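Before removing anything, it can help to check which Connect-advertised prefixes are currently hidden by static routes. The following Python sketch applies the two rules described earlier in this scenario to lists of prefixes; the function name `classify` and the prefixes other than 10.0.0.0/8 are made up for the example.

```python
import ipaddress


def classify(static_prefixes, connect_prefixes):
    """Describe what happens to each Connect-advertised prefix given the static routes."""
    statics = [ipaddress.ip_network(p) for p in static_prefixes]
    result = {}
    for prefix in connect_prefixes:
        net = ipaddress.ip_network(prefix)
        if net in statics:
            # Same destination prefix: the static route is preferred.
            result[prefix] = "hidden by static route"
        elif any(net.subnet_of(s) for s in statics):
            # More specific than a summarized static route: Connect is used.
            result[prefix] = "preferred (more specific than static)"
        else:
            result[prefix] = "installed"
    return result


report = classify(
    static_prefixes=["10.0.0.0/8"],
    connect_prefixes=["10.0.0.0/8", "10.20.0.0/16", "192.168.0.0/16"],
)
for prefix, status in report.items():
    print(prefix, "->", status)
```

Any prefix reported as hidden points at a static route that must be removed before the transit gateway will use the Connect route for that destination.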
Considerations:
- Transit Gateway Connect peers and VPC attachments support different bandwidths. The maximum bandwidth for a VPC attachment to the transit gateway is 50 Gbps, while each Connect peer (GRE tunnel) supports up to 5 Gbps. If you require higher bandwidth, you can use ECMP and create up to 4 peers per Connect attachment for 20 Gbps total bandwidth, or create additional Connect attachments on the same transit gateway.
- If you need to revert the changes, recreate the static routes in the transit gateway route table and the appliance route table. You can then disassociate the Connect attachment, or turn off the Connect attachment peering connections on your appliances.
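Using the bandwidth figures from the first consideration above (5 Gbps per Connect peer, up to 4 peers per Connect attachment), a quick sizing calculation might look like the following Python sketch. The function name is hypothetical, and the result is an aggregate figure that assumes traffic is spread evenly across peers by ECMP.

```python
import math

PEER_GBPS = 5             # per Connect peer, from the consideration above
PEERS_PER_ATTACHMENT = 4  # from the consideration above


def plan_connect_capacity(target_gbps):
    """Return (peers, attachments) needed for an aggregate bandwidth target."""
    if target_gbps <= 0:
        raise ValueError("target_gbps must be positive")
    peers = math.ceil(target_gbps / PEER_GBPS)
    attachments = math.ceil(peers / PEERS_PER_ATTACHMENT)
    return peers, attachments


print(plan_connect_capacity(12))  # (3, 1)
print(plan_connect_capacity(20))  # (4, 1)
print(plan_connect_capacity(35))  # (7, 2)
```

Because ECMP hashes on the 5-tuple, any single flow still travels over one peer, so these totals apply to the combined traffic of many flows.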
Conclusion
The patterns covered in this post walk through some of the considerations and routing challenges involved in the migration process. As with any migration, you should develop a plan to implement, test, and roll back as necessary during a scheduled maintenance window. To learn more about Transit Gateway Connect support from SD-WAN and Networking partners, visit the Transit Gateway Partners page. If you need help with planning a migration, talk to your AWS Technical Account Manager (for Enterprise Support customers) or Solutions Architect.