AWS Database Blog

Troubleshoot network connectivity to Amazon RDS Custom databases using VPC Reachability Analyzer

Amazon Virtual Private Cloud (Amazon VPC) enables you to provision a logically isolated section of the AWS Cloud where AWS resources such as Amazon Relational Database Service (Amazon RDS) Custom DB instances can be launched in a virtual network you define. When creating an Amazon RDS Custom DB instance, you have the option to create a new VPC or select the default VPC for provisioning the DB instance.

One common scenario is to set up a VPC that contains an RDS Custom DB instance and a corresponding Amazon Elastic Compute Cloud (Amazon EC2) instance. When you provision an RDS Custom DB instance, a VPC with incompatible settings results in an Incompatible-create or Incompatible-network issue. In this scenario, the instance isn't created, and it becomes difficult to trace and fix the issue at the VPC level. In this post, we guide you through the initial diagnosis using Reachability Analyzer, either manually through the AWS Management Console or through a script. We also demonstrate how to validate the VPC setup requirements for a successful DB instance creation.

Solution overview

Reachability Analyzer is a feature in Amazon VPC that can help you check network reachability between a source and destination resource on AWS. When the destination is reachable, Reachability Analyzer produces hop-by-hop details of the virtual network path between the source and the destination. When the destination isn’t reachable, Reachability Analyzer identifies the blocking or missing configurations.

The following are the high-level steps to troubleshoot connectivity issues for your RDS Custom DB instances:

  1. Find the subnet and VPC security group associated with the RDS Custom database endpoint.
  2. Create an EC2 instance in each of the subnets associated with the subnet group from Step 1.
  3. Check the IP address of the target AWS service.
  4. Use Reachability Analyzer to create a network path between a source resource (for example, an EC2 instance created with associated subnets to the database endpoint) and a destination resource (for example, the IP address of the endpoint service).
  5. Run the network path analysis and review the results to understand or troubleshoot network reachability.

The following VPC endpoints are required for your DB instance to communicate with dependent AWS services:

com.amazonaws.region.ec2messages
com.amazonaws.region.events
com.amazonaws.region.logs
com.amazonaws.region.monitoring
com.amazonaws.region.s3
com.amazonaws.region.secretsmanager
com.amazonaws.region.ssm
com.amazonaws.region.ssmmessages

The RDS Custom for SQL Server DB instance needs to communicate with two additional AWS services:

com.amazonaws.region.sqs
com.amazonaws.region.ec2
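
As a rough way to compare the two lists above against a VPC, the following sketch prints the required service names for an engine and reports which of them are absent from a list of existing endpoint service names. The function names `required_endpoints` and `missing_endpoints` are illustrative and not part of the original post; the engine values follow the script's Oracle and SQLServer options.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not from the original script.

# Print the required VPC endpoint service names, one per line.
# $1 = region (for example us-east-1), $2 = engine (Oracle or SQLServer)
required_endpoints() {
  local region=$1 engine=$2 svc
  local services=(ec2messages events logs monitoring s3 secretsmanager ssm ssmmessages)
  if [ "$engine" == "SQLServer" ]; then
    # RDS Custom for SQL Server needs two additional services
    services+=(sqs ec2)
  fi
  for svc in "${services[@]}"; do
    echo "com.amazonaws.$region.$svc"
  done
}

# Read existing service names on stdin (whitespace separated) and print
# every required service name that is not among them.
missing_endpoints() {
  local region=$1 engine=$2 existing name
  existing=$(tr -s '[:space:]' '\n')
  required_endpoints "$region" "$engine" | while read -r name; do
    if ! grep -qxF "$name" <<< "$existing"; then
      echo "$name"
    fi
  done
}
```

One possible use is to pipe in the service names that already exist in the VPC, for example `aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=<vpc-id> --query 'VpcEndpoints[].ServiceName' --output text | missing_endpoints us-east-1 Oracle`. An empty result only means the endpoints exist; it does not prove they are reachable, which is what Reachability Analyzer checks.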

In this post, the manual walkthrough checks connectivity for only one of the endpoints. You can repeat the same steps for the other endpoints. The script at the end of the post runs the check automatically for all of the endpoints.

Prerequisites

For this exercise, we use the subnet group to identify the subnets associated with your RDS Custom database, and then test the network path from each subnet to the VPC endpoint by using the network interface. Verify that the IAM user or IAM role that you are using has the required AWS Identity and Access Management (IAM) permissions for Reachability Analyzer.

Find the subnet group associated with the RDS Custom database

To find your subnet group, complete the following steps:

  1. On the Amazon RDS console, navigate to your database.
  2. On the Connectivity & Security tab, locate the subnet group.
    Subnet group for RDS Custom for Oracle
  3. Navigate to the subnet group details page and take note of the subnet IDs associated with the subnet group.
    Subnets from the Subnet Group
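
The same information can also be read with the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is configured and the DB instance is visible to `describe-db-instances`; the function names and the instance identifier you pass in are placeholders, not values from the post.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Print the subnet IDs of the DB subnet group used by a DB instance, one per line.
# $1 = DB instance identifier
db_subnet_ids() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].DBSubnetGroup.Subnets[].SubnetIdentifier' \
    --output text | tr -s '[:space:]' '\n'
}

# Print the VPC security group IDs attached to a DB instance, one per line.
# $1 = DB instance identifier
db_security_group_ids() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].VpcSecurityGroups[].VpcSecurityGroupId' \
    --output text | tr -s '[:space:]' '\n'
}
```

For example, `db_subnet_ids <db-instance-identifier>` prints the subnet IDs to note down for the next step.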

Create the EC2 instance for the subnets associated with the subnet group

If you already have an existing EC2 instance with subnets and a VPC security group that you intend to use for your RDS Custom database, you can skip this step. To provision a new EC2 instance, follow the Amazon EC2 launch documentation.

While provisioning the EC2 instances, make sure you use all the subnet IDs associated with the RDS Custom subnet group and the VPC security groups that you intend to use for the RDS Custom instance.

Check the IP address of the target AWS service

From the EC2 server, use the following command to find the IP address:

nslookup <service-endpoint>

The following screenshot shows our sample output.
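
If you want to capture the address in a variable instead of reading it from the output, a small filter like the one below may help. It is a sketch that assumes the common Linux nslookup layout, where the resolver's own address is printed as `Address:` followed by a tab and answers are printed as `Address: ` followed by a space; the function names are illustrative.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Build the public hostname of a service from its short name and region.
# $1 = short service name (for example ssm), $2 = region
endpoint_hostname() {
  echo "$1.$2.amazonaws.com"
}

# Read nslookup output on stdin and print the first resolved address.
# Requiring a space after "Address:" skips the resolver line, which
# usually uses a tab and a "#53" suffix.
first_resolved_ip() {
  awk '/^Address: / { print $2; exit }'
}
```

A possible invocation is `nslookup "$(endpoint_hostname ssm us-east-1)" | first_resolved_ip`.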

Specify the source and destination in Reachability Analyzer

Reachability Analyzer allows you to specify various source and destination resources between which you can analyze network reachability. To specify your source and destination, complete the following steps:

  1. On the Amazon VPC console, choose Network Manager in the navigation pane.
  2. Under Monitoring and troubleshooting in the navigation pane, choose Reachability Analyzer.
  3. Choose Create and analyze path.
  4. For Source type, choose Instances and enter the instance ID that you created earlier.
  5. For Destination type, choose IP Address and enter the IP address returned by nslookup in the previous section.
  6. Provide the remaining details and choose Create and analyze path.

Multi-AZ peer host communication analysis (SQL Server only)

If you are creating an RDS Custom for SQL Server DB instance with a Multi-AZ deployment, the DB instance hosts that take part in replication need to communicate on port 1120. You can validate Multi-AZ replication reachability using Reachability Analyzer. To do so, specify one of the EC2 instances created in Step 2 as the source and another EC2 instance created in Step 2 as the destination. Set the protocol to TCP and the destination port to 1120. Repeat this process between all the ENIs created for each of the subnets in the subnet group.
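
Because every pair of hosts has to be checked, it can help to generate the commands rather than type them. The following sketch only prints one `create-network-insights-path` command per ordered pair of instance IDs and runs nothing; the function name is illustrative, and each created path would still need a `start-network-insights-analysis` call afterwards.

```shell
#!/bin/bash
# Sketch only: prints commands for review instead of running them.

# Print one path-creation command per ordered pair of instance IDs,
# so port 1120 is checked in both directions.
# $@ = instance IDs, one per subnet in the subnet group
print_peer_checks() {
  local src dst
  for src in "$@"; do
    for dst in "$@"; do
      # An instance does not need a path to itself
      [ "$src" == "$dst" ] && continue
      echo "aws ec2 create-network-insights-path --source $src --destination $dst --protocol tcp --destination-port 1120"
    done
  done
}
```

For three instances this prints six commands, which you can review and then run one at a time.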

Review the result of the analysis task

To view your successful network connection on the Amazon VPC console, navigate to Reachability Analyzer. Choose the path ID and the correct analysis ID.

The following screenshot shows the analysis explorer page.

In case of failure, Reachability Analyzer identifies the blocking component. You have detailed visibility into the causes of the connectivity failure, which allows for quicker troubleshooting and resolving the issue. The following screenshot shows the connection failure in the flow.
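
The result can also be read from the AWS CLI. The sketch below polls an analysis until it is no longer running, with a bounded number of attempts, and lists the explanation codes of an analysis that found no path. The function names and default values are assumptions for illustration. A status of succeeded only means the analysis finished; whether a path exists is reported separately in NetworkPathFound.

```shell
#!/bin/bash
# Sketch only: function names and defaults are hypothetical.

# Poll an analysis until it leaves the "running" state or attempts run out.
# $1 = analysis ID, $2 = max attempts (default 30), $3 = seconds between polls (default 10)
wait_for_analysis() {
  local nia_id=$1 attempts=${2:-30} delay=${3:-10} status i
  for ((i = 0; i < attempts; i++)); do
    status=$(aws ec2 describe-network-insights-analyses \
      --network-insights-analysis-ids "$nia_id" \
      --query 'NetworkInsightsAnalyses[0].Status' --output text)
    if [ "$status" != "running" ]; then
      echo "$status"
      return 0
    fi
    sleep "$delay"
  done
  echo "timeout"
  return 1
}

# Print the explanation codes recorded for an analysis.
# $1 = analysis ID
explanation_codes() {
  aws ec2 describe-network-insights-analyses \
    --network-insights-analysis-ids "$1" \
    --query 'NetworkInsightsAnalyses[0].Explanations[].ExplanationCode' --output text
}
```

A typical sequence would be `wait_for_analysis <analysis-id>` followed by `explanation_codes <analysis-id>` when no path was found.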

You can also automate this analysis using the following script, which supports both RDS Custom for Oracle and RDS Custom for SQL Server. You need the AWS Command Line Interface (AWS CLI) v2 installed and configured on the EC2 host that you set up earlier.

Deploy the solution using Bash script

  1. Copy the following script into a new text file.
  2. Make the file executable using chmod (for example, chmod +x <FILENAME>.sh).
  3. Run the script as shown in the example.
/<DIRECTORY-LOCATION-OF-FILE>/<FILENAME>.sh <ENGINE_NAME>
Example: /home/ec2-user/VPCAnalyzer.sh Oracle (Options: Oracle or SQLServer)

Code

#!/bin/bash
#-----------------------------------------------------------------------------------------
show_usage() {
echo ""
echo "Script Usage : $0 ENGINE_NAME"
echo ""
echo " ENGINE_NAME : Provide engine name either Oracle or SQLServer"
echo ""
echo "script execution failed argument(S) missing"
echo ""
exit 1
}

#------------------------------- Input validation ---------------------------------------

# Exactly one argument (the engine name) is required
if [ $# -eq 1 ]; then
Engine=$1
else
show_usage
fi

HOME_DIR=` echo $HOME `
mkdir -p $HOME_DIR/logs
LOGFILE=$HOME_DIR/logs/VPC_Analysis_for_RDSCustom_$$.log
exec 1> >( tee "${LOGFILE}" ) 2>&1
mkdir -p $HOME_DIR/successful_logs
mkdir -p $HOME_DIR/failed_logs
region=$(sudo aws configure get region)
Success=()
fail=()
niaids=()
niaidf=()

# Stop early with a non-zero exit code when no region is configured
if [ -z "$region" ]
then
echo "Error: Couldn't determine the region information using 'sudo aws configure get region'"
exit 1
fi

echo "$(date +\"%F\ %T\") * running as $(whoami)" >> $HOME_DIR/logs/bootstrap_$$.log
Engine=$1

if [ "$Engine" == "Oracle" ]; then
vpc_endpoints=("ec2messages.$region.amazonaws.com" "events.$region.amazonaws.com" "logs.$region.amazonaws.com" "monitoring.$region.amazonaws.com" "s3.$region.amazonaws.com" "secretsmanager.$region.amazonaws.com" "ssm.$region.amazonaws.com" "ssmmessages.$region.amazonaws.com")
elif [ "$Engine" == "SQLServer" ]; then
vpc_endpoints=("sqs.$region.amazonaws.com" "ec2.$region.amazonaws.com" "ec2messages.$region.amazonaws.com" "events.$region.amazonaws.com" "logs.$region.amazonaws.com" "monitoring.$region.amazonaws.com" "s3.$region.amazonaws.com" "secretsmanager.$region.amazonaws.com" "ssm.$region.amazonaws.com" "ssmmessages.$region.amazonaws.com")
else
echo "Missing or Incorrect argument"
show_usage
fi

# ec2-metadata prints "instance-id: i-..."; split on ": " to drop the leading space
INSTANCE_ID=$(ec2-metadata -i | awk -F ": " '{print $2}')
ip_addresses=()

for vpc_endpoint in "${vpc_endpoints[@]}"; do
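while [ -z "$ip" ]; do
ip=$(nslookup "$vpc_endpoint" | awk '/^Address: / { print $2 }'| head -1)
# Retry the DNS lookup instead of creating a path with an empty destination
if [ -z "$ip" ]; then
echo "Could not resolve $vpc_endpoint, retrying in 5 sec..."
sleep 5
continue
fi
echo "{\"DestinationAddress\":\"$ip\",\"DestinationPortRange\":{\"FromPort\":443,\"ToPort\":443}}" > $HOME_DIR/source-filter_$$.json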
ni_path_id=`aws ec2 create-network-insights-path --source $INSTANCE_ID --filter-at-source file://$HOME_DIR/source-filter_$$.json --protocol TCP --query 'NetworkInsightsPath.NetworkInsightsPathId' --no-cli-pager --output text`
nia_id=`aws ec2 start-network-insights-analysis --network-insights-path-id $ni_path_id --query 'NetworkInsightsAnalysis.NetworkInsightsAnalysisId' --no-cli-pager --output text`
echo -e "Analyzing for the Service: $vpc_endpoint"
echo -e "Network Insights Analysis ID : $nia_id"
STATUS=""
while [ "$STATUS" == "" ]; do
echo -e "Analysis in progress, waiting for 10 sec... "
sleep 10
STATUS=`aws ec2 describe-network-insights-analyses --network-insights-analysis-ids $nia_id --query 'NetworkInsightsAnalyses[].NetworkPathFound' --no-cli-pager --output text`
done

echo -e "Network Insights Analysis Status : $STATUS"
if [ "$STATUS" == "True" ]; then
echo -e "\nFound path to the destination"
aws ec2 describe-network-insights-analyses --network-insights-analysis-ids $nia_id --query '[ NetworkInsightsAnalyses[*].ForwardPathComponents[*].SequenceNumber, NetworkInsightsAnalyses[*].ForwardPathComponents[*].Component[].Id ]' --no-cli-pager --output table >> $HOME_DIR/successful_logs/Network-Analysis-Successful-ForwardPath-$Engine-$nia_id-$vpc_endpoint.log
aws ec2 describe-network-insights-analyses --network-insights-analysis-ids $nia_id --query '[ NetworkInsightsAnalyses[*].ReturnPathComponents[*].SequenceNumber, NetworkInsightsAnalyses[*].ReturnPathComponents[*].Component[].Id ]' --no-cli-pager --output table >> $HOME_DIR/successful_logs/Network-Analysis-Successful-ReturnPath-$Engine-$nia_id-$vpc_endpoint.log
Success+=($vpc_endpoint":"$nia_id)
else
echo -e "\nNo Path Found to the destination"
aws ec2 describe-network-insights-analyses --network-insights-analysis-ids $nia_id --query 'NetworkInsightsAnalyses[*].Explanations[*]' --no-cli-pager --output table >> $HOME_DIR/failed_logs/Network-Analysis-Failed-$Engine-$nia_id-$vpc_endpoint.log
fail+=($vpc_endpoint":"$nia_id)
fi

done
ip=""
done

echo -e "\nFinal Summary Report:\n**************************************\n"

for endpoint in "${Success[@]}"
do
vpc_endpoint=` echo $endpoint|awk -F ':' '{print $1}' `
niaid=` echo $endpoint|awk -F ':' '{print $2}' `
echo "Success: $vpc_endpoint --> Click on this link: https://$region.console.aws.amazon.com/networkinsights/home?region=$region#NetworkPathAnalysis:analysisId=$niaid"
done

for endpoint in "${fail[@]}"
do
vpc_endpoint=` echo $endpoint|awk -F ':' '{print $1}' `
niaid=` echo $endpoint|awk -F ':' '{print $2}' `
echo "Failed: $vpc_endpoint --> Click on this link: https://$region.console.aws.amazon.com/networkinsights/home?region=$region#NetworkPathAnalysis:analysisId=$niaid"
done

if [ -f $HOME_DIR/source-filter_$$.json ]; then
rm -rf $HOME_DIR/source-filter_$$.json
fi

The following image shows a successful output example:

The following image shows a failure output example:

Deploy the solution using AWS CloudFormation

You can also deploy this solution using AWS CloudFormation in your account by completing the following steps.

  1. Download the CloudFormation template and use either the AWS Management Console or the AWS CLI to deploy the resources.
  2. Wait for the deployment to finish. The CloudFormation template may take up to 15 minutes to deploy.
  3. Verify the completion of the stack deployment.

Once the deployment completes, you should see an EC2 instance resource created in the AWS Console.
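
If you deploy with the AWS CLI, the stack status can be checked from the command line as well. This is a minimal sketch; the function names are illustrative, and the stack name is whatever you chose at deployment time.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical.

# Print the current status of a CloudFormation stack.
# $1 = stack name
stack_status() {
  aws cloudformation describe-stacks --stack-name "$1" \
    --query 'Stacks[0].StackStatus' --output text
}

# Succeed only when the stack has finished creating.
# $1 = stack name
stack_is_complete() {
  [ "$(stack_status "$1")" == "CREATE_COMPLETE" ]
}
```

As an alternative to polling, `aws cloudformation wait stack-create-complete --stack-name <stack-name>` blocks until the stack is created or the wait fails.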

During EC2 instance provisioning, the script runs as part of the EC2 user data, and you can find the execution logs in the ‘$HOME/logs’ directory.

Example: /home/ec2-user/logs/VPC_Analysis_for_RDSCustom_3194213.log

Clean up resources

If you no longer require this setup and want to avoid future charges, delete the resources that you created (namely, the EC2 instance and the network analyses). To delete the resources that were created by the AWS CLI script, use either the console or the AWS CLI. If you deployed the CloudFormation stack, you can use the AWS CloudFormation console or the AWS CLI to delete the stack that you created earlier. If you also want to delete the RDS Custom instance, remove its protection override first and then delete it.
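
For the Reachability Analyzer resources, the analyses that belong to a path have to be deleted before the path itself. The following sketch does that for a single path; the function name is illustrative, and you would pass in the path IDs that the script or the console created.

```shell
#!/bin/bash
# Sketch only: function name is hypothetical.

# Delete every analysis that belongs to a path, then the path itself.
# $1 = Network Insights path ID
delete_path_with_analyses() {
  local path_id=$1 nia_id
  for nia_id in $(aws ec2 describe-network-insights-analyses \
      --network-insights-path-id "$path_id" \
      --query 'NetworkInsightsAnalyses[].NetworkInsightsAnalysisId' --output text); do
    aws ec2 delete-network-insights-analysis --network-insights-analysis-id "$nia_id"
  done
  aws ec2 delete-network-insights-path --network-insights-path-id "$path_id"
}
```

Run it once per path, for example `delete_path_with_analyses <path-id>`.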

Conclusion

The connectivity of an RDS Custom DB instance depends on multiple network resources, such as VPC security groups, route tables, and gateways. In the case of connectivity issues, manually checking all these resources can be difficult and time-consuming. Reachability Analyzer helps you do the following:

  • Understand network reachability and the network path between your source and destination resources in your VPCs
  • Troubleshoot network reachability issues caused by network misconfiguration between resources such as an RDS Custom database and a VPC endpoint

If you have any questions or suggestions about this post, leave a comment.


About the Authors


Sharath Chandra Kampili is a Database Specialist Solutions Architect with Amazon Web Services. He works with the Amazon RDS team, focusing on commercial database engines like Oracle. Sharath works directly with AWS customers to provide guidance and technical assistance on database projects, helping them improve the value of their solutions when using AWS.

Ashutosh Bhardwaj is a Database Engineer in RDS DBS Managed Commercial Engines with Amazon Web Services. At AWS, he works primarily with Amazon RDS for Oracle and RDS Custom for Oracle. He is focused on designing and developing new features for RDS for Oracle and RDS Custom to solve customer problems.