Performing a tabletop exercise with Amazon Connect

When a contact center experiences an unexpected service disruption, the impact can be immediate and severe: agents unable to access systems, customers facing connection issues, and support teams working to rapidly restore service. While such scenarios may seem extreme, they represent exactly the kind of situations that make tabletop exercises essential for modern contact center operations.

In this blog post, we will walk you through a few tabletop exercises with Amazon Connect, the AI-powered cloud contact center solution by Amazon Web Services (AWS). Afterward, you should understand what to check in each scenario and how to check it, so that you can update your contact center processes and procedures and be ready for the unexpected.

An introduction to the tabletop exercise

Tabletop exercises are critical for companies of any size as they provide a structured and controlled environment to simulate real-world scenarios, allowing organizations to test their response plans for crises such as cyberattacks, natural disasters, or system outages. These exercises involve key stakeholders, such as senior executives, department heads, and crisis management teams, in strategic discussions and decision-making processes. By running through potential threats and disruptions, enterprises can identify gaps in their emergency procedures, improve communication and coordination, and ensure that all teams understand their roles during an actual event.

Tabletop exercises also help organizations refine their risk management strategies, increase organizational resilience, and build confidence in their ability to respond swiftly and effectively when faced with unexpected challenges, ultimately safeguarding both operations and reputation.

1) Amazon Connect service interruption

Scenario: Amazon Connect experiences a temporary service outage affecting your contact center’s ability to make or receive calls for a specific region. Customers are unable to reach support agents, and agents cannot access the system to handle calls.

Key objectives:

Verify whether the issue is isolated to your Amazon Connect instance or affects multiple regions
Check the AWS Health Dashboard (formerly the Personal Health Dashboard, or PHD) for any service issues
Use CloudWatch metrics to investigate potential causes (e.g., connectivity issues, API limits)
Determine how to notify end-users and stakeholders about the outage and the estimated recovery time
Implement a workaround, such as redirecting calls to an alternative system or using a different region
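
To make the first two objectives concrete during the exercise, here is a minimal Python sketch that sorts a list of health events into a rough scope label. It assumes the events are dictionaries with the `service`, `region`, and `statusCode` keys that the AWS Health `DescribeEvents` API returns, and that Amazon Connect events carry the service code `CONNECT`. The function names, the scope labels, and the sample event type codes are made up for illustration.

```python
from collections import defaultdict


def open_connect_events_by_region(events, service="CONNECT"):
    """Group open health events for one service by region."""
    by_region = defaultdict(list)
    for event in events:
        if event.get("service") == service and event.get("statusCode") == "open":
            region = event.get("region", "unknown")
            by_region[region].append(event.get("eventTypeCode", "unknown"))
    return dict(by_region)


def classify_scope(events, instance_region):
    """Return a rough scope label for the exercise facilitator (labels are hypothetical)."""
    regions = open_connect_events_by_region(events)
    if not regions:
        return "no-open-health-events"
    if set(regions) == {instance_region}:
        return "single-region"
    if instance_region in regions:
        return "multi-region"
    return "other-regions-only"


# Made-up sample data shaped like DescribeEvents output (assumption).
sample_events = [
    {"service": "CONNECT", "region": "us-east-1", "statusCode": "open",
     "eventTypeCode": "EXAMPLE_OPERATIONAL_ISSUE"},
    {"service": "LAMBDA", "region": "us-east-1", "statusCode": "open",
     "eventTypeCode": "EXAMPLE_OPERATIONAL_ISSUE"},
    {"service": "CONNECT", "region": "eu-west-2", "statusCode": "closed",
     "eventTypeCode": "EXAMPLE_OPERATIONAL_ISSUE"},
]

print(classify_scope(sample_events, "us-east-1"))  # single-region
```

A "no-open-health-events" result does not prove the service is healthy; it only tells the team to move on to the CloudWatch metrics for their own instance.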

2) Poor quality or latency

Scenario: A batch of customer calls experiences high latency and degraded audio quality, leading to poor customer experiences. This issue appears to affect calls between specific agents and customers in certain regions.

Key objectives:

Investigate network performance using Amazon Connect voice metrics and Amazon CloudWatch
Check the status of any third-party system involved (e.g., SIP trunking or telephony integrations)
Identify whether the issue is related to regional infrastructure or network congestion
Test voice quality using Amazon Connect Voice Analytics tool and other diagnostic tools like CloudWatch Logs
When contacting AWS Support, prepare a HAR file and browser console logs to assist with troubleshooting (see: https://repost.aws/knowledge-center/support-case-browser-har-file)

3) Contact flow misconfiguration

Scenario: A newly updated contact flow is preventing customers from being routed to the correct department. Instead of reaching the intended agent, customers are disconnected or placed in an incorrect queue.

Key objectives:

Check CloudWatch metrics for “ContactFlowErrors” and “ContactFlowFatalErrors”
Identify an impacted contact and review the contact flow log in CloudWatch for more information on the error
Walk through the contact flow to identify misconfiguration (e.g., incorrect conditions, wrong queue routing)
Implement a rollback plan or hot fix to restore functionality if a configuration change caused the issue
Test and validate the contact flow after changes to ensure the issue is resolved
Document and communicate the change to the team and end-users
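
For the log review step, a small helper can build a CloudWatch Logs Insights query that pulls every flow log entry for one impacted contact. This is a sketch under a few assumptions: flow logging is enabled, the log group follows the `/aws/connect/<instance alias>` naming pattern, and the log entries contain the `ContactId`, `ContactFlowName`, `ContactFlowModuleType`, and `Results` fields. The helper names and sample values are hypothetical.

```python
def build_flow_log_query(contact_id, limit=200):
    """Build a Logs Insights query that lists flow log entries for one contact."""
    if not contact_id or '"' in contact_id:
        raise ValueError("contact_id must be non-empty and must not contain quotes")
    return (
        "fields @timestamp, ContactFlowName, ContactFlowModuleType, Results\n"
        f'| filter ContactId = "{contact_id}"\n'
        "| sort @timestamp asc\n"
        f"| limit {limit}"
    )


def build_start_query_kwargs(instance_alias, contact_id, start_epoch, end_epoch):
    """Build keyword arguments for the CloudWatch Logs StartQuery call."""
    return {
        # Assumption: default log group naming for Amazon Connect flow logs.
        "logGroupName": f"/aws/connect/{instance_alias}",
        "startTime": int(start_epoch),
        "endTime": int(end_epoch),
        "queryString": build_flow_log_query(contact_id),
    }


kwargs = build_start_query_kwargs("my-instance", "abc-123", 1700000000, 1700003600)
print(kwargs["queryString"])
# With boto3 installed and credentials configured, the query could be started with:
#   boto3.client("logs").start_query(**kwargs)
```

Reading the entries in timestamp order shows which flow block the contact reached last, which is usually where the walk-through of the flow should begin.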

4) Amazon Connect instance is running slowly

Scenario: The Amazon Connect instance is experiencing delays in agent login times, call routing, and administrative dashboard performance. This is affecting both the customer and agent experience.

Key objectives:

Monitor Amazon CloudWatch to track system performance and resource usage
Check for any concurrent usage spikes or Amazon Connect service limits
Identify if the issue is related to high traffic volumes or any configuration errors
Take corrective action, such as scaling up resources or adjusting usage patterns
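
As one way to check for usage spikes, the sketch below prepares the arguments for a CloudWatch `GetMetricStatistics` request and picks the peak value out of the response. It assumes the `AWS/Connect` namespace with the `InstanceId` and `MetricGroup` dimensions, and metric names such as `ConcurrentCallsPercentage` or `ThrottledCalls`; verify these against the metrics your instance actually publishes. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def connect_metric_kwargs(instance_id, metric_name, minutes=60, period=300, now=None):
    """Build keyword arguments for CloudWatch GetMetricStatistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Connect",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "InstanceId", "Value": instance_id},
            # Assumption: voice metrics are published under this metric group.
            {"Name": "MetricGroup", "Value": "VoiceCalls"},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,
        "Statistics": ["Maximum"],
    }


def peak_value(datapoints):
    """Return the highest Maximum across the datapoints, or None if there are none."""
    values = [point["Maximum"] for point in datapoints if "Maximum" in point]
    return max(values) if values else None


# With boto3 installed and credentials configured, usage could look like:
#   cw = boto3.client("cloudwatch")
#   kwargs = connect_metric_kwargs("<instance-id>", "ConcurrentCallsPercentage")
#   print(peak_value(cw.get_metric_statistics(**kwargs)["Datapoints"]))
print(peak_value([{"Maximum": 0.42}, {"Maximum": 0.97}]))  # 0.97
```

A peak close to the quota, or any non-zero throttling, points toward a service limit rather than a configuration error.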

5) AWS Lambda function timeout

Scenario: A Lambda function used for custom integrations (e.g., fetching customer data from a database) is timing out and causing delays in call handling, resulting in dropped calls or delayed information retrieval.

Key objectives:

Review the Lambda function logs and identify why the timeout is happening (e.g., slow database queries, function logic errors)
Test the Lambda function manually outside of Amazon Connect to check for performance issues
Modify the Lambda timeout settings and adjust the function code to improve performance
Investigate if the Lambda function is hitting AWS service limits or encountering resource constraints
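
To test a function outside of Amazon Connect, you can call its handler locally with an event shaped like the one a flow sends and time the call. The sketch below assumes the event carries the caller-supplied values under `Details.Parameters`, and that the flow's Lambda block waits at most 8 seconds for a response. The handler, the `customerId` parameter, and the helper names are hypothetical stand-ins for your own code.

```python
import time

# Assumption: upper limit of the timeout on the flow block that invokes Lambda.
CONNECT_BLOCK_TIMEOUT_SECONDS = 8.0


def sample_connect_event(parameters):
    """Build a minimal event shaped like the one a flow passes to Lambda."""
    return {
        "Name": "ContactFlowEvent",
        "Details": {
            "ContactData": {
                "ContactId": "00000000-0000-0000-0000-000000000000",
                "Channel": "VOICE",
            },
            "Parameters": dict(parameters),
        },
    }


def time_handler(handler, event, budget_seconds=CONNECT_BLOCK_TIMEOUT_SECONDS):
    """Call the handler once and report whether it finished within the budget."""
    start = time.perf_counter()
    result = handler(event, None)
    elapsed = time.perf_counter() - start
    return {
        "elapsed_seconds": elapsed,
        "within_budget": elapsed < budget_seconds,
        "result": result,
    }


def example_handler(event, context):
    """Hypothetical handler; the sleep stands in for a database lookup."""
    time.sleep(0.05)
    customer_id = event["Details"]["Parameters"].get("customerId", "unknown")
    return {"customerId": customer_id, "tier": "standard"}


report = time_handler(example_handler, sample_connect_event({"customerId": "C-42"}))
print(report["within_budget"], report["result"])
```

A local run leaves out cold starts and network latency to the database, so treat a pass here as necessary rather than sufficient, and compare it with the durations reported in the function's logs.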

6) Issues with Customer Relationship Management (CRM) integration (e.g., Salesforce, ServiceNow)

Scenario: Calls initiated from Amazon Connect are failing to populate customer data into the CRM, leading to agents being unable to access necessary information when assisting customers.

Key objectives:

Investigate if the integration API between Amazon Connect and the CRM system is functional
Check for any recent updates or changes to the CRM or Amazon Connect integration that might have caused the issue
Verify the authentication and authorization settings between the two systems (e.g., API keys, IAM roles)
Test a manual connection between Amazon Connect and the CRM and review logs for any errors

7) Routing profile misconfiguration

Scenario: Agents are incorrectly assigned to routing profiles, and as a result, they are receiving calls from the wrong queues, or they are not receiving any calls at all.

Key objectives:

Review the routing profiles and make sure they are assigned to the correct agents
Ensure that queue memberships and skills are properly configured to route calls appropriately
Test the routing logic with different agents to verify that calls are routed correctly
Adjust any discrepancies in the configuration and monitor the system for improvements
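
The review step can be rehearsed with a simple comparison between the routing profile each agent should have and the one currently assigned. This sketch assumes you have already exported both mappings as dictionaries keyed by agent username (for example, from the Amazon Connect `ListUsers` and `DescribeUser` APIs). The function name and the sample data are hypothetical.

```python
def find_routing_mismatches(expected, actual):
    """Compare username -> routing profile mappings and report the differences."""
    mismatches = {}
    for user, wanted in expected.items():
        current = actual.get(user)  # None means the agent was not found
        if current != wanted:
            mismatches[user] = {"expected": wanted, "actual": current}
    return mismatches


# Made-up sample data for the exercise.
expected_profiles = {"alice": "Billing", "bob": "Sales", "carol": "Billing"}
actual_profiles = {"alice": "Billing", "bob": "Basic Routing Profile"}

for user, detail in find_routing_mismatches(expected_profiles, actual_profiles).items():
    print(user, detail)
```

Keeping the expected mapping in version control gives the team a known-good baseline to compare against after any bulk change to agent assignments.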

8) High call abandonment rate

Scenario: There is an unusually high rate of abandoned calls in the contact center. Customers are hanging up before reaching an agent or while waiting in queue.

Key objectives:

Monitor the queue lengths and wait times using Amazon Connect metrics to see if they exceed acceptable thresholds
Review contact flow configurations and verify that estimated wait times are communicated clearly to customers
Check if there is a bottleneck in the agent availability or if the system is overwhelmed
Adjust the IVR or routing strategy to reduce wait times and provide more accurate waiting time estimations
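
To decide what "unusually high" means for your queues, it helps to compute the abandonment rate explicitly. The sketch below uses abandoned contacts divided by queued contacts, expressed as a percentage, and flags queues above a threshold. The 5% threshold, the queue names, and the counts are placeholders, not recommendations.

```python
def abandonment_rate(contacts_abandoned, contacts_queued):
    """Return the abandonment rate as a percentage of queued contacts."""
    if contacts_queued <= 0:
        return 0.0
    return 100.0 * contacts_abandoned / contacts_queued


def queues_over_threshold(queue_stats, threshold_percent=5.0):
    """Return the queues whose abandonment rate exceeds the threshold."""
    flagged = {}
    for name, stats in queue_stats.items():
        rate = abandonment_rate(stats["abandoned"], stats["queued"])
        if rate > threshold_percent:
            flagged[name] = round(rate, 1)
    return flagged


# Made-up counts for the exercise.
stats = {
    "Billing": {"queued": 200, "abandoned": 30},
    "Sales": {"queued": 150, "abandoned": 3},
}
print(queues_over_threshold(stats))  # {'Billing': 15.0}
```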

9) Authentication failure for agents

Scenario: Agents are unable to log in to the Amazon Connect instance due to authentication errors, possibly caused by a misconfiguration in the identity provider (e.g., AWS IAM Identity Center, formerly AWS SSO, or Active Directory).

Key objectives:

Verify that the integration between Amazon Connect and your identity provider is still valid
Check for recent changes to IAM roles, policies, or security settings that might affect agent login
Review CloudWatch logs for any authentication or permission errors
Test agent logins using different accounts to isolate whether the issue is account-specific or system-wide
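
The last objective lends itself to a small decision helper that the team can apply to the results of their test logins. This is only a sketch: it assumes you record each test as a username mapped to whether the login succeeded, and the labels it returns are hypothetical.

```python
def classify_login_failures(results):
    """Classify test login results (username -> True if the login succeeded)."""
    if not results:
        raise ValueError("at least one test login is required")
    failed = sorted(user for user, ok in results.items() if not ok)
    if not failed:
        return "no-failures", failed
    if len(failed) == len(results):
        return "system-wide", failed
    return "account-specific", failed


# Made-up results from three test accounts.
label, failed_users = classify_login_failures(
    {"agent.one": True, "agent.two": False, "supervisor.one": True}
)
print(label, failed_users)  # account-specific ['agent.two']
```

A "system-wide" result points toward the identity provider integration, while "account-specific" points toward the individual user's permissions or security profile.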

10) Customer data inconsistencies

Scenario: Customer data stored in Amazon Connect or integrated systems (e.g., AWS Lambda, Amazon DynamoDB) is inconsistent across different customer interactions. For example, previous case notes or customer preferences are not appearing for agents.

Key objectives:

Verify that your data integration processes (e.g., Lambda, DynamoDB, Salesforce) are functioning correctly
Ensure that customer data is being correctly retrieved, stored, and updated in real-time during calls
Investigate any data synchronization issues that may be causing inconsistencies
Implement a fix to ensure consistent data retrieval for agents, and review logging and error handling for future data discrepancies

11) Multi-channel experience degradation

Scenario: The multi-channel experience for customers (voice, chat, and email) is degraded, and customers are experiencing inconsistent interactions across different channels.

Key objectives:

Review the integration between Amazon Connect and the other channels
Verify that contact flows are correctly set up for each channel and that customer data is shared across channels (e.g., passing context from voice to chat or vice versa)
Monitor CloudWatch metrics for all channels to identify if certain channels are underperforming
Test each channel (voice, chat, email) individually to confirm the nature of the degradation
Implement cross-channel escalation options (e.g., transferring a chat to a voice call) and ensure that customer context is preserved across channels

Conclusion

In this blog post, we walked you through 11 diverse tabletop exercise scenarios for Amazon Connect to provide you with a comprehensive approach to troubleshooting and resolving common issues that may arise in a contact center environment. By walking you through the detailed steps to address each challenge, we hope you have gained valuable insights into effectively managing Amazon Connect capabilities.

These exercises not only reinforce problem-solving techniques but also emphasize the importance of proactive planning, testing, and ongoing monitoring to maintain seamless operations. Whether you’re a contact center manager or a technical expert, practicing these scenarios will help build the resilience and expertise necessary to tackle real-world issues confidently and efficiently.

Remember, continuous improvement through exercises like these ensures that your Amazon Connect environment remains robust and ready to provide exceptional customer experiences.

Renato Gentil

Renato is a Senior Technical Account Manager based in Ireland with over 7 years of experience with AWS. Renato holds four AWS certifications and has worked on large-scale resilience projects with customers around the globe, helping them improve their resilience posture through tabletop exercises and incident management processes.
