AWS for Industries

How Toyota migrated its Safety Connect telematics services platform to AWS with virtually zero downtime

Toyota has made “connected cars” a key focus in its transition from an automotive company to a mobility company. In collaboration with its software and cloud services affiliate, Toyota Connected North America, it has developed an array of innovative technologies that help enhance customer safety and the overall vehicle ownership experience. One of these, Toyota and Lexus Safety Connect, is an in-vehicle service that provides real-time assistance by connecting drivers with an agent during roadside emergencies or accidents. It is powered by Toyota Connected’s Drivelink telematics services platform (Drivelink).

Customer expectations in the automotive industry have risen in recent years, with many customers now expecting connected services in their vehicles. As Toyota’s services partner, Toyota Connected migrated Safety Connect to AWS to improve system uptime and performance. Since the migration, Drivelink downtime has dropped to just one hundredth of a percent, underscoring the benefits of the move and reflecting Toyota’s commitment to delivering a seamless and dependable connected-vehicle experience. This post details how Toyota migrated Safety Connect with virtually zero downtime.

Drivelink mission and goals

Toyota Connected’s goal is to help build outstanding solutions that reimagine safety. Its Drivelink platform is also behind Destination Assist, which helps customers find their destinations by sending directions directly to their vehicle’s navigation system. The Drivelink platform ingests and processes vehicle information in the cloud, using remote servers to process the data. Transmitted over a cellular network, this data can include voice calls as well as non-voice data such as airbag deployments and messages from the Controller Area Network (CAN). Call center agents can use the data they receive as a whole to help drivers and passengers.

Through the Drivelink platform, Toyota is able to offer four primary services:

  • SOS button
  • Automatic Collision Notification
  • Enhanced Roadside Assistance
  • Stolen Vehicle Locator

Toyota Connected estimates that its call center has received more than 5 million calls, of which 300,000 have been safety-related.

Overview

The migration of the Drivelink platform to AWS was guided by the following principles:

  • No impact on the customer experience
  • Speed to market
  • Cost-efficiency at scale
  • Kaizen (Japanese term for continuous improvement)

Mechanisms for near-zero downtime migrations:

Achieving near-zero downtime migrations to the AWS cloud is critical for minimizing disruption to business operations. There are several mechanisms and best practices that Toyota Connected leveraged to help ensure a seamless migration while maintaining application availability:

1. Blue/Green Deployment: In a blue/green deployment, two identical environments are maintained—one (blue) is the existing live environment, and the other (green) is the new environment in AWS. Traffic is routed to the green environment after testing, while the blue environment remains untouched. If there are performance or latency issues, traffic can be easily switched back to blue.

2. Canary Releases: Instead of migrating all traffic at once, customers can gradually shift small portions of traffic to the new environment (AWS in this case). This helps detect issues before they affect all users.

3. Database Replication and Synchronization: Before switching fully to AWS, customers can continuously replicate data between on-premises databases and a database running on AWS to keep them in sync. This helps minimize data loss and keeps data up to date during the cutover.

4. DNS routing: Once the application is fully migrated and tested, DNS routing can be updated to direct traffic to the new environment in AWS.

5. Phased migration: Customers can migrate data and applications in phases, starting with non-critical workloads, followed by databases and eventually critical workloads, allowing for gradual cutover.
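
The canary and blue/green mechanisms above can be sketched as a small control loop. This is a minimal illustration only, not Toyota Connected's tooling: the step percentages, the 1% error threshold, and the measure_error_rate callback are all assumptions made for the example.

```python
from typing import Callable, List, Tuple


def run_canary(
    steps: List[int],
    measure_error_rate: Callable[[int], float],
    max_error_rate: float = 0.01,
) -> Tuple[int, List[int]]:
    """Shift traffic to green step by step.

    steps: increasing green-traffic percentages (hypothetical values).
    measure_error_rate: hypothetical callback returning the error rate
        (0.0 to 1.0) observed on green at the given percentage.
    Returns (final_green_percent, history_of_percentages).
    """
    history: List[int] = []
    for percent in steps:
        history.append(percent)
        if measure_error_rate(percent) > max_error_rate:
            # Roll back: blue was left untouched, so all traffic returns to it.
            history.append(0)
            return 0, history
    return (history[-1] if history else 0), history


# Healthy rollout: green stays under the threshold at every step.
healthy = run_canary([5, 25, 50, 100], lambda pct: 0.001)

# Faulty rollout: errors spike once green takes half of the traffic.
faulty = run_canary([5, 25, 50, 100], lambda pct: 0.05 if pct >= 50 else 0.001)

print(healthy)
print(faulty)
```

The point of the sketch is the rollback path: because the blue environment is never modified, returning to it is a routing change rather than a redeployment.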

Migration Architecture

This section describes the pre-migration architecture, the migration approach, and the post-migration architecture.

Pre-migration Architecture

Figure 1. Toyota Connected’s pre-migration architecture to process vehicle data

As shown in figure 1, vehicle data coming over the cellular network is sent through Express routes and delivered to a service bus for downstream processing. Microservices within the Kubernetes platform process the incoming messages and store the processed data in MongoDB. CloudAMQP is used for asynchronous communication between components and Redis is used as an in-memory data store for caching frequently accessed data and storing temporary results.

The migration:

AWS provides a robust set of capabilities for a successful, accelerated migration. These capabilities include discovery, landing zone, security and compliance, skills and center of excellence, migration plan, and business case development. AWS Professional Services, a global team of experts, supported Toyota Connected for this migration.

The migration followed a six-step process, as depicted in figure 2.

Figure 2. Toyota Connected’s six-step migration process

Toyota Connected adopted a phased migration approach, migrating one component at a time and meticulously testing each deployment. Initially, Toyota Connected transitioned its message queuing platform to CloudAMQP in AWS, followed by migrating MongoDB Atlas database nodes to Amazon Elastic Compute Cloud (Amazon EC2), which provides secure, scalable compute capacity for virtually any workload. Data was migrated using MongoDB’s native replication capabilities and kept in sync prior to the cutover. After performing successful validation, Toyota Connected decommissioned the old cloud virtual machines (VMs), enabling MongoDB services in multiple AWS regions.
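
Keeping replicas in sync before a cutover usually comes down to checking replication lag. The sketch below shows one way such a check could look, under stated assumptions: the input dictionary mirrors the members, stateStr, and optimeDate fields of MongoDB's replSetGetStatus command output, while the host names, timestamps, and lag thresholds are hypothetical and not taken from Toyota Connected's migration.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List


def ready_for_cutover(status: Dict[str, Any], max_lag: timedelta) -> bool:
    """Return True when every secondary is within max_lag of the primary."""
    members: List[Dict[str, Any]] = status["members"]
    primaries = [m for m in members if m["stateStr"] == "PRIMARY"]
    secondaries = [m for m in members if m["stateStr"] == "SECONDARY"]
    if len(primaries) != 1 or not secondaries:
        return False  # no single primary, or nothing replicating yet
    primary_time = primaries[0]["optimeDate"]
    return all(primary_time - m["optimeDate"] <= max_lag for m in secondaries)


# Hypothetical snapshot: one node in the old environment, two new nodes on AWS.
now = datetime(2024, 1, 1, 12, 0, 0)
sample_status = {
    "members": [
        {"name": "old-node-1:27017", "stateStr": "PRIMARY", "optimeDate": now},
        {"name": "aws-node-1:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(seconds=2)},
        {"name": "aws-node-2:27017", "stateStr": "SECONDARY",
         "optimeDate": now - timedelta(seconds=45)},
    ]
}

strict = ready_for_cutover(sample_status, timedelta(seconds=10))
relaxed = ready_for_cutover(sample_status, timedelta(seconds=60))
print(strict, relaxed)
```

In a real environment, the status document would come from running replSetGetStatus against the replica set rather than from a hand-built dictionary.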

With a solid foundation in place, Toyota Connected used blue/green deployments: it provisioned parallel services in Amazon EKS and integrated private AWS Direct Connect connections to its vehicle network. After validating the microservices on Amazon EKS, Toyota Connected gradually added them to its private DNS configuration, establishing a canary-like setup for traffic validation in AWS. Through a rolling restart process, traffic was transitioned to Amazon EKS. Once this was validated, Toyota Connected phased out the old Kubernetes clusters from the DNS configuration, fully transitioning to Amazon EKS clusters and regions.
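
The DNS phasing described above can be illustrated with a short planning sketch. It assumes that clients spread evenly across all records in the set, which real resolvers and caches only approximate, and the endpoint names are hypothetical. It is not a description of Toyota Connected's actual DNS configuration.

```python
from typing import List, Tuple


def dns_phases(old: List[str], new: List[str]) -> List[Tuple[List[str], float]]:
    """Return (record_set, expected_share_on_new) for each phase.

    Phase order: add new endpoints one at a time (canary-like), then
    remove old endpoints one at a time. The share assumes clients spread
    evenly across the records in the set.
    """
    if not new:
        raise ValueError("at least one new endpoint is required")
    phases: List[List[str]] = []
    records = list(old)
    for endpoint in new:
        records = records + [endpoint]  # grow the new pool
        phases.append(records)
    for endpoint in old:
        records = [r for r in records if r != endpoint]  # phase out old
        phases.append(records)
    return [(p, sum(r in new for r in p) / len(p)) for p in phases]


plan = dns_phases(["old-a", "old-b"], ["eks-a", "eks-b"])
for record_set, share in plan:
    print(f"{share:.0%} on new: {record_set}")
```

Each phase is a natural validation checkpoint: traffic on the new endpoints grows in small increments, and the old endpoints are removed only after the new ones carry load successfully.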

The final step was to provision Amazon ElastiCache and migrate Redis to it. Redis data was replicated to the target (Amazon ElastiCache) by an Amazon EC2 instance created specifically to run the sync tool (RIOT-redis). A rolling restart was then performed to gradually migrate to the new Amazon ElastiCache Redis nodes.
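
Before a rolling restart of this kind, it is common to confirm that the target cache matches the source. The sketch below shows the idea with plain dictionaries standing in for the two Redis keyspaces. It is an assumption-laden illustration: it does not reflect how RIOT-redis verifies data, and the keys and values are invented.

```python
from typing import Dict, List, Tuple


def compare_keyspaces(
    source: Dict[str, str], target: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (missing_keys, mismatched_keys) in target relative to source."""
    missing = sorted(k for k in source if k not in target)
    mismatched = sorted(
        k for k in source if k in target and source[k] != target[k]
    )
    return missing, mismatched


# Hypothetical snapshots of the source and target caches.
source_cache = {"session:1": "a", "session:2": "b", "vehicle:123": "c"}
target_cache = {"session:1": "a", "session:2": "stale"}

missing, mismatched = compare_keyspaces(source_cache, target_cache)
print("missing:", missing)
print("mismatched:", mismatched)
```

Against a live cache, a comparison like this would normally run on a sample of keys, because entries with short expiry times change between the two reads.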

Post-migration—final architecture

Figure 3. Toyota Connected’s new architecture on AWS

Here are details of the solutions that Toyota Connected selected for key components of the architecture.

1. Database: Toyota Connected selected MongoDB Atlas on AWS to help minimize change during migration. MongoDB Atlas running on AWS offers automated provisioning, managed backup and restore, monitoring, scalability, security, global clusters across AWS Regions, and integration with AWS services. It also provides cost optimization tools and compliance features, facilitating easy deployment and maintenance of MongoDB databases for modern applications.

2. Microservices: For hosting, Toyota Connected selected Amazon Elastic Kubernetes Service (Amazon EKS), which is a managed service, to run Kubernetes in the AWS cloud and on-premises data centers. This decision was driven by in-house expertise, cost considerations, and the desire for granular control. Furthermore, using managed Amazon EKS helps reduce operational costs and simplifies security patching, enhancing the overall efficiency and security.

3. Network/Domain Name System (DNS): Toyota Connected selected self-hosted CoreDNS on Amazon EKS. To address the high volume of DNS requests generated by millions of vehicles on the road, Toyota Connected sought a solution that was scalable, fast, and cost-effective.

4. Message bus: Toyota Connected selected CloudAMQP, a third-party service that provides a managed service experience, while retaining all the configurability and plugin support of running RabbitMQ.

5. Network connectivity: Toyota Connected selected AWS Direct Connect, which creates a dedicated network connection to AWS, to link to the vehicle network with redundant and reliable connectivity.

6. Redis: Toyota Connected selected Amazon ElastiCache Global Datastore, a service that provides fully managed, fast, reliable, and secure cross-region replication. Global Datastore typically supports cross-region replication latency of under 1 second, helping increase application responsiveness by providing geo-local reads closer to end users. Additionally, simplified management features, such as automated software patches and cost optimization capabilities, make it an efficient and scalable solution for Toyota Connected’s distributed caching needs.

Resiliency

Toyota Connected leveraged the multi-Availability Zone (Multi-AZ) and multi-region resiliency capabilities of AWS to deliver improved system uptime.

  • Amazon EKS: Toyota Connected used automatic detection and replacement of unhealthy control plane nodes, coupled with patching of control plane, for high availability and security. It is deployed across regions.
  • CloudAMQP: provides robust support for multiple Availability Zone deployments, enhancing fault tolerance and availability. This will be supplemented with a multi-region deployment for resiliency.
  • MongoDB: offers multi-region support, enabling data distribution across multiple geographic locations for improved redundancy and performance.
  • Amazon ElastiCache Global Datastore: provides multi-region support, facilitating data replication and availability across AWS regions.
  • AWS Direct Connect: Drivelink is deployed across two regions, each connected with AWS Direct Connect.
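
To put the uptime figures in perspective, the arithmetic can be worked through in a few lines. The first function converts the downtime fraction quoted earlier in this post (one hundredth of a percent) into minutes per year. The second estimates the combined availability of redundant regions under the simplifying assumption that regional failures are independent, which real systems only approximate. The 99.9% per-region figure is purely illustrative and is not a Toyota Connected or AWS number.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(downtime_fraction: float) -> float:
    """Convert a downtime fraction (0.0001 means 0.01%) to minutes per year."""
    return downtime_fraction * MINUTES_PER_YEAR


def combined_availability(per_region: float, regions: int) -> float:
    """Availability when any one of `regions` independent regions suffices."""
    return 1 - (1 - per_region) ** regions


# 0.01% downtime is roughly 52.56 minutes per year.
yearly_minutes = downtime_minutes_per_year(0.0001)

# Two regions at an illustrative 99.9% each, assuming independent failures.
two_region = combined_availability(0.999, 2)

print(round(yearly_minutes, 2))
print(round(two_region, 6))
```

The second calculation shows why multi-region deployment appears on every line of the list above: under the independence assumption, adding a second region shrinks expected downtime by orders of magnitude.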

Cost optimization

Toyota Connected is committed to continually optimizing its infrastructure costs. For further cost savings, Toyota Connected plans to explore the potential benefits of using AWS Fargate, a serverless, pay-as-you-go compute engine, along with implementing savings plans. In addition, Toyota Connected is looking at AWS Graviton processors, a family of processors designed to help deliver the best price performance for cloud workloads running in Amazon EC2. Toyota Connected also built dashboards in Amazon QuickSight to provide a single pane of glass for exploring costs and other relevant metrics, facilitating better visibility and informed decision-making in its cost optimization efforts.

Conclusion

In this post, we discussed mechanisms that help achieve near-zero downtime migrations to AWS. We also described how Toyota Connected used those mechanisms to migrate the Toyota and Lexus Safety Connect application to AWS, improving performance and availability while minimizing the impact on customer experience. For guidance on how to accelerate your migrations to AWS, visit the AWS for automotive page, or contact your AWS team today.

Sandeep Kulkarni

Sandeep Kulkarni is principal technologist in the automotive and manufacturing domain at AWS. His passion is to drive innovation, accelerate digital transformation, and build highly scalable and cost-effective solutions in the cloud for customers. His areas of expertise include connected vehicles, connected factories, supply chains, and financial services. In his spare time, he mentors high school students on leadership, practices yoga, learns Indian classical music, and enjoys gardening. He earned a master’s degree in finance from Boston College.

Kevin O'Dell

Kevin O'Dell is the director of engineering at Toyota Connected for Drivelink, the telematics platform that powers Toyota and Lexus Safety Connect for vehicles in North America. Drivelink is a critical feature that connects drivers to first responders to help them during emergencies. O’Dell’s team is responsible for cloud engineering that lets them maintain platform uptime in addition to updating the system with new features to improve customer experiences. Kevin earned his Bachelor of Science in business administration with a focus on information systems from Illinois State University, where he also played NCAA baseball for the Redbirds.

Nate Marshall

Nate Marshall is a managing engineer at Toyota Connected. He is deeply passionate about his team’s mission of providing services that prioritize the safety of the Toyota and Lexus customers and guests. He thrives on the challenge of building redundant and fault-tolerant architectures, all while maintaining cost-effectiveness and verifying that solutions remain simple and scalable. Outside of work, he finds joy in spending quality time with his family and delving into home automation projects using tools like Home Assistant, where he explores the intersection of technology and everyday life.

Rob Boetticher

Rob Boetticher is the director of technology for the AWS Automotive and Manufacturing Industry. In this role, Rob leads the AWS Global Solutions Architecture and Customer Solutions teams responsible for supporting the world’s largest automotive OEMs and suppliers to accelerate their digital transformation journeys. Prior to AWS, Rob held executive technical leadership roles in the cloud and networking industry. Rob holds a Master of Business Administration in finance from the NYU Stern School of Business and a bachelor’s degree in electrical engineering from Stevens Institute of Technology.