AWS Database Blog

Up your game: Increase player retention with ML-powered matchmaking using Amazon Aurora ML and Amazon SageMaker

Organizations are looking for ways to better leverage their data to improve their business operations. With Amazon Aurora, Aurora Machine Learning, and Amazon SageMaker, you can train machine learning (ML) models quickly and integrate them directly with your existing Aurora data to better serve your customers. In this post, we demonstrate how a game publisher can adapt a player matchmaking system powered by Aurora to increase player retention using a real-time ML-based matchmaking model trained by Amazon SageMaker Autopilot. Although this example is for gaming, you could apply the same techniques in any industry where optimized matching of a customer to a cohort is valuable, such as retail or consumer products.

We first demonstrate a production-grade matchmaking system for the open-source racing game Supertuxkart. The matchmaking system we built is powered by Aurora Serverless PostgreSQL v2. It manages the game sessions and matches players to game servers in a way that maximizes player retention. Amazon Aurora PostgreSQL-Compatible Edition provides a SQL interface to the game data. We also show how we augment the data with Autopilot, which lets the game operator explore the database schema and prepare game data using SQL to build optimal models for player matchmaking without data science skills. Both the models and the database are deployed in serverless mode, so we only pay when players play the game.

Why optimal matchmaking is a machine learning problem

Matchmaking systems for online multiplayer video games pair players with different game scenes to enhance the player experience and increase player retention. Matchmaking techniques currently implement static match policies based on game and player characteristics. However, player attributes don’t always match those of the game scenes (servers), causing the player to exit the game sooner than desired. For example, a player’s skill may match several race modes, such as capture the flag or grand prix with casual or experienced players.

It’s possible to build a match heuristic by aggregating game events from past game sessions. However, every attribute added to a game makes the heuristics harder to maintain. Instead, we propose a simple and repeatable process that feeds the game dataset into Amazon SageMaker Data Wrangler, a codeless tool that looks for table columns (features) to use. Autopilot then trains a model on the data using hundreds of permutations and chooses the most appropriate algorithm and model configuration for maximizing the player retention metric, all with no data science skills needed.

Why SageMaker Autopilot?

Online game scenes are constantly updated, changing the player’s experience and, consequently, their retention. Therefore, game operators need to adjust their match policies to reflect the revised rules that emerge from the latest game scenes. The revised match policies require game operators to investigate game patterns and determine how they impact player retention. In this example, ML algorithms are used to create dynamic match policies based on current game scenes and player preferences. Our ML model also continuously learns from match patterns that are based on existing or novel game metrics. It trains several ML algorithms and suggests or automatically applies the optimal algorithm sequence for new match policies, thereby optimizing player retention in various game scenes.

Why Aurora ML?

Matchmaking systems keep an inventory of servers, players, and live sessions. Poor player experience results from joining an invalid game server or the wrong game scene. Matchmaking data needs to be consistent while providing data isolation for high concurrency. If you were to query a model outside of the database, you, the game developer, would have to pull player, server, and session information, batch it into chunks, and call the real-time inference endpoint. Finally, the game app would have to aggregate the results and find the optimal game server for the client.

Aurora ML allows you to skip all of that data movement and simply query servers, players, and sessions against the inference model via a SQL function that batches the calls to the model and allows SQL aggregations, like max(), in real time to select the optimal server from the model estimates. Your game app also needs only minimal privileges, because the database rather than the app calls the inference endpoint, which improves security.
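To make the data movement that Aurora ML removes more concrete, the following minimal Python sketch mimics the client-side flow described earlier: pull candidate rows, batch them, call an inference endpoint once per batch, and aggregate the results. The function fake_invoke_endpoint, its scoring rule, the batch size, and the sample rows are all hypothetical stand-ins that are not part of the sample code; a real client would call the SageMaker runtime instead.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Tuple

Row = dict  # one candidate server joined with the player's attributes


def batches(rows: Sequence[Row], size: int) -> Iterator[Sequence[Row]]:
    """Yield consecutive chunks of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def fake_invoke_endpoint(batch: Sequence[Row]) -> List[float]:
    """Hypothetical stand-in for a real-time inference call.

    Returns one made-up session-length estimate per row so the sketch
    runs offline; the scoring rule has no relation to the real model.
    """
    return [10.0 + 5.0 * (row["s_mode"] == row["p_mode"]) + row["s_difficulty"]
            for row in batch]


def best_server(rows: Sequence[Row],
                invoke: Callable[[Sequence[Row]], List[float]],
                batch_size: int = 2) -> Tuple[Optional[str], float]:
    """Batch the rows, call the endpoint per batch, and keep the top estimate."""
    best_endpoint, best_estimate = None, float("-inf")
    for batch in batches(rows, batch_size):
        for row, estimate in zip(batch, invoke(batch)):
            if estimate > best_estimate:
                best_endpoint, best_estimate = row["endpoint"], estimate
    return best_endpoint, best_estimate


# Invented candidates for a player who prefers the "soccer" mode.
candidates = [
    {"endpoint": "10.0.0.1:8080", "s_mode": "grand_prix", "s_difficulty": 1, "p_mode": "soccer"},
    {"endpoint": "10.0.0.2:8080", "s_mode": "soccer", "s_difficulty": 2, "p_mode": "soccer"},
    {"endpoint": "10.0.0.3:8080", "s_mode": "soccer", "s_difficulty": 0, "p_mode": "soccer"},
]
print(best_server(candidates, fake_invoke_endpoint))
```

With Aurora ML, the batching loop and the aggregation both collapse into a single SQL statement that runs inside the database.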

Let’s play a game

In this post, we play the open-source racing game Supertuxkart and deploy player bots that want to play the game and search for suitable game servers. We built a matchmaking prototype that stores servers and session information in an Aurora Serverless PostgreSQL database.

This prototype uses a labeled dataset that contains the history of millions of game scenes (game servers) that players chose to play, the time they spent in each scene, and the session length. To maximize session length, the prototype asks the model to estimate matches between available game scenes and player attributes, and then uses the SQL function max() to pick the match with the longest estimated session.

The following screenshot shows an example of the simulated load. Each kart is denoted by its bot number on the race map. The simulation chooses a variety of maps and bot types.

We built a simulation of players who want to play throughout the day. The players present the attributes they want to see in the game. Additionally, each server starts with similar (but not identical) attributes, and the matchmaking system searches for the most suitable servers based on the players’ attributes. The following diagram describes the deployed architecture. Amazon Elastic Kubernetes Service (Amazon EKS) hosts the game servers and bot players that insert and update game sessions. SageMaker Autopilot consumes the session data we export to Amazon S3 for training the model.

We use the Supertuxkart game attributes such as game mode, game track theme, and difficulty. The system inserts player requests in the server_sessions table and routes players to the game lobby when no match is available. The system also logs every time it matches a server in the session table. The session length is updated after the client exits the game.

Use Autopilot to train the matchmaking model

We first export the labeled data to Amazon Simple Storage Service (Amazon S3) after setting up access to write to an S3 bucket from our Aurora database instance. See the following code:

-- Build the S3 URI (bucket, object key, Region) and store it in the psql variable s3_uri_1.
SELECT aws_commons.create_s3_uri(
   'stk-matchmaking',
   'server_sessions_may_export',
   'us-west-2'
) AS s3_uri_1 \gset

-- Export the labeled rows as CSV with a header line.
SELECT * FROM aws_s3.query_export_to_s3(
   'SELECT s_location, s_track, s_tracktheme, s_mode, s_difficulty,
           p_location, p_track, p_tracktheme, p_mode, p_difficulty, p_skill,
           EXTRACT(MINUTE FROM session_length) AS session_length
    FROM server_sessions
    WHERE session_length IS NOT NULL AND p_skill IS NOT NULL',
   :'s3_uri_1',
   options := 'header true, format csv, delimiter $$,$$'
);

The features we may want to use are: s_location, s_track, s_tracktheme, s_mode, s_difficulty, p_location, p_track, p_tracktheme, p_mode, p_difficulty, and p_skill. We don’t know which feature is going to help the model predict the session length, so we’ll see how Autopilot can help us.
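Before starting a training job, it can help to sanity-check the exported file. The following sketch assumes the CSV layout produced by the export above (a header row and a comma delimiter) and checks that every expected column is present and that the target parses as a number. The function name check_export and the two sample rows, including the track and skill values, are invented for illustration.

```python
import csv
import io

FEATURES = ["s_location", "s_track", "s_tracktheme", "s_mode", "s_difficulty",
            "p_location", "p_track", "p_tracktheme", "p_mode", "p_difficulty",
            "p_skill"]
TARGET = "session_length"


def check_export(text: str) -> int:
    """Validate the header and the target column; return the number of data rows."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in FEATURES + [TARGET] if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    count = 0
    for row in reader:
        float(row[TARGET])  # raises ValueError if the label is not numeric
        count += 1
    return count


# Invented sample in the assumed export layout.
sample = (
    ",".join(FEATURES + [TARGET]) + "\n"
    "us-west-2,track_a,theme_a,grand_prix,1,us-west-2,track_a,theme_a,grand_prix,1,casual,24\n"
    "us-west-2,track_b,theme_b,soccer,2,us-west-2,track_b,theme_b,soccer,2,expert,7\n"
)
print(check_export(sample))
```

In practice you would read the object downloaded from the S3 bucket instead of the inline sample string.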

We create an Autopilot experiment with basic settings using the Autopilot console, as shown in the following screenshot.

Our training dataset file name is server_sessions_may_export, which is the object key we specified when we created the S3 URI for the export.

We start with the default values in the advanced settings and let Autopilot find the best model. Autopilot supports three ML problem types (binary classification, multiclass classification, and regression) and can detect the problem type automatically. We want to rank candidate sessions by predicted session length, which is a numeric value rather than a class label, so regression is the right fit. We then choose Create Experiment and wait for the training to complete.

After the experiment finishes running, we right-click the experiment and choose Describe AutoML Job.

The Autopilot job description shows the best model, which is the one that yielded the lowest mean squared error (MSE).

Then we look at the model details to determine which features contributed most to the model accuracy, so we can learn how to improve the model.

The model details on the Explainability tab show that the game mode (s_mode) and game track theme (s_tracktheme) contributed most to the model accuracy. They also show that the player track (p_track) didn’t contribute much, so we can remove it in the next training run.

Configure the SQL function for querying the model from Aurora

Now that Autopilot has deployed the model, we configure the SQL function that queries it from Aurora. The following SQL function defines the parameters needed to invoke the model we trained with Autopilot. It also references the endpoint name, which is stk-matchmaker16 in our case. We chose to return a VARCHAR type to allow flexibility in handling model-specific output. For example, XGBoost returns an integer session_length, whereas the linear learner regression model returns a pair: an integer session_length and a float accuracy. In the client query later in this post, we show how we manipulate the model prediction results to retrieve the optimal server for the client.

CREATE EXTENSION IF NOT EXISTS aws_ml CASCADE;

-- Wrap the SageMaker endpoint in a SQL function so that queries can call the model.
CREATE FUNCTION estimate_session_length_xg(
   IN s_location CHAR(16),
   IN s_track CHAR(24),
   IN s_tracktheme CHAR(24),
   IN s_mode CHAR(24),
   IN s_difficulty INT8,
   IN p_difficulty INT8,
   IN p_location CHAR(16),
   IN p_track CHAR(24),
   IN p_tracktheme CHAR(24),
   IN p_mode CHAR(24),
   IN p_skill CHAR(24),
   max_rows_per_batch INT DEFAULT NULL,
   OUT estimate VARCHAR)
AS $$
   SELECT aws_sagemaker.invoke_endpoint(
      'stk-matchmaker16',    -- endpoint deployed by Autopilot
      max_rows_per_batch,    -- NULL (the default) leaves batch sizing to Aurora
      s_location, s_track, s_tracktheme, s_mode, s_difficulty,
      p_difficulty, p_location, p_track, p_tracktheme, p_mode, p_skill
   )::VARCHAR                -- keep the raw model output as text
$$ LANGUAGE SQL PARALLEL SAFE COST 5000;
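Returning VARCHAR pushes the interpretation of the model output to the caller. As an illustration only, the following Python sketch extracts the session length from either output shape. The exact text returned by each algorithm is an assumption here (a bare number, or a pair with the session length first), and the helper name session_length_from_estimate is hypothetical, so check the actual endpoint response before relying on this.

```python
import re

# Matches the first integer or decimal number in a string.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def session_length_from_estimate(estimate: str) -> float:
    """Return the first number found in the endpoint's text response.

    Assumes the session length is the first numeric token, whether the
    response is a bare value or a pair that also carries a second number.
    """
    match = _NUMBER.search(estimate)
    if match is None:
        raise ValueError(f"no numeric value in {estimate!r}")
    return float(match.group())


print(session_length_from_estimate("39"))        # bare value
print(session_length_from_estimate("39,0.87"))   # assumed pair layout
```

The client query shown later in this post takes a simpler route for XGBoost and casts the text directly to NUMERIC.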

Model evaluation: Query the model from Aurora using SQL

We evaluate the model before embedding the query in the game client application. The following query retrieves the matches and the actual session lengths that the bots played in the game simulation, and queries the model deployed by Autopilot by calling the estimate_session_length_xg function. The query results show that the model’s predictions correlate with the actual play time, so we’re ready to let the model match the game scene for us.

-- Compare the model estimate with the session length that was actually played.
SELECT estimate_session_length_xg(
          s_location, s_track, s_tracktheme, s_mode, s_difficulty,
          p_difficulty, p_location, p_track, p_tracktheme, p_mode, p_skill
       ) AS xg_predicted_session_length,
       EXTRACT(MINUTE FROM session_length) AS actual_session_length
FROM server_sessions
WHERE session_length IS NOT NULL
  AND updated_at > NOW() - '10 hour'::INTERVAL
  AND updated_at < NOW() - '5 hour'::INTERVAL
LIMIT 5;

 xg_predicted_session_length | actual_session_length
-----------------------------+-----------------------
                          39 |                    24
                          38 |                    29
                          12 |                     7
                          14 |                     7
                          12 |                     9
(5 rows)
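As a rough check of how well the predictions track the actual values, the following Python sketch computes the mean squared error and the Pearson correlation for the five sample rows above. Five rows are far too few to draw conclusions about the model; the sketch only shows how such a check can be computed.

```python
from math import sqrt

# The five sample rows from the query output above.
predicted = [39, 38, 12, 14, 12]
actual = [24, 29, 7, 7, 9]


def mean(values):
    return sum(values) / len(values)


def mse(pred, act):
    """Mean squared error between predictions and actual values."""
    return mean([(p - a) ** 2 for p, a in zip(pred, act)])


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


print(f"MSE: {mse(predicted, actual):.1f}")            # MSE: 77.8
print(f"Pearson r: {pearson(predicted, actual):.2f}")  # Pearson r: 0.97
```

In this sample, every prediction is higher than the actual value, yet the correlation is strong. Because the matchmaking query only picks the server with the highest estimate, the ordering of the estimates matters more than their absolute values.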

Next, we embed the matchmaking query in the client. The outer query returns the endpoint that is expected to yield the longest session, namely, the one with max(estimate_session_length_xg) across all estimates. The Aurora function estimate_session_length_xg queries the model endpoint stk-matchmaker16 with the server attributes (location, game track, game theme, game mode, and game difficulty) of the available servers (is_ready=1) and the player attributes (the variables prefixed with player_) to determine which server has the longest estimated session. See the following code:

# Ask Aurora for the ready server with the longest estimated session for this player.
endpoint=$(psql -A -q -t -w -c "/*start-client.sh-test-model*/
select endpoint
from (
  select endpoint,
         max(estimate_session_length_xg(
           location,track,tracktheme,mode,difficulty,
           '$player_difficulty','$player_location','$player_track','$player_theme_track','$player_mode','$player_skill')::NUMERIC) as estimate
  from servers
  where is_ready=1
    and max_players>num_active_session+$NETWORK_AI
    and created_at>NOW()-'24 hour'::INTERVAL
  group by endpoint
  order by estimate desc
  limit 1
) as t;")
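The selection logic of this query can be easier to reason about outside SQL. The following Python sketch reproduces the same steps over in-memory rows: filter ready servers that still have capacity, keep the highest estimate per endpoint, and return the top endpoint. The estimate callable stands in for estimate_session_length_xg, the sample servers and the toy scoring rule are invented, the created_at filter is omitted for brevity, and returning None to signal that the player should go to the lobby is an assumption about how a client might handle an empty result.

```python
from typing import Callable, Iterable, Optional


def pick_endpoint(servers: Iterable[dict],
                  estimate: Callable[[dict], float],
                  network_ai: int = 0) -> Optional[str]:
    """Mirror the query: WHERE, GROUP BY endpoint, ORDER BY estimate DESC LIMIT 1."""
    best_per_endpoint = {}
    for server in servers:
        # WHERE is_ready=1 AND max_players > num_active_session + NETWORK_AI
        if server["is_ready"] != 1:
            continue
        if server["max_players"] <= server["num_active_session"] + network_ai:
            continue
        score = estimate(server)
        endpoint = server["endpoint"]
        # GROUP BY endpoint, keeping max(estimate) per group
        if score > best_per_endpoint.get(endpoint, float("-inf")):
            best_per_endpoint[endpoint] = score
    if not best_per_endpoint:
        return None  # no match available (assumed lobby fallback)
    # ORDER BY estimate DESC LIMIT 1
    return max(best_per_endpoint, key=best_per_endpoint.get)


def toy_estimate(server: dict) -> float:
    """Hypothetical scoring rule used only to make the sketch runnable."""
    return 10.0 * server["difficulty"]


# Invented server rows.
servers = [
    {"endpoint": "a:2757", "is_ready": 1, "max_players": 8, "num_active_session": 3, "difficulty": 2},
    {"endpoint": "b:2757", "is_ready": 1, "max_players": 8, "num_active_session": 8, "difficulty": 5},
    {"endpoint": "c:2757", "is_ready": 0, "max_players": 8, "num_active_session": 0, "difficulty": 9},
    {"endpoint": "d:2757", "is_ready": 1, "max_players": 8, "num_active_session": 1, "difficulty": 3},
]
print(pick_endpoint(servers, toy_estimate, network_ai=2))
```

Server b is full and server c is not ready, so only a and d are scored, and d wins with the higher estimate.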

How Aurora ML helped

The chief metrics we use to assess model performance are stksrv-ml and stksrv-noml, the session lengths of matches made with and without the model. The following example shows that matches made by the model, denoted in blue (stksrv-ml), were 62% longer than those made with no ML, denoted in red (stksrv-noml). We also plot the simulated demand from game clients matched using model inference (stkcli-ml) and without it (stkcli-noml). The first figure (Average Session Length ml/noml) shows the session length of matches made by the model (ml) and without the model (noml). The second figure (Game demand simu ml/noml) illustrates that the load is equal between the two client types, ml and noml.

What’s next?

Keeping players in the game is a continuous process. Now that we have a model that helps keep players in the game, we want to watch the session length graph for changes that are driven by new game themes, karts, and scenes, which may require the model to be retrained (by exporting a fresh labeled dataset to Amazon S3). We then use Autopilot to train a revised model, and evaluate it in the same manner as we did with the first model.
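One simple way to turn watching the session length graph into a retraining signal is to compare a recent window against a baseline. The following sketch is an illustration under stated assumptions and is not part of the sample code: the 15% threshold, the function name needs_retraining, and the sample values are all hypothetical.

```python
from statistics import mean


def needs_retraining(baseline, recent, max_drop=0.15):
    """Return True when the recent average session length falls more than
    `max_drop` (as a fraction) below the baseline average."""
    baseline_avg = mean(baseline)
    if baseline_avg <= 0:
        raise ValueError("baseline average must be positive")
    drop = (baseline_avg - mean(recent)) / baseline_avg
    return drop > max_drop


# Invented session lengths in minutes.
baseline_minutes = [20, 22, 18]   # average 20
after_new_theme = [15, 16, 17]    # average 16, a 20% drop
steady_state = [19, 20, 18]       # average 19, a 5% drop

print(needs_retraining(baseline_minutes, after_new_theme))
print(needs_retraining(baseline_minutes, steady_state))
```

The session lengths could come from the same CloudWatch metrics used for the graphs, and the threshold would need tuning against the normal variation of your game.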

How to use the code sample

The system includes a Kubernetes cluster setup that runs the game server, player bots, and the load simulator Docker containers. It also contains the setup of an Aurora database for storing the server and session information. Execute the following steps from your favorite development environment:

  1. Clone the GitHub code sample:
    git clone https://github.com/aws-samples/amazon-aurora-call-to-amazon-sagemaker-sample.git
  2. Deploy the Kubernetes cluster with eksctl:
    eksctl create -f ./multiplayer-matchmaker/game-k8s-specs/eksctl-cluster.sh
  3. Set up your node (EC2 instances) cluster auto scaler using Karpenter. We used the following provisioner spec.
  4. Deploy the Aurora PostgreSQL cluster:
    cd ./multiplayer-matchmaker/aurora-pg-cdk
    npm install -g aws-cdk
    AWS_ACCOUNT_ID=`aws sts get-caller-identity --query Account --output text`
    AWS_REGION=us-west-2
    cdk bootstrap aws://$AWS_ACCOUNT_ID/$AWS_REGION
    cdk deploy
  5. Populate the database credentials for the game server and player bots. Pull the secrets from AWS Secrets Manager and populate game-k8s-specs/db-creds.secrets. For example, see the following code:
    cat db-creds.secrets
    PGHOST=myhost.rds.amazonaws.com
    PGUSER=myuser
    PGPASSWORD=mpass
    cd ./multiplayer-matchmaker/game-k8s-specs/
    ./db-creds-create.sh
  6. Deploy the game server AWS CDK to create the game server, game client, and the load simulator Docker images.
  7. Deploy the workload:
    kubectl apply -f game-k8s-specs/stk-server-match.yaml
    kubectl apply -f game-k8s-specs/stk-client-match.yaml
    kubectl apply -f game-k8s-specs/appsimulator.yaml
  8. Let the workload run for a few hours, and observe the session and server info populated in the database.
  9. Export the data to Amazon S3 and train the model using the steps described earlier.
  10. Deploy the client application that uses the ML function:
    kubectl apply -f game-k8s-specs/stk-client-ml-match.yaml
  11. Observe the Amazon CloudWatch metrics in the CloudWatch namespaces appsimulator and supertuxkart.

Clean up

To avoid incurring charges for the services used, please run the following command:

./amazon-aurora-call-to-amazon-sagemaker-sample/multiplayer-matchmaker/clean.sh
The script will remove the application in EKS, then remove the EKS cluster and finally delete the Aurora DB cluster.

Conclusion

In this post, we showed you a fun game matchmaking example that augments your transactional data with ML algorithms directly from your Aurora database using Aurora ML, without adding an extra application layer. Additionally, we demonstrated how to use Autopilot to train an optimized model to estimate matches between game scenes and players. We then maximized player retention with the SQL max() function and simplified the game match by removing the need to set static match rules that require modification as the game evolves.

Both the database and the ML training and hosting solutions are serverless, so you only pay for what your players use, and there’s no infrastructure to manage. To emphasize the effectiveness and simplicity of Aurora ML and Autopilot, we instrumented the duration of games with and without machine learning using CloudWatch.

We’re eager to hear from you about similar challenges you experienced with your Aurora database. Leave a comment in the comments section or create an issue in the multiplayer matchmaking simulator code sample.


About the authors

Yahav Biran is a Principal Solutions Architect in AWS, focused on game tech at scale. Yahav enjoys contributing to open-source projects and publishes in the AWS blog and academic journals. He currently contributes to the K8s Helm community, AWS databases and compute blogs, and Journal of Systems Engineering. He delivers technical presentations at technology events and works with customers to design their applications in the cloud. He received his PhD (Systems Engineering) from Colorado State University.

Mani Khanuja is an Artificial Intelligence and Machine Learning Specialist SA at Amazon Web Services (AWS). She helps customers use machine learning to solve their business challenges with AWS. She spends most of her time diving deep and teaching customers on AI/ML projects related to computer vision, natural language processing, forecasting, ML at the edge, and more. She is passionate about ML at the edge. She has created her own lab with a self-driving kit and prototype manufacturing production line, where she spends a lot of her free time.

Yuval Dovrat leads a team of Solutions Architects at AWS, focusing on the enterprise gaming segment. Prior to that, Yuval led the AMER Containers and Serverless Specialist SA team for AWS. Before joining AWS, Yuval led the Solutions Architecture org at Spot.IO, and managed DevOps teams in various AdTech companies in the NYC metro area. Besides his love of Kubernetes and old-school video games, Yuval also enjoys playing the bass guitar and listening to punk music.