This Guidance demonstrates an end-to-end, near real-time anti-fraud system based on deep learning graph neural networks. This blueprint architecture uses Deep Graph Library (DGL) to construct a heterogeneous graph from tabular data and train a Graph Neural Network (GNN) model to detect fraudulent transactions.
Architecture Diagram
Near Real-Time Fraud Detection
Step 1
Use Amazon API Gateway to host HTTP APIs for the near real-time fraud detection services.
Step 2
Use AWS Lambda functions as the HTTP API backend. The functions process new transactions as graph data, then store them in a graph database such as Amazon Neptune.
Query the sub-graph of the requested transactions from Neptune.
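Step 3 might be implemented as a Gremlin traversal against Neptune. The sketch below only builds the query string; the vertex label `transaction` and property key `id` are assumptions and should match whatever labels your ETL job actually wrote to the graph.

```python
# Sketch of Step 3: build a Gremlin traversal that fetches the k-hop
# neighborhood (sub-graph) of a transaction vertex stored in Neptune.
# Label "transaction" and property "id" are assumed names, not fixed by
# this Guidance.

def subgraph_query(transaction_id: str, hops: int = 2) -> str:
    """Return a Gremlin query string for the k-hop sub-graph of a transaction."""
    return (
        f"g.V().has('transaction', 'id', '{transaction_id}')"
        f".repeat(both().simplePath()).times({hops})"
        ".path().unfold().dedup()"
    )
```

In the Lambda backend this string would typically be submitted to the Neptune Gremlin endpoint (for example with the gremlin_python client), and the returned vertices and edges become the model input.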
Step 4
Use an Amazon SageMaker endpoint to predict the probability that transactions are fraudulent with pre-trained GNN models.
Step 5
Send the predicted results to Amazon Simple Queue Service (Amazon SQS) to be consumed by business analysis systems.
Step 6
Use Lambda functions to poll the predicted results from Amazon SQS, then store them in Amazon DocumentDB.
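Step 6 can be sketched as a Lambda handler on an SQS trigger. The message fields, collection name, and the 0.5 threshold below are assumptions for illustration; the DocumentDB write is shown only as a comment because it needs a live cluster.

```python
import json

# Sketch of Step 6: consume prediction results from an SQS-triggered Lambda
# event and shape them as DocumentDB documents. Field names and the fraud
# threshold are assumptions, not defined by this Guidance.

def record_to_document(sqs_record: dict) -> dict:
    """Convert one SQS record body into a DocumentDB document."""
    body = json.loads(sqs_record["body"])
    return {
        "transactionId": body["transaction_id"],
        "fraudProbability": body["fraud_probability"],
        "isFraud": body["fraud_probability"] >= 0.5,  # example threshold
    }

def handler(event: dict, context=None) -> int:
    docs = [record_to_document(r) for r in event["Records"]]
    # Illustrative only: write with pymongo to the DocumentDB cluster, e.g.
    # pymongo.MongoClient(docdb_uri, tls=True)["fraud"]["predictions"].insert_many(docs)
    return len(docs)
```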
Step 7
Business analysts access the business dashboard, which uses Amazon CloudFront and Amazon Simple Storage Service (Amazon S3) to host a static website, with AWS AppSync and Lambda as the backend.
Step 8
Use Lambda functions as AWS AppSync resolvers to fetch the data from Amazon DocumentDB.
Step 9
CloudFront uses origin access identity (OAI) to securely access the static web files on Amazon S3.
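Steps 4 and 5 of the real-time flow can be sketched together: serialize the sub-graph, invoke the SageMaker endpoint, and forward the prediction to SQS. The endpoint name, queue URL, and payload shape below are placeholders; the actual payload depends on your inference code.

```python
import json

# Sketch of Steps 4-5: score a transaction sub-graph on a SageMaker
# endpoint and forward the prediction to Amazon SQS. All resource names
# are placeholders.

def build_inference_payload(transaction_id: str, subgraph: dict) -> str:
    """Serialize the sub-graph as the JSON body for the SageMaker endpoint."""
    return json.dumps({"transaction_id": transaction_id, "graph": subgraph})

def build_sqs_message(transaction_id: str, fraud_probability: float) -> str:
    """Message body consumed downstream by the business analysis system."""
    return json.dumps({
        "transaction_id": transaction_id,
        "fraud_probability": round(fraud_probability, 6),
    })

def score_transaction(transaction_id: str, subgraph: dict) -> None:
    # Illustrative only -- requires AWS credentials and deployed resources.
    import boto3
    runtime = boto3.client("sagemaker-runtime")
    resp = runtime.invoke_endpoint(
        EndpointName="fraud-detection-gnn",  # placeholder endpoint name
        ContentType="application/json",
        Body=build_inference_payload(transaction_id, subgraph),
    )
    prob = json.loads(resp["Body"].read())["probability"]  # assumed response shape
    boto3.client("sqs").send_message(
        QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/fraud-results",
        MessageBody=build_sqs_message(transaction_id, prob),
    )
```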
Offline Model Training
Step 1
System operators or a periodic system task initiates the model training workflow.
Step 2
Use a Lambda function to ingest the raw dataset into Amazon S3.
Step 3
Use an AWS Glue crawler to crawl the raw dataset and populate the Data Catalog.
Step 4
Use AWS Glue extract, transform, load (ETL) job to transform the tabular dataset to a heterogeneous graph dataset, then save it to Amazon S3.
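The core of Step 4 is turning tabular rows into typed node and edge lists. The plain-Python sketch below shows the idea; column names such as `card_id` and `device_id` are assumptions, and a real AWS Glue ETL job would do this with Spark DataFrames and write the result back to Amazon S3.

```python
# Sketch of Step 4: transform tabular transaction records into a
# heterogeneous graph representation (node sets per type, edge lists per
# relation). Column and relation names are illustrative assumptions.

def to_hetero_graph(rows):
    """rows: iterable of dicts with transaction_id, card_id, device_id keys."""
    nodes = {"transaction": set(), "card": set(), "device": set()}
    edges = {
        ("transaction", "paid_with", "card"): [],
        ("transaction", "used", "device"): [],
    }
    for row in rows:
        t, c, d = row["transaction_id"], row["card_id"], row["device_id"]
        nodes["transaction"].add(t)
        nodes["card"].add(c)
        nodes["device"].add(d)
        edges[("transaction", "paid_with", "card")].append((t, c))
        edges[("transaction", "used", "device")].append((t, d))
    return nodes, edges
```

Sharing a card or device node between two transactions is what lets the GNN propagate fraud signals between otherwise unrelated rows.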
Step 5
Use the SageMaker training job to train the Graph Neural Network (GNN)-based fraud detection model with Deep Graph Library (DGL).
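Step 5 could be launched with a `create_training_job` request. The sketch below only builds the parameter dictionary; the job name, role ARN, container image, S3 URIs, instance type, and hyperparameters are all placeholders to substitute with your own account's values.

```python
# Sketch of Step 5: parameters for a SageMaker training job that runs a
# DGL-based GNN training script. All names, URIs, and hyperparameters are
# placeholder assumptions.

def training_job_params(job_name: str, role_arn: str, image_uri: str) -> dict:
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,   # container with DGL and the training code
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://<bucket>/graph-dataset/",  # placeholder
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": "s3://<bucket>/model-artifacts/"},
        "ResourceConfig": {"InstanceType": "ml.g4dn.xlarge",
                           "InstanceCount": 1, "VolumeSizeInGB": 50},
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        "HyperParameters": {"n-epochs": "100", "n-layers": "2"},  # examples
    }
    # Submit with: boto3.client("sagemaker").create_training_job(**params)
```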
Step 6
Use AWS Fargate with Amazon Elastic Container Service (Amazon ECS) to load the graph dataset from Amazon S3 into Neptune, a fully managed graph database service.
Step 7
Use Lambda to package the GNN model and custom code as a SageMaker model.
Step 8
Create a SageMaker endpoint configuration.
Step 9
Create or update an endpoint using the endpoint configuration in SageMaker.
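Steps 7 through 9 map to three SageMaker API calls. The sketch below builds the endpoint configuration as a pure dictionary and wraps the boto3 calls in a function, since they require credentials and a trained model artifact; all names, ARNs, and the instance type are placeholders.

```python
# Sketch of Steps 7-9: register the trained model, create an endpoint
# configuration, and create (or update) the serving endpoint. All resource
# names and ARNs are placeholder assumptions.

def endpoint_config_params(config_name: str, model_name: str) -> dict:
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.xlarge",  # example instance type
        }],
    }

def deploy(model_name: str, model_data_url: str, image_uri: str, role_arn: str):
    # Illustrative only -- requires AWS credentials and a model artifact in S3.
    import boto3
    sm = boto3.client("sagemaker")
    sm.create_model(
        ModelName=model_name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    config_name = f"{model_name}-config"
    sm.create_endpoint_config(**endpoint_config_params(config_name, model_name))
    endpoint = "fraud-detection-endpoint"  # placeholder endpoint name
    if sm.list_endpoints(NameContains=endpoint)["Endpoints"]:
        sm.update_endpoint(EndpointName=endpoint, EndpointConfigName=config_name)
    else:
        sm.create_endpoint(EndpointName=endpoint, EndpointConfigName=config_name)
```

Updating an existing endpoint with a new configuration rolls out the new model without dropping in-flight inference traffic.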
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This Guidance uses AWS serverless services such as AWS Glue, SageMaker, AWS Fargate, and Lambda as compute resources for processing data, training models, and serving the API, keeping billing on pay-as-you-go pricing. One of the data stores is built on Amazon S3, providing a low total cost of ownership for storing and retrieving data. The business dashboard uses CloudFront, Amazon S3, AWS AppSync, and Lambda to implement the web application.
Security
API Gateway and Lambda provide a protection layer when invoking Lambda functions through an outbound API. All the proposed services support integration with AWS Identity and Access Management (IAM), which can be used to control access to resources and data. All traffic between services in the VPC is controlled by security groups.
Reliability
API Gateway, Lambda, AWS Step Functions, AWS Glue, Amazon S3, Neptune, Amazon DocumentDB, and AWS AppSync provide high availability within a Region. Customers can deploy SageMaker endpoints in a highly available manner.
Performance Efficiency
All the services used in the design provide Amazon CloudWatch metrics that can be used to monitor individual components of the design. MLOps pipelines orchestrated by Step Functions help to continuously iterate the model. API Gateway and Lambda allow publishing of new versions through an automated pipeline.
Cost Optimization
This Guidance requires GNN model training for fraud detection. The performance requirements for batch processing range from minutes to hours; AWS Glue and SageMaker training jobs are designed to meet them. Neptune is a purpose-built, high-performance graph database engine. Neptune efficiently stores and navigates graph data, and uses a scale-up, in-memory optimized architecture for fast query evaluation over large graphs. Provisioned concurrency in Lambda and the HTTP API in API Gateway can support a latency requirement of less than 10 ms.
Sustainability
This Guidance uses the scaling behaviors of Lambda and API Gateway to reduce over-provisioning of resources. It uses managed AWS services to maximize resource utilization and to reduce the amount of energy needed to run a given workload.
Implementation Resources
A detailed guide is provided for you to experiment with and use within your AWS account. It examines each stage of the Guidance, including deployment, usage, and cleanup.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
Build a GNN-based real-time fraud detection solution using Amazon SageMaker, Amazon Neptune, and the Deep Graph Library
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.