This Guidance shows how payment service providers can implement a near real-time fraud screening system on AWS using streaming data. Transactions are scored for risk using machine learning (ML) models, and notifications are sent to customers based on the risk level of each transaction.
Architecture Diagram
Step 1
Large amounts of customer data stored in on-premises databases and file systems, along with long-term historical data on mainframes, are moved into Amazon Simple Storage Service (Amazon S3) using data transfer services such as Amazon EMR, AWS Database Migration Service (AWS DMS), AWS DataSync, and Amazon Kinesis Data Streams.
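As a minimal illustration of one of these transfer paths, the following sketch starts a preconfigured AWS DMS replication task with the AWS SDK for Python (Boto3). The task ARN is a placeholder; the replication task, endpoints, and replication instance are assumed to already exist.

```python
# Sketch: start an existing AWS DMS replication task (placeholder ARN).
import boto3

dms = boto3.client("dms", region_name="us-east-1")

response = dms.start_replication_task(
    ReplicationTaskArn="arn:aws:dms:us-east-1:123456789012:task:EXAMPLETASK",
    StartReplicationTaskType="start-replication",
)
print(response["ReplicationTask"]["Status"])
```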
Step 2
Configure AWS Glue to run your extract, transform, load (ETL) jobs as soon as new data becomes available in Amazon S3.
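One common way to wire this up is an S3 event notification that invokes a Lambda function, which in turn starts the Glue job. The sketch below assumes this pattern; the job name "fraud-etl-job" and argument names are placeholders.

```python
# Sketch: a Lambda function subscribed to S3 ObjectCreated events that
# starts a Glue ETL job for each new object (placeholder job name).
import boto3

glue = boto3.client("glue")

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Pass the new object's location to the job as run arguments.
        run = glue.start_job_run(
            JobName="fraud-etl-job",
            Arguments={"--source_bucket": bucket, "--source_key": key},
        )
        print(f"Started Glue job run {run['JobRunId']} for s3://{bucket}/{key}")
```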
Step 3
Use Amazon Athena to analyze the prepared data directly in Amazon S3 using standard SQL.
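For example, an Athena query can be submitted programmatically with Boto3. The database name, table, columns, and results bucket below are placeholders for illustration only.

```python
# Sketch: run an Athena query over transaction data in S3
# (placeholder database, table, and output location).
import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="""
        SELECT account_id, COUNT(*) AS txn_count
        FROM transactions
        WHERE txn_date = current_date
        GROUP BY account_id
        ORDER BY txn_count DESC
        LIMIT 10
    """,
    QueryExecutionContext={"Database": "fraud_analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print(response["QueryExecutionId"])
```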
Step 4
Near real-time transactions are sent to Amazon Kinesis Data Streams. AWS Lambda integrates natively with Amazon Kinesis as a consumer to process data ingested through a data stream.
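A Lambda function consuming a Kinesis data stream receives batches of base64-encoded records. The sketch below assumes each record carries a JSON transaction payload with a "transaction_id" field; the downstream scoring call is left as a placeholder.

```python
# Sketch: a Lambda consumer for a Kinesis data stream. Kinesis delivers
# record data base64-encoded; each record is assumed to be JSON.
import base64
import json

def handler(event, context):
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        transaction = json.loads(payload)
        # Forward the transaction for fraud scoring (placeholder).
        print(f"Processing transaction {transaction.get('transaction_id')}")
```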
Step 5
Multiple Lambda functions are invoked from a single Amazon API Gateway for different kinds of inference.
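Each function can use the API Gateway Lambda proxy integration, with one route per inference type. The handler below is a sketch of one such function; the routes and the returned score are placeholders.

```python
# Sketch: one of several inference Lambda functions behind a single
# API Gateway REST API, using the Lambda proxy event/response format.
import json

def handler(event, context):
    body = json.loads(event.get("body") or "{}")
    # Each route (e.g. /score/anomaly, /score/fraud-ring) maps to its
    # own function; this one returns a placeholder anomaly score.
    score = 0.42  # placeholder inference result
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"path": event.get("path"), "score": score}),
    }
```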
Step 6
Different machine learning (ML) models, trained on the dataset in an Amazon SageMaker notebook instance, are deployed to SageMaker endpoints that return a prediction score for each inference request.
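To show how an inference function might call one of these endpoints, the following sketch uses the SageMaker runtime API. The endpoint name, feature layout, and CSV serialization are placeholder assumptions; the actual format depends on how the model was trained and deployed.

```python
# Sketch: invoke a deployed SageMaker endpoint for a fraud score
# (placeholder endpoint name and feature vector).
import boto3

runtime = boto3.client("sagemaker-runtime")

features = [120.50, 3, 1, 0.87]  # placeholder transaction features

response = runtime.invoke_endpoint(
    EndpointName="fraud-detection-endpoint",
    ContentType="text/csv",
    Body=",".join(str(f) for f in features),
)
score = float(response["Body"].read())
print(f"Fraud score: {score}")
```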
Step 7
Fraud-ring and profile analytics queried in near real time through Amazon Athena are persisted in Amazon DynamoDB.
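The sketch below persists one such analytics record for low-latency lookup. The table name, key schema, and attributes are placeholders; numeric analytics values are stored as strings or integers to avoid DynamoDB's restrictions on Python floats.

```python
# Sketch: persist Athena-derived analytics in DynamoDB
# (placeholder table and attribute names).
import boto3

table = boto3.resource("dynamodb").Table("fraud-ring-analytics")

table.put_item(
    Item={
        "account_id": "acct-0001",        # partition key (placeholder)
        "window_start": "2024-01-01T00:00:00Z",
        "txn_count": 42,
        "ring_score": "0.91",             # string avoids float-type errors
    }
)
```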
Step 8
The final aggregated score is calculated from the individual inferences, and in the event of fraud, a notification is sent to the end user through Amazon Pinpoint.
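For example, a fraud alert could be sent as an SMS message through the Pinpoint API. The application ID, phone number, and message text below are placeholders.

```python
# Sketch: send a fraud alert SMS through Amazon Pinpoint
# (placeholder application ID and destination number).
import boto3

pinpoint = boto3.client("pinpoint")

pinpoint.send_messages(
    ApplicationId="exampleAppId1234567890",
    MessageRequest={
        "Addresses": {"+12065550100": {"ChannelType": "SMS"}},
        "MessageConfiguration": {
            "SMSMessage": {
                "Body": "We detected a suspicious transaction on your "
                        "account. Reply HELP for assistance.",
                "MessageType": "TRANSACTIONAL",
            }
        },
    },
)
```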
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This Guidance shows how fully managed services such as AWS DataSync, Amazon EMR, and Kinesis allow you to break free from the complexities of database and data warehouse administration.
You can send logs directly from your application to CloudWatch using the CloudWatch Logs API, or send events using an AWS SDK and Amazon EventBridge.
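As one illustration of the EventBridge path, the sketch below emits a custom application event with Boto3. The event source, detail-type, and payload fields are placeholders.

```python
# Sketch: publish a custom event to Amazon EventBridge
# (placeholder source, detail-type, and payload).
import json
import boto3

events = boto3.client("events")

events.put_events(
    Entries=[
        {
            "Source": "fraud.screening",
            "DetailType": "TransactionScored",
            "Detail": json.dumps({"transaction_id": "txn-0001", "score": 0.91}),
            "EventBusName": "default",
        }
    ]
)
```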
Security
Raw data is ingested into Amazon S3. Amazon S3 supports both server-side encryption and client-side encryption for data uploads.
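For example, an upload can request server-side encryption with an AWS KMS key. The bucket name, object key, and KMS key alias below are placeholders.

```python
# Sketch: upload raw data to S3 with SSE-KMS server-side encryption
# (placeholder bucket, key, and KMS key alias).
import boto3

s3 = boto3.client("s3")

s3.put_object(
    Bucket="raw-transaction-data",
    Key="ingest/2024/01/01/transactions.json",
    Body=b'{"transaction_id": "txn-0001", "amount": 120.5}',
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/fraud-data-key",
)
```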
You can encrypt metadata objects in your AWS Glue Data Catalog in addition to the data written to Amazon S3 and Amazon CloudWatch Logs by jobs, crawlers, and development endpoints.
Reliability
The solution is modular and scales with transaction volume. Serverless capabilities such as Kinesis and Lambda automatically scale throughput up or down based on demand.
Performance Efficiency
Serverless architectures help to provision the exact resources that the workload needs. Lambda manages scaling automatically. You can optimize the individual Lambda functions used in your application to reduce latency and increase throughput.
Cost Optimization
This Guidance is designed to be fully optimized for cost, only using resources where necessary and only accessing data using the services appropriate for the business need.
Align all costs with your defined pricing goals and clearly defined KPIs, weighing batch against near real-time processing requirements to ensure optimum value.
Sustainability
By extensively using managed services and dynamic scaling, you minimize the environmental impact of the backend services.
Monitor the technologies that support your data access and storage patterns to ensure data is stored in the optimal solution for its read and write access patterns, and keep the scaling of compute resources closely aligned with demand.
Implementation Resources
A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of the Guidance, including deployment, usage, and cleanup.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
[Title]
Disclaimer
The sample code, software libraries, command line tools, proofs of concept, templates, or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.