AWS Partner Network (APN) Blog
Tag: Amazon EMR
How to Build and Deploy Amazon SageMaker Models in Dataiku Collaboratively
Organizations often need business analysts and citizen data scientists to work with data scientists to create machine learning (ML) models, but they struggle to provide a common ground for collaboration. Newly enriched Dataiku Data Science Studio (DSS) and Amazon SageMaker capabilities answer this need, empowering a broader set of users by leveraging the managed infrastructure of Amazon SageMaker and combining it with Dataiku’s visual interface to develop models at scale.
How Thundra Decreased Data Processing Pipeline Delay By 3x on Average and 6x on P99
It can be complicated to maintain a robust, scalable, and reliable monitoring system that ingests terabytes of data under heavy traffic. Learn how Thundra has delivered 99.9 percent availability to customers since incorporating AWS services into its product. Thundra’s platform can handle scalability and availability challenges, recover both from partial failures and major outages, and support point-in-time recovery in case of disaster.
WANdisco Accelerates GoDaddy’s Hadoop Cloud Migration to AWS Without Business Interruption
Advances in migration technology enable you to migrate data from actively used Hadoop environments to the cloud at scale. You can benefit from AWS managed services and pace of innovation, and improve costs for your largest and most complex analytic workloads. Learn how GoDaddy migrated data from their 800-node, 2.5 PB production Apache Hadoop cluster to Amazon S3 using WANdisco’s LiveData Migrator product.
Rapid Data Lake Development with Data Lake as Code Using AWS CloudFormation
Data lakes have evolved into the single storage platform for all managed enterprise data. On AWS, an integrated set of services is available to engineer and automate data lakes. A data lake on AWS can bring together relational and non-relational data and allow you to query results faster and at a lower cost. Learn how nClouds used code automation via AWS CloudFormation to create a dynamic data lake stack to visualize and analyze financial market data.
Maintaining Control of PII Hosted on AWS with Hold Your Own Key (HYOK) Security
One of the biggest challenges in moving to the cloud for organizations that collect and process personally identifiable information (PII) is the fundamental change to the trust model. SecuPi minimizes changes to the trust model and reduces the risk associated with digital transformations. Learn how SecuPi can help you collect and process sensitive or regulated PII and reduce barriers to cloud adoption while satisfying the trust model requirements of even the most conservative and risk-averse companies.
How nClouds Helps Accelerate Data Delivery with Apache Hudi on Amazon EMR
Apache Hudi on Amazon EMR is an ideal solution for large-scale and near real-time applications that require incremental data pipelines and processing. This post provides a step-by-step method to perform a proof of concept (PoC) for Apache Hudi on Amazon EMR. Learn how nClouds used a non-customer-facing PoC solution to set up a new data and analytics platform using Apache Hudi on Amazon EMR and other managed services, including Amazon QuickSight for data visualization.
How SnapLogic eXtreme Helps Visualize Spark ETL Pipelines on Amazon EMR
Fully managed cloud services enable global enterprises to focus on strategic differentiators rather than maintaining infrastructure. They do this by creating data lakes and performing big data processing in the cloud. SnapLogic eXtreme allows citizen integrators (those who can’t code) and data integrators to efficiently support and augment data-integration use cases by performing complex transformations on large volumes of data. Learn how to set up SnapLogic eXtreme and use Amazon EMR to do Amazon Redshift ETL.
Implementing SAML AuthN for Amazon EMR Using Okta and Column-Level AuthZ with AWS Lake Formation
As organizations continue to build data lakes on AWS and adopt Amazon EMR, especially when consuming data at enterprise scale, it’s critical to govern those data lakes by establishing federated access and fine-grained controls over who can access the data. Learn how to implement SAML-based authentication (AuthN) using Okta for Amazon EMR, query data using Zeppelin notebooks, and apply column-level authorization (AuthZ) using AWS Lake Formation.
Say Hello to 87 New AWS Competency, Service Delivery, Service Ready, and MSP Partners Added in July
We are excited to highlight 87 APN Partners that received new designations in July for our global AWS Competency, AWS Managed Service Provider (MSP), AWS Service Delivery, and AWS Service Ready programs. These designations span workload, solution, and industry, and help AWS customers identify top APN Partners that can deliver on core business objectives. APN Partners are focused on your success, helping customers take full advantage of the business benefits AWS has to offer.
New-Look AWS Service Delivery Validation Checklists for APN Consulting Partners
To receive the AWS Service Delivery designation, organizations must undergo rigorous technical validation. They are also assessed on the security, performance, and reliability of their AWS solutions. To help APN Consulting Partners better understand this process and our validation requirements, we are releasing new versions of the AWS Service Delivery Validation Checklists. These outline the customer case study and technical criteria needed to achieve the AWS Service Delivery designation.