AWS Big Data Blog
Category: Learning Levels
Amazon OpenSearch Service Under the Hood: Multi-AZ with Standby
Amazon OpenSearch Service recently announced Multi-AZ with Standby, a new deployment option for managed clusters that enables 99.99% availability and consistent performance for business-critical workloads. With Multi-AZ with Standby, clusters are resilient to infrastructure failures like hardware or networking failure. This option provides improved reliability and the added benefit of simplifying cluster configuration and management […]
Perform secure database write-backs with Amazon QuickSight
Amazon QuickSight is a scalable, serverless, machine learning (ML)-powered business intelligence (BI) solution that makes it easy to connect to your data, create interactive dashboards, get access to ML-enabled insights, and share visuals and dashboards with tens of thousands of internal and external users, either within QuickSight itself or embedded into any application. A write-back […]
Ten new visual transforms in AWS Glue Studio
AWS Glue Studio is a graphical interface that makes it easy to create, run, and monitor extract, transform, and load (ETL) jobs in AWS Glue. It allows you to visually compose data transformation workflows using nodes that represent different data handling steps, which later are converted automatically into code to run. AWS Glue Studio recently […]
Use SAML Identities for programmatic access to Amazon OpenSearch Service
Customers of Amazon OpenSearch Service can already use Security Assertion Markup Language (SAML) to access OpenSearch Dashboards. This post outlines two methods by which programmatic users can now access OpenSearch using SAML identities. This applies to all identity providers (IdPs) that support SAML 2.0, including prevalent ones like Active Directory Federation Service (ADFS), Okta, AWS […]
Scale your AWS Glue for Apache Spark jobs with new larger worker types G.4X and G.8X
Hundreds of thousands of customers use AWS Glue, a serverless data integration service, to discover, prepare, and combine data for analytics, machine learning (ML), and application development. AWS Glue for Apache Spark jobs work with your code and configuration of the number of data processing units (DPU). Each DPU provides 4 vCPU, 16 GB memory, […]
New scatter plot options in Amazon QuickSight to visualize your data
Are you looking to understand the relationships between two numerical variables? Scatter plots are a powerful visual type that allow you to identify patterns, outliers, and strength of relationships between variables. In this post, we walk you through the newly launched scatter plot features in Amazon QuickSight, which will help you take your correlation analysis […]
How Encored Technologies built serverless event-driven data pipelines with AWS
This post is a guest post co-written with SeonJeong Lee, JaeRyun Yim, and HyeonSeok Yang from Encored Technologies. Encored Technologies (Encored) is an energy IT company in Korea that helps their customers generate higher revenue and reduce operational costs in renewable energy industries by providing various AI-based solutions. Encored develops machine learning (ML) applications predicting […]
Build efficient, cross-Regional, I/O-intensive workloads with Dask on AWS
Welcome to the era of data. The sheer volume of data captured daily continues to grow, calling for platforms and solutions to evolve. Services such as Amazon Simple Storage Service (Amazon S3) offer a scalable solution that adapts yet remains cost-effective for growing datasets. The Amazon Sustainability Data Initiative (ASDI) uses the capabilities of Amazon […]
Improve reliability and reduce costs of your Apache Spark workloads with vertical autoscaling on Amazon EMR on EKS
Amazon EMR on Amazon EKS is a deployment option offered by Amazon EMR that enables you to run Apache Spark applications on Amazon Elastic Kubernetes Service (Amazon EKS) in a cost-effective manner. It uses the EMR runtime for Apache Spark to increase performance so that your jobs run faster and cost less. Apache Spark allows […]
Process price transparency data using AWS Glue
The Transparency in Coverage rule is a federal regulation in the United States that was finalized by the Center for Medicare and Medicaid Services (CMS) in October 2020. The rule requires health insurers to provide clear and concise information to consumers about their health plan benefits, including costs and coverage details. Under the rule, health […]