AWS Big Data Blog
Category: Learning Levels
Run queries concurrently and see query history using Amazon Redshift Query Editor v2
Amazon Redshift is a fast, fully managed, petabyte-scale cloud data warehouse. You have the flexibility to choose from provisioned and serverless compute modes. You can start loading and querying large datasets conveniently in Amazon Redshift using Amazon Redshift Query Editor v2, a web-based SQL client application. Query Editor v2 empowers your technical and business teams […]
Lower your Amazon OpenSearch Service storage cost with gp3 Amazon EBS volumes
Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. OpenSearch is an open-source, distributed search and analytics suite comprising OpenSearch, a distributed search and analytics engine, and OpenSearch Dashboards, a UI and visualization tool. When you use Amazon OpenSearch Service, you configure a set […]
How Etleap and Amazon Redshift Serverless optimize costs for ETL
Amazon Redshift Serverless lets you avoid managing infrastructure while only paying for what you use. Etleap provides data integration software that is natively built on AWS. It’s an AWS Advanced Technology Partner with the AWS Data & Analytics Competency and Amazon Redshift Service Ready designation. In this post, we share how you can minimize the […]
Automate your Amazon QuickSight deployment with the new API-based account creation and deletion
Amazon QuickSight is a fully managed, cloud-native business intelligence (BI) service that makes it easy to connect to your data, create interactive dashboards, and share these with tens of thousands of users, either within the QuickSight interface, or embedded in software as a service (SaaS) applications or web portals. We’re excited to announce the availability […]
How GoDaddy built a data mesh to decentralize data ownership using AWS Lake Formation
This is a guest post co-written with Ankit Jhalaria from GoDaddy. GoDaddy is empowering everyday entrepreneurs by providing all the help and tools to succeed online. With more than 20 million customers worldwide, GoDaddy is the place people come to name their idea, build a professional website, attract customers, and manage their work. GoDaddy is […]
Get started with data integration from Amazon S3 to Amazon Redshift using AWS Glue interactive sessions
Organizations are placing a high priority on data integration, especially to support analytics, machine learning (ML), business intelligence (BI), and application development initiatives. Data is growing exponentially and is generated by increasingly diverse data sources. Data integration becomes challenging when processing data at scale and the inherent heavy lifting associated with infrastructure required to manage […]
Announcing AWS Glue crawler support for Snowflake
For data lake customers who need to discover petabytes of data, AWS Glue crawlers are a popular way to scan data in the background, so you can focus on using the data to make better, more intelligent decisions. You may also have data in data warehouses such as Snowflake and want the ability to discover the […]
Use Karpenter to speed up Amazon EMR on EKS autoscaling
Amazon EMR on Amazon EKS is a deployment option for Amazon EMR that allows organizations to run Apache Spark on Amazon Elastic Kubernetes Service (Amazon EKS). With EMR on EKS, the Spark jobs run on the Amazon EMR runtime for Apache Spark. This increases the performance of your Spark jobs so that they run faster […]
Use an event-driven architecture to build a data mesh on AWS
In this post, we take the data mesh design discussed in “Design a data mesh architecture using AWS Lake Formation and AWS Glue” and demonstrate how to initialize data domain accounts to enable managed sharing; we also go through how we can use an event-driven approach to automate processes between the central governance account and […]
How Hudl built a cost-optimized AWS Glue pipeline with Apache Hudi datasets
This is a guest blog post co-written with Addison Higley and Ramzi Yassine from Hudl. Hudl Agile Sports Technologies, Inc. is a Lincoln, Nebraska-based company that provides tools for coaches and athletes to review game footage and improve individual and team play. Its initial product line served college and professional American football teams. Today, […]