AWS Architecture Blog
Category: AWS Glue
Top Architecture Blog Posts of 2023
2023 was a rollercoaster year in tech, and we at the AWS Architecture Blog feel so fortunate to have shared in the excitement. As we move into 2024 and all of the new technologies we could see, we want to take a moment to highlight the brightest stars from 2023. As always, thanks to our […]
Use a reusable ETL framework in your AWS lake house architecture
Data lakes and lake house architectures have become an integral part of a data platform for any organization. However, you may face multiple challenges while developing a lake house platform and integrating with various source systems. In this blog, we will address these challenges and show how our framework can help mitigate these issues. Lake […]
Reduce archive cost with serverless data archiving
For regulatory reasons, decommissioning core business systems in financial services and insurance (FSI) markets requires data to remain accessible years after the application is retired. Traditionally, FSI companies either outsourced data archiving to third-party service providers, which maintained application replicas, or purchased vendor software to query and visualize archival data. In this blog post, we […]
Managing data confidentiality for Scope 3 emissions using AWS Clean Rooms
Scope 3 emissions are indirect greenhouse gas emissions that are a result of a company’s activities, but occur outside the company’s direct control or ownership. Measuring these emissions requires collecting data from a wide range of external sources, like raw material suppliers, transportation providers, and other third parties. One of the main challenges with Scope […]
Text analytics on AWS: implementing a data lake architecture with OpenSearch
Text data is a common type of unstructured data found in analytics. It is often stored without a predefined format and can be hard to obtain and process. For example, web pages contain text data that data analysts collect through web scraping and pre-process using lowercasing, stemming, and lemmatization. After pre-processing, the cleaned text is […]
Optimize your modern data architecture for sustainability: Part 2 – unified data governance, data movement, and purpose-built analytics
In the first part of this blog series, Optimize your modern data architecture for sustainability: Part 1 – data ingestion and data lake, we focused on two pillars of the modern data architecture: 1) data ingestion and 2) data lake. In this blog post, we will provide guidance and best practices to optimize the components […]
Optimize your modern data architecture for sustainability: Part 1 – data ingestion and data lake
The modern data architecture on AWS focuses on integrating a data lake and purpose-built data services to efficiently build analytics workloads, which provide speed and agility at scale. Using the right service for the right purpose not only provides performance gains, but facilitates the right utilization of resources. Review Modern Data Analytics Reference Architecture on […]
Let’s Architect! Modern data architectures
With the rapid growth in data coming from data platforms and applications, and the continuous improvements in state-of-the-art machine learning algorithms, data are becoming key assets for companies. Modern data architectures include data mesh—a recent style that represents a paradigm shift, in which data is treated as a product and data architectures are designed around […]
Data warehouse and business intelligence technology consolidation using AWS
Organizations have been using data warehouse and business intelligence (DWBI) workloads to support business decision making for many years. These workloads are brought to the Amazon Web Services (AWS) platform to take advantage of the AWS Cloud. However, these workloads are built using multiple vendor tools and technologies, and the customer faces the burden of […]
Insights for CTOs: Part 3 – Growing your business with modern data capabilities
This post was co-written with Jonathan Hwang, head of Foundation Data Analytics at Zendesk. In my role as a Senior Solutions Architect, I have spoken to chief technology officers (CTOs) and executive leadership of large enterprises like big banks, software as a service (SaaS) businesses, mid-sized enterprises, and startups. In this 6-part series, I share […]