ETL (Extract, Transform, and Load), Analyze, and Visualize a Data Lake Using AWS Glue, Amazon Athena, Amazon QuickSight, and Amazon S3

Cumhur Akkaya
15 min read · Jul 20, 2024

Imagine that we are a retail company that wants to improve its data management and analyze sales data coming from multiple databases, other sources, and different locations. We want to combine these databases into a single repository, because a unified data repository simplifies access for analysis and further processing. This matters to us because we need to adjust our sales strategies and resources according to what the analysis shows. To accomplish this big data task, we can use AWS Glue as the ETL and data catalog management tool, Amazon Athena as the data query tool, Amazon QuickSight as the visualization tool, and Amazon S3 as the data lake storage, as shown in the figure below.
In this article, we will upload a sample of raw data to an S3 bucket, and then extract, transform, and load this data using AWS Glue.
Next, we will query and analyze this data, either directly in the S3 bucket or through the AWS Glue Data Catalog, using Amazon Athena. We will save the results as a PDF, and they will also be saved automatically to a folder that we specify in the S3 bucket.
Finally, we will visualize our query results using Amazon QuickSight, which gives decision-makers the opportunity to explore and interpret the information in an interactive visual environment.
We will work with each of these services hands-on, step by step, in this article.
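
As a rough preview of the Athena step described above, the sketch below shows how a query and its S3 results folder could be wired together from Python with boto3. It is only an illustration under assumptions, not the method used later in this article: the table name `sales`, the database `retail_db`, and the bucket `example-bucket` are hypothetical placeholders, and `build_athena_request` and `run_query` are helper names invented for this example. Only `run_query` needs boto3 and AWS credentials.

```python
from typing import Any, Dict


def build_athena_request(query: str, database: str, output_location: str) -> Dict[str, Any]:
    """Build the keyword arguments for Athena's start_query_execution call."""
    # Athena writes query results to an S3 location, so reject anything else early.
    if not output_location.startswith("s3://"):
        raise ValueError("output_location must be an s3:// URI")
    return {
        "QueryString": query,
        # The database is the one registered in the AWS Glue Data Catalog.
        "QueryExecutionContext": {"Database": database},
        # Results are saved automatically under this S3 folder.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: Dict[str, Any]) -> str:
    """Submit the query to Athena and return its query execution ID."""
    # Imported lazily so the request builder above works without boto3 installed.
    import boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical names, for illustration only.
request = build_athena_request(
    query="SELECT * FROM sales LIMIT 10",
    database="retail_db",
    output_location="s3://example-bucket/athena-results/",
)
```

Calling `run_query(request)` with valid AWS credentials would start the query. Keeping the request builder separate from the AWS call makes it easy to check the results folder setting before anything is sent to Athena.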


Cumhur Akkaya

✦ Multi-Cloud & DevOps Engineer, ✦ Technical Writer, ✦ AWS Community Builder, ✦ LinkedIn Top Voice, ✦ Believes in learning by doing, ✦ LinkedIn: linkedin.com/in/cumh