Duration
1 Day
Level
Intermediate Level
Overview
This course covers the basic architecture of cloud computing, then discusses and compares the services AWS provides for building a big data pipeline. It concludes with running an actual big data pipeline using a subset of those AWS services.
Audience
This course is designed for Data Engineers and Architects who want to understand how to build a modern data lake and data warehouse.
Course Outline
- AWS capabilities for data engineering
- Data lake vs. data warehouse
- Serverless vs. managed services
- Comparing ingestion services in AWS (Kinesis/Kafka)
- Comparing processing services in AWS (Glue/Spark/Lambda)
- Comparing storage services in AWS (DynamoDB/DocumentDB/Amazon Keyspaces)
- Comparing analytics services in AWS (Redshift/Athena/Snowflake)
- Amazon Kinesis – Collecting data from various sources
- Amazon S3 – Building a data lake
- Amazon EMR – Running Spark for data processing
- AWS Glue – Creating a data catalog on S3
- Amazon Athena – Running analytics
- AWS Lambda – Creating serverless functions to process data
- Amazon DynamoDB – Storing processed data in a NoSQL database
Prerequisites
- Basic knowledge of AWS
- For the hands-on portions of the class, each participant should have an AWS account, a Snowflake account, and access to Talend Cloud Data Integration