Data Engineering on AWS

Enter the realm of data to engineer valuable insights

Duration

1 Day

Level

Intermediate Level

Design and tailor this course

As per your team's needs

This course covers the basic architecture of cloud computing, then discusses and compares the AWS services used to build a big data pipeline. It concludes by running an actual big data pipeline on a subset of those services.

This course is designed for Data Engineers and Architects who want to understand how to build a modern data lake and data warehouse.

  • Capabilities of AWS for data engineering
  • Data lake vs. data warehouse
  • Serverless vs. managed services
  • Comparing ingestion services in AWS (Kinesis/Kafka)
  • Comparing processing services in AWS (Glue/Spark/Lambda)
  • Comparing storage services in AWS (DynamoDB/DocumentDB/Amazon Keyspaces)
  • Comparing analytics services in AWS (Redshift/Athena/Snowflake)
  • Amazon Kinesis – Collecting data from various sources
  • Amazon S3 – Building a data lake
  • Amazon EMR – Running Spark for data processing
  • AWS Glue – Creating a data catalog on S3
  • Amazon Athena – Running analytics
  • AWS Lambda – Creating serverless functions to process data
  • Amazon DynamoDB – Storing processed data in a NoSQL database

Prerequisite: basic knowledge of AWS.

  • To take part in the hands-on exercises during the class, each participant should have an AWS account, a Snowflake account, and access to Talend Cloud Data Integration.
