Big data systems grow more complex every day. Even a simple big data system involves several stages, such as ingestion, transformation and analytics, and several groups of stakeholders, such as big data engineers, data scientists and data analysts. It therefore becomes necessary to stitch all big data tasks together into a pipeline and monitor them, making the system more scalable, less error-prone and more autonomous.
The Creating & Monitoring Big Data Pipelines with Apache Airflow training course is designed to teach data engineers what they need to know to create, schedule and monitor data pipelines with Apache Airflow, the de facto platform for programmatically authoring, scheduling and monitoring workflows. The course begins with the core functionality of Apache Airflow and then moves on to building data pipelines. It then covers more advanced topics such as start_date and schedule_interval, dealing with time zones, alerting on failures and much more. The course concludes with a look at how to handle monitoring and security with Apache Airflow, as well as managing and deploying workflows in the cloud.
Promote an in-depth understanding of how to use Apache Airflow to create, schedule and monitor data pipelines.
Upon completion of this course, you should be able to:
Code production-grade data pipelines with Airflow
Schedule and monitor data pipelines using Apache Airflow
Understand and apply core and advanced concepts of Apache Airflow
Create data pipelines using Amazon MWAA (Managed Workflows for Apache Airflow)
Apache, Apache Kafka, Apache Spark, Kafka, Spark and other associated open source project names are trademarks of the Apache Software Foundation. DataCouch is not affiliated with, endorsed by, or otherwise associated with the Apache Software Foundation (ASF) or any of its projects.