As per your team needs
This is an advanced-level course on tuning Spark SQL (batch) applications.
Hands-on PySpark exercises will be performed in Jupyter notebooks integrated with Spark 2.4.x. This setup will be installed in pseudo-distributed mode on the Cloudera platform.
Participants should have at least a couple of months of experience developing Spark SQL applications. Knowledge of Hive is a plus.
Apache, Apache Kafka, Apache Spark, Apache Trino, Apache Iceberg, Apache Hive, Kafka, Spark, Trino, Iceberg, Hive, and other associated open-source project names are the Apache Software Foundation trademarks. Starburst, Starburst Data, Starburst Enterprise, and Starburst Galaxy are registered trademarks of Starburst Data, Inc. All rights reserved. DataCouch is not affiliated with, endorsed by, or otherwise associated with the Apache Software Foundation (ASF) or any of its projects.