As per your team needs
This is an advanced-level course on tuning Spark SQL (batch) applications. The participants will learn:
Hands-on PySpark exercises will be performed in Jupyter notebooks integrated with Spark 2.4.x. This setup will be installed in pseudo-distributed mode on the Cloudera platform.
Participants should have at least a couple of months of experience developing Spark SQL applications. Knowledge of Hive is a plus.
Apache, Apache Kafka, Kafka, and other associated open source project names are trademarks of the Apache Software Foundation. DataCouch is not affiliated with, endorsed by, or otherwise associated with the Apache Software Foundation (ASF) or any of its projects.