Machine Learning using Spark

Build Scalable Machine Learning models using Spark ML

Duration

1 Day

Level

Advanced Level

Design and Tailor this course

As per your team needs

The program focuses on applying Machine Learning at scale using the Spark framework. You will work through the curriculum via theory lectures, live demonstrations and lab exercises. The course is taught in the Python programming language, on Cloudera 7.x with Spark 3.x.

This course is designed for:

  • Data Scientists, Application developers, DevOps engineers, Architects, QAs, Technical Managers
  • Developers who are experienced with Spark

Course Outline

  • Problems with Traditional Machine Learning Frameworks
  • Machine Learning at Scale – Various options
  • Why Spark?
  • Why Spark performs well for iterative Machine Learning algorithms
  • Data Acquisition from various data sources
  • Data Cleansing/Processing at Scale
  • Feature Engineering – Feature Extraction, Scaling etc.
  • Modeling the problem
  • Evaluation Metrics
  • Spark ML vs Spark MLlib
  • Data types and key terms
  • Feature Extraction
  • Linear Regression using Spark MLlib
  • Hands-on Exercises
  • Spark ML Overview
  • Transformers and Estimators
  • Pipelines
  • Implementing Decision Trees
  • K-Means Clustering using Spark ML
  • Hands-on Exercises

Prerequisites

  • Basic knowledge of Python
  • Basic knowledge of Apache Spark
  • Participants should preferably have basic knowledge of SQL, Scala/Java and Unix commands
