Machine Learning with Python Scikit Learn

A getting-started journey into Machine Learning with practical, hands-on experience


5 Days


Intermediate Level

Design and Tailor this course

As per your team's needs


This course is a stepping stone on the “Machine Learning and Artificial Intelligence” learning path. It is designed to build a foundation for the next level of courses on that path. The points below give a high-level overview of the course:

  • Understand the role of data and Machine Learning
  • Use cases of Machine Learning
  • Introduction to some of the key technologies in data and Machine Learning
  • Hands-on experience in data acquisition, processing, analysis and modeling using the Python programming language
  • Participants will work with common types of data, e.g. CSV, web data and social media data, for pre-processing and/or building Machine Learning models
  • Participants will also get exposure to exploratory data analysis while learning basic statistics

This program is designed for those who aspire to Data/ML/AI roles:

  • Data Engineers
  • Data Scientists
  • Machine Learning Engineers
  • Data Integration Engineers
  • Data Architects

The course covers the following topics:

  • Significance of Data
  • What is Machine Learning (ML)?
  • Practical Use cases
  • Concepts and Terms
  • Tools/Platforms for ML
  • Machine Learning End to End Pipeline
  • Roles and Responsibilities of Data Engineer and Data Scientist
  • Installing Anaconda
  • Setting up Jupyter Notebook
  • Experiencing Notebooks
  • Introduction to Google Colab
  • Hands-on Exercise(s)
  • Python Overview
  • Basic Syntax
  • Functions in Python
  • Lambda Function
  • Dealing with Semi-structured data
  • Higher Order Functions
  • User defined Functions
  • Hands-on Exercise(s)
  • Content Acquisition Approaches, Pros & Cons
  • Working with Beautiful Soup
  • Acquiring data using REST-based APIs
  • Connecting to External data sources 
  • Working with datasets
  • Manipulating the datasets 
  • Exporting the datasets into external files
  • Population and Sample
  • Data Types
  • Measures of Central tendency
  • Measures of dispersion
  • Percentiles & Quartiles
  • Box plots and outlier detection
  • Creating Graphs and Reporting
  • Probability Distributions 
  • Hypothesis testing 
  • Hands-on Exercise(s)
  • Dealing with One-dimensional Arrays
  • Dealing with Multi-dimensional Arrays
  • Working with NumPy Array
  • NumPy Arrays Compared to Python Lists
  • Manipulating Arrays
  • Hands-on Exercise(s)
  • Basic types – Series and DataFrames
  • Working with a Series
  • Element-wise Operations
  • Creating a DataFrame from various sources e.g. CSV
  • Data Manipulation using Pandas
  • Hands-on Exercise(s)
  • Overview
  • Key types of plots
  • Exploratory Analysis using Matplotlib
  • Hands-on Exercises
  • Introduction to Seaborn
  • Seaborn foundation
  • Key types of plots
  • Customizing Seaborn Plots
  • Hands-on Exercise(s)
  • Exploratory Data Analysis
  • Data Cleaning techniques
    • Deal with missing data
    • Add default values
    • Remove incomplete rows
    • Deal with error-prone columns
    • Fixing NaN values and string/float confusion
  • Data Preparation for ML
    • Normalize data types
    • Feature Scaling
    • Feature Standardization
    • Label Encoding
    • One-Hot Encoding
  • Hands-on Exercise(s)
  • What is Feature Engineering?
  • Why Feature Engineering?
  • How to apply Feature Engineering?
  • Discussions on various scenarios
  • Hands-on Exercise(s)
  • Types of Machine Learning
  • Key Algorithms in Machine Learning
  • Practical Applications of Machine Learning
  • Various frameworks/Libraries popular for ML
  • Concepts and Terms
  • Why Scikit Learn?
  • Code Walkthrough
  • Hands-on Exercise(s)
  • Key Classification Algorithms
  • Conditional Probability 
  • Proof of Bayes Theorem
  • Naïve Bayes Classifier
  • Confusion Matrix
  • Accuracy
  • Key Regression Algorithms
  • Linear, Logistic and Other Key types of Regressions
  • Decision Trees
  • Ensemble Learning – Random Forest
  • Gradient Descent
  • Loss function
  • Bias vs Variance Tradeoff
  • Confusion Matrix
  • Evaluating Models
  • Hyperparameter Tuning
  • Hands-on Exercise(s)
  • Key types of Unsupervised ML
  • Principal Component Analysis
  • Performing Clustering of data
  • Hands-on Exercise(s)
  • Understanding Ensemble Learning
  • Types of Ensemble Learning
  • Stacking
  • Bagging
  • Boosting
  • Random Forest
  • How do these work?
  • Hands-on Code Walkthrough
  • GBMs, XGBoost, LightGBM etc.
  • Hands-on Exercises
  • Hyperparameter Tuning
  • Feature Selection using Random Forest
  • Intuition behind KNN
  • Maths behind KNN
  • How to determine K?
  • Definition of Distance
  • Pros & Cons of KNN
  • Hands-on Case Study
  • Conditional Probability
  • Proof of Bayes Theorem
  • Naïve Bayes Classifier
  • Pros & Cons of Naïve Bayes
  • Key Regression Algorithms
  • Implementing Spam Classifier
  • Natural Language Processing
  • Vectorizers, Pros & Cons
  • NLP Case Study
  • Hands-on Exercise(s)
  • Production Quality Code
  • Deploying Model on Google AI Platform
  • Hackathon

Participants should preferably have some hands-on experience in a programming language. Knowledge of Python would be a plus.

