Machine Learning with Python Scikit Learn
A getting-started voyage into the realm of Machine Learning, with pragmatic hands-on experience

Duration
5 Days
Level
Intermediate Level
Design and Tailor this course
As per your team needs
This course is a stepping stone in the “Machine Learning and Artificial Intelligence” learning path. It is designed to build a foundation for the next level of courses in that path. The points below give a high-level overview of the course:
- Understand the role of data and Machine Learning
- Explore use cases of Machine Learning
- Get an introduction to some of the key technologies in data and Machine Learning
- Gain hands-on experience in data acquisition, processing, analysis and modeling using the Python programming language
- Work with various common types of data (e.g. CSV, web data, social media data) for pre-processing and/or building Machine Learning models
- Perform exploratory data analysis while learning basic statistics
This program is designed for those who aspire to Data/ML/AI roles:
- Data Engineers
- Data Scientists
- Machine Learning Engineers
- Data Integration Engineers
- Data Architects
- Significance of Data
- What is Machine Learning (ML)?
- Practical Use cases
- Concepts and Terms
- Tools/Platforms for ML
- Machine Learning End to End Pipeline
- Roles and Responsibilities of Data Engineer and Data Scientist
- Installing Anaconda
- Setting up Jupyter Notebook
- Experiencing Notebooks
- Introduction to Google Colab
- Hands-on Exercise(s)
- Python Overview
- Basic Syntax
- Functions in Python
- Lambda Function
- Dealing with Semi-structured data
- Higher Order Functions
- User defined Functions
- Hands-on Exercise(s)
- Content Acquisition Approaches, Pros & Cons
- Working with Beautiful Soup
- Acquiring data using REST-based APIs
- Connecting to External data sources
- Working with datasets
- Manipulating the datasets
- Exporting the datasets into external files
- Population and Sample
- Data Types
- Measures of Central Tendency
- Measures of Dispersion
- Percentiles & Quartiles
- Box plots and outlier detection
- Creating Graphs and Reporting
- Probability Distributions
- Hypothesis testing
- Hands-on Exercise(s)
- Dealing with One-dimensional Arrays
- Dealing with Multi-dimensional Arrays
- Working with NumPy Array
- NumPy Arrays Compared to Python Lists
- Manipulating Arrays
- Hands-on Exercise(s)
- Basic types – Series and DataFrames
- Working with a Series
- Element-wise Operations
- Creating a DataFrame from various sources e.g. CSV
- Data Manipulation using Pandas
- Hands-on Exercise(s)
- Overview
- Key types of plots
- Exploratory Analysis using Matplotlib
- Hands-on Exercises
- Introduction to Seaborn
- Seaborn foundation
- Key types of plots
- Customizing Seaborn Plots
- Hands-on Exercise(s)
- Exploratory Data Analysis
- Data Cleaning techniques
- Deal with missing data
- Add default values
- Remove incomplete rows
- Deal with error-prone columns
- Fixing NaN values and string/float confusion
- Data Preparation for ML
- Normalize data types
- Feature Scaling
- Feature Standardization
- Label Encoding
- One-Hot Encoding
- Hands-on Exercise(s)
- What is Feature Engineering?
- Why Feature Engineering?
- How to apply Feature Engineering?
- Discussions on various scenarios
- Hands-on Exercise(s)
- Types of Machine Learning
- Key Algorithms in Machine Learning
- Practical Applications of Machine Learning
- Popular frameworks/libraries for ML
- Concepts and Terms
- Why Scikit Learn?
- Code Walkthrough
- Hands-on Exercise(s)
- Key Classification Algorithms
- Conditional Probability
- Proof of Bayes' Theorem
- Naïve Bayes Classifier
- Confusion Matrix
- Accuracy
- Key Regression Algorithms
- Linear, Logistic and Other Key Types of Regression
- Decision Trees
- Ensemble Learning – Random Forest
- Gradient Descent
- Loss function
- Bias vs Variance Tradeoff
- Confusion Matrix
- Evaluating Models
- Hyperparameter Tuning
- Hands-on Exercise(s)
- Key types of Unsupervised ML
- Principal Component Analysis
- Performing Clustering of data
- Hands-on Exercise(s)
- Understanding Ensemble Learning
- Types of Ensemble Learning
- Stacking
- Bagging
- Boosting
- Random Forest
- How do these work?
- Hands-on Code Walkthrough
- GBMs, XGBoost, LightGBM etc.
- Hands-on Exercises
- Hyperparameter Tuning
- Feature Selection using Random Forest
- Intuition behind KNN
- Maths behind KNN
- How to determine K?
- Definition of Distance
- Pros & Cons of KNN
- Hands-on Case Study
- Conditional Probability
- Proof of Bayes' Theorem
- Naïve Bayes Classifier
- Pros & Cons of Naïve Bayes
- Key Regression Algorithms
- Implementing Spam Classifier
- Natural Language Processing
- Vectorizers, Pros & Cons
- NLP Case Study
- Hands-on Exercise(s)
- Production Quality Code
- Deploying Model on Google AI Platform
- Hackathon
Participants should preferably have some hands-on experience in a programming language. Knowledge of Python would be a plus.