Data and Machine Learning Fundamentals

A getting-started journey into Machine Learning with practical, hands-on experience

Duration

5 Days

Level

Beginner Level

Design and Tailor this course

As per your team needs

This course is a stepping stone for the “Machine Learning and Artificial Intelligence” learning path. It is designed to build the foundation for the next level of courses in that path. The points below provide a high-level overview of the course:

Understand the role of Data and Machine Learning
Explore use cases of Machine Learning
Get introduced to some of the key technologies in Data and Machine Learning
Gain hands-on experience in Data Acquisition, Processing, Analysis and Modeling using the Python programming language
Work with common types of data, e.g. CSV, web data and social media data, for pre-processing and/or building Machine Learning models
Get exposure to Exploratory Data Analysis while learning basic statistics

Understanding the Big Picture

Significance of Data
What is Machine Learning (ML)?
Practical Use cases
Concepts and Terms
Tools/Platforms for ML
Machine Learning End to End Pipeline
Roles and Responsibilities of Data Engineer and Data Scientist

Environment for Experiments

Installing Anaconda
Setting up Jupyter Notebook
Experiencing Notebooks
Introduction to Google Colab
Hands-on Exercise(s)

Python for Data Science

Content Acquisition Approaches, Pros & Cons
Working with Beautiful Soup
Acquiring data using REST-based APIs
Connecting to External data sources
Working with datasets
Manipulating the datasets
Exporting the datasets into external files

Acquiring & Exporting Data

Basic Statistics

Population and Sample
Data Types
Measures of Central tendency
Measures of dispersion
Percentiles & Quartiles
Box plots and outlier detection
Creating Graphs and Reporting
Probability Distributions
Hypothesis testing
Hands-on Exercise(s)
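
As an illustration of the central tendency, quartile and outlier topics above, here is a minimal sketch using only Python's standard `statistics` module. It is not taken from the course material, and the sample values are made up.

```python
import statistics

# Hypothetical sample; the values are made up for illustration.
sample = [12, 15, 14, 10, 13, 16, 14, 95]

# Measures of central tendency.
mean = statistics.mean(sample)
median = statistics.median(sample)

# Quartiles: n=4 returns the three cut points Q1, Q2 and Q3.
q1, q2, q3 = statistics.quantiles(sample, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range, a measure of dispersion

# Box-plot rule: flag values more than 1.5 * IQR outside the quartiles.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = [x for x in sample if x < lower or x > upper]

print("mean:", mean, "median:", median)
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("outliers:", outliers)
```

The single extreme value pulls the mean well above the median, which is the usual argument for checking both before summarizing a dataset.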

NumPy Basics

Dealing with One-dimensional Arrays
Dealing with Multi-dimensional Arrays
Working with NumPy Array
NumPy Arrays Compared to Python Lists
Manipulating Arrays
Hands-on Exercise(s)
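
To give a flavour of how NumPy arrays differ from Python lists, the following sketch contrasts the two and builds a small multi-dimensional array. The variable names and values are illustrative assumptions, not course material.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]        # plain Python list
prices_arr = np.array(prices)      # one-dimensional NumPy array

# Multiplying a list repeats it; multiplying an array scales each element.
repeated = prices * 2
scaled = prices_arr * 2

# A two-dimensional array with 2 rows and 3 columns.
grid = np.arange(6).reshape(2, 3)
column_sums = grid.sum(axis=0)     # sum down each column

print(repeated)
print(scaled)
print(grid.shape, column_sums)
```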

Pandas Basics

Basic types – Series and DataFrames
Working with a Series
Element-wise Operations
Creating a DataFrame from various sources e.g. CSV
Data Manipulation using Pandas
Hands-on Exercise(s)
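
A short sketch of the Series and DataFrame topics above. The CSV content is an in-memory string invented for the example, so no external file is assumed.

```python
import io

import pandas as pd

# A Series with a labelled index; arithmetic is applied element-wise.
s = pd.Series([1, 2, 3], index=["a", "b", "c"])
doubled = s * 2

# Hypothetical CSV content, read from memory instead of a file on disk.
csv_text = "name,score\nAsha,82\nRavi,74\nMeera,91\n"
df = pd.read_csv(io.StringIO(csv_text))

# Basic manipulation: keep only the rows that satisfy a condition.
top = df[df["score"] > 80]

print(doubled)
print(top)
```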

Data Visualization

Overview
Key types of plots
Exploratory Analysis using Matplotlib
Hands-on Exercises
Introduction to Seaborn
Seaborn foundation
Key types of plots
Customizing Seaborn Plots
Hands-on Exercise(s)

Data Preparation for Analysis

Exploratory Data Analysis
Data Cleaning techniques
- Deal with missing data
- Add default values
- Remove incomplete rows
- Deal with error-prone columns
- Fix NaN values and string/float confusion
Data Preparation for ML
- Normalize data types
- Feature Scaling
- Feature Standardization
- Label Encoding
- One-Hot Encoding
Hands-on Exercise(s)
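
The cleaning and preparation steps listed above can be sketched with pandas as follows. The dataset, column names and the "Unknown" default are hypothetical, and min-max scaling is just one of several possible feature scaling methods.

```python
import pandas as pd

# Hypothetical dataset; column names and values are made up for illustration.
df = pd.DataFrame({
    "age": ["25", "32", None, "41", "38"],     # numbers stored as strings
    "city": ["Pune", None, "Delhi", "Pune", "Delhi"],
    "income": [30000.0, 52000.0, 41000.0, None, 41000.0],
})

# Normalize data types: convert strings to numbers (missing values become NaN).
df["age"] = pd.to_numeric(df["age"], errors="coerce")

# Add a default value where a category is missing.
df["city"] = df["city"].fillna("Unknown")

# Remove rows that are still incomplete.
clean = df.dropna().reset_index(drop=True)

# Feature scaling: min-max scaling maps the column onto the range [0, 1].
income = clean["income"]
clean["income_scaled"] = (income - income.min()) / (income.max() - income.min())

# One-hot encoding: one indicator column per category.
encoded = pd.get_dummies(clean, columns=["city"])

print(encoded)
```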

Feature Engineering

What is Feature Engineering?
Why Feature Engineering?
How to apply Feature Engineering?
Discussions on various scenarios
Hands-on Exercise(s)

Machine Learning using scikit-learn

Types of Machine Learning
Key Algorithms in Machine Learning
Practical Applications of Machine Learning
Various frameworks/Libraries popular for ML
Concepts and Terms
Why scikit-learn?
Code Walkthrough
Hands-on Exercise(s)

Supervised Machine Learning

Key Classification Algorithms
Conditional Probability
Proof of Bayes Theorem
Naïve Bayes Classifier
Confusion Matrix
Accuracy
Key Regression Algorithms
Linear, Logistic and other key types of Regression
Decision Trees
Ensemble Learning – Random Forest
Gradient Descent
Loss function
Bias vs Variance Tradeoff
Confusion Matrix
Evaluating Models
Hyperparameter Tuning
Hands-on Exercise(s)
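
To make the confusion matrix and accuracy topics above concrete, here is a minimal pure-Python sketch for a binary classifier. The labels are invented, and no particular model is assumed to have produced the predictions.

```python
from collections import Counter

# Hypothetical labels (1 = positive, 0 = negative), made up for illustration.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# Count each (actual, predicted) pair to fill the four confusion-matrix cells.
counts = Counter(zip(actual, predicted))
tp = counts[(1, 1)]  # true positives
tn = counts[(0, 0)]  # true negatives
fp = counts[(0, 1)]  # false positives
fn = counts[(1, 0)]  # false negatives

accuracy = (tp + tn) / len(actual)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print("          predicted 0  predicted 1")
print(f"actual 0  {tn:11d}  {fp:11d}")
print(f"actual 1  {fn:11d}  {tp:11d}")
print("accuracy:", accuracy, "precision:", precision, "recall:", recall)
```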

Unsupervised Machine Learning

Key types of Unsupervised ML
Principal Component Analysis
Performing Clustering of data
Hands-on Exercise(s)
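
As a rough sketch of the idea behind Principal Component Analysis, the example here computes principal components with NumPy's singular value decomposition rather than a library PCA class. The data points are made up, and the approach is one common way to compute PCA, not necessarily the one used in the labs.

```python
import numpy as np

# Hypothetical 2-D data, made up for illustration; the two features are strongly correlated.
X = np.array([[2.0, 1.9], [3.0, 3.2], [4.0, 3.8], [5.0, 5.1], [6.0, 6.0]])

# Centre each feature at zero before looking for directions of variance.
X_centered = X - X.mean(axis=0)

# SVD of the centred data: the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Share of the total variance captured by each component.
explained_variance_ratio = S**2 / np.sum(S**2)

# Reduce to one dimension by projecting onto the first principal direction.
X_reduced = X_centered @ Vt[:1].T

print(explained_variance_ratio)
print(X_reduced.shape)
```

Because the two features move together, nearly all of the variance falls on the first component, which is why a single dimension is enough for this toy dataset.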

Labs:

Basic Python
- Regular expressions
- Higher-order functions
- Nested statements and scope
- User-defined functions
- Lambda expressions
- Multiple exercises
NumPy
- Understanding data types
- NumPy indexing and selection
- NumPy arrays
- Sorting
- NumPy operations
Pandas
- Series
- Operations
- Merging, joining and concatenation
- Missing data
- DataFrames
- Data input and output
- Groupby
Descriptive Statistics
Social data analysis
Data Acquisition
Data preprocessing and feature exploration
Matplotlib
- Matplotlib overview
- Settings and stylesheets
- Multiple subplots
- Simple scatter plots
- Histograms
- Visualization with Seaborn
- Simple line plots
- Three-dimensional plotting
- Customizing legends
Seaborn
- Categorical plots
- Regression plots
- Style and color
- Matrix plots
- Distribution plots
- Grids
scikit-learn
- Linear regression
- Logistic regression
- K-means clustering
- Principal component analysis
