Data and Machine Learning Fundamentals
A getting-started tour of Machine Learning with practical, hands-on experience

Duration
5 Days
Level
Beginner Level
Design and Tailor this course
As per your team needs
This course is a stepping stone for the “Machine Learning and Artificial Intelligence” learning path. It is designed to build a foundation for the next level of courses in that path. The points below give a high-level overview of the course:
- Understand the role of Data and Machine Learning
- Use cases of Machine Learning
- Introduction to some of the key technologies in Data and Machine Learning
- Hands-on experience in Data Acquisition, Processing, Analysis and Modeling using the Python programming language
- Participants work with common types of data, e.g. CSV, web data and social media data, for pre-processing and/or building Machine Learning models
- Participants also get exposure to Exploratory Data Analysis while learning basic statistics
This program is designed for those who aspire to Data/ML/AI roles:
- Data Engineers
- Data Scientists
- Machine Learning Engineers
- Data Integration Engineers
- Data Architects
- Significance of Data
- What is Machine Learning (ML)?
- Practical Use cases
- Concepts and Terms
- Tools/Platforms for ML
- Machine Learning End to End Pipeline
- Roles and Responsibilities of Data Engineer and Data Scientist
- Installing Anaconda
- Setting up Jupyter Notebook
- Experiencing Notebooks
- Introduction to Google Colab
- Hands-on Exercise(s)
- Content Acquisition Approaches, Pros & Cons
- Working with Beautiful Soup
- Acquiring data using REST-based APIs
- Connecting to External data sources
- Working with datasets
- Manipulating the datasets
- Exporting the datasets into external files
- Population and Sample
- Data Types
- Measures of Central tendency
- Measures of dispersion
- Percentiles & Quartiles
- Box plots and outlier detection
- Creating Graphs and Reporting
- Probability Distributions
- Hypothesis testing
- Hands-on Exercise(s)
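
To illustrate the quartile and box-plot outlier topics in this module, here is a minimal sketch using only Python's standard `statistics` module. The sample data and the helper name `iqr_outliers` are hypothetical and not taken from the course material; the 1.5 × IQR fence is the common box-plot convention, assumed here.

```python
import statistics

def iqr_outliers(values):
    """Return (q1, q3, outliers) using the 1.5 * IQR box-plot rule."""
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return q1, q3, outliers

# Hypothetical sample: ten delivery times in minutes, one of them suspicious.
delivery_minutes = [12, 15, 14, 10, 13, 15, 11, 14, 13, 48]

q1, q3, outliers = iqr_outliers(delivery_minutes)
print(statistics.mean(delivery_minutes))    # 16.5 (pulled up by the outlier)
print(statistics.median(delivery_minutes))  # 13.5 (robust to the outlier)
print(q1, q3, outliers)                     # 12.25 14.75 [48]
```

The gap between the mean and the median in this sample is one reason measures of central tendency are usually read together with dispersion.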
- Dealing with One-dimensional Arrays
- Dealing with Multi-dimensional Arrays
- Working with NumPy Array
- NumPy Arrays Compared to Python Lists
- Manipulating Arrays
- Hands-on Exercise(s)
- Basic types – Series and DataFrames
- Working with a Series
- Element-wise Operations
- Creating a DataFrame from various sources e.g. CSV
- Data Manipulation using Pandas
- Hands-on Exercise(s)
- Overview
- Key types of plots
- Exploratory Analysis using Matplotlib
- Hands-on Exercises
- Introduction to Seaborn
- Seaborn foundation
- Key types of plots
- Customizing Seaborn Plots
- Hands-on Exercise(s)
- Exploratory Data Analysis
- Data Cleaning techniques
- Deal with missing data
- Add default values
- Remove incomplete rows
- Deal with error-prone columns
- Fixing NaN values and string/float type confusion
- Data Preparation for ML
- Normalize data types
- Feature Scaling
- Feature Standardization
- Label Encoding
- One-Hot Encoding
- Hands-on Exercise(s)
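
As a rough illustration of the feature scaling, standardization and one-hot encoding topics in this module, the sketch below implements each one in plain Python. The function names and sample values are hypothetical; in practice these steps are usually done with library helpers, which this sketch does not try to reproduce.

```python
import statistics

def min_max_scale(values):
    """Feature scaling: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Feature standardization: zero mean, unit (population) std deviation.

    Assumes the values are not all identical (std deviation must be > 0).
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def one_hot(labels):
    """One-hot encoding: one 0/1 column per distinct category."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows

# Hypothetical columns from a small dataset.
ages = [20, 30, 40]
colors = ["red", "green", "red"]

scaled = min_max_scale(ages)
z_scores = standardize(ages)
categories, encoded = one_hot(colors)

print(scaled)      # [0.0, 0.5, 1.0]
print(categories)  # ['green', 'red']
print(encoded)     # [[0, 1], [1, 0], [0, 1]]
```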
- What is Feature Engineering?
- Why Feature Engineering?
- How to apply Feature Engineering?
- Discussions on various scenarios
- Hands-on Exercise(s)
- Types of Machine Learning
- Key Algorithms in Machine Learning
- Practical Applications of Machine Learning
- Popular frameworks/libraries for ML
- Concepts and Terms
- Why scikit-learn?
- Code Walkthrough
- Hands-on Exercise(s)
- Key Classification Algorithms
- Conditional Probability
- Proof of Bayes' Theorem
- Naïve Bayes Classifier
- Confusion Matrix
- Accuracy
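
The confusion matrix and accuracy topics listed for classification can be sketched in a few lines of plain Python. The labels, the predictions and the helper names below are made up for illustration and do not come from the course labs.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for a binary classifier."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts

def accuracy(counts):
    """Accuracy = correct predictions / all predictions."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total

# Hypothetical spam-filter results for eight messages.
actual    = ["spam", "spam", "ham", "ham", "ham",  "spam", "ham", "ham"]
predicted = ["spam", "ham",  "ham", "ham", "spam", "spam", "ham", "ham"]

counts = confusion_counts(actual, predicted, positive="spam")
print(counts["tp"], counts["fp"], counts["fn"], counts["tn"])  # 2 1 1 4
print(accuracy(counts))                                        # 0.75
```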
- Key Regression Algorithms
- Linear, Logistic and other key types of Regression
- Decision Trees
- Ensemble Learning – Random Forest
- Gradient Descent
- Loss function
- Bias vs Variance Tradeoff
- Confusion Matrix
- Evaluating Models
- Hyperparameter Tuning
- Hands-on Exercise(s)
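
To connect the gradient descent and loss function topics above, here is a minimal sketch that fits a straight line by minimizing mean squared error in plain Python. The data, learning rate, step count and function names are assumptions chosen for illustration, not values from the course.

```python
def mse(xs, ys, w, b):
    """Loss function: mean squared error of the line y = w * x + b."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b with batch gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical noise-free data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
print(mse(xs, ys, w, b))
```

The learning rate is itself a hyperparameter: too large and the updates diverge, too small and convergence takes many more steps.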
- Key types of Unsupervised ML
- Principal Component Analysis
- Performing Clustering of data
- Hands-on Exercise(s)
- Basic Python
  - Regular expressions
  - Higher-order functions
  - Nested statements and scope
  - User-defined functions
  - Lambda expressions
  - Multiple exercises
- NumPy
  - Understanding data types
  - NumPy indexing and selection
  - NumPy arrays
  - Sorting
  - NumPy operations
- Pandas
  - Series
  - Operations
  - Merging, joining and concatenation
  - Missing data
  - DataFrames
  - Data input and output
  - Groupby
  - Descriptive statistics
- Social data analysis
  - Data acquisition
  - Data preprocessing and feature exploration
- Matplotlib
  - Matplotlib overview
  - Settings and stylesheets
  - Multiple subplots
  - Simple scatter plots
  - Histograms
  - Visualization with Seaborn
  - Simple line plots
  - Three-dimensional plotting
  - Customizing legends
- Seaborn
  - Categorical plots
  - Regression plots
  - Style and color
  - Matrix plots
  - Distribution plots
  - Grids
- scikit-learn
  - Linear regression
  - Logistic regression
  - K-means clustering
  - Principal component analysis
Participants should preferably have some hands-on experience in a programming language. Knowledge of Python is a plus.