

Big Data

Apache Cassandra is a free and open-source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple data centers, with asynchronous masterless replication allowing low latency operations for all clients.

Overview of the Course

This course introduces Apache Cassandra, one of the most popular NoSQL databases. After completing this course, you should be able to:

    • Understand Cassandra and NoSQL use cases
    • Create Cassandra clusters for various use cases
    • Understand Cassandra architecture
    • Design data models for Cassandra
    • Port existing applications from an RDBMS to Cassandra
    • Integrate Cassandra with external systems such as Java applications and Apache Spark

This program is designed for:

    • Software Developer
    • Data Scientist
    • Data Engineer
    • Big Data Engineer
Introduction to Cassandra
    • What is Cassandra and its evolution
    • Industry landscape of databases and Cassandra use cases
    • Distributed and decentralized
    • Elastic scalability
    • Fault tolerance and high availability
    • Consistency, Availability and Partition tolerance (CAP)
Cassandra Architecture
    • Cassandra inside look
    • Gossip and failure detection
    • Rings and tokens
    • Virtual nodes
    • Partitions
    • Replication
    • Consistency levels
    • Queries and Coordinator nodes
    • Caching, memtables, SSTables, commit logs
    • Bloom filters
    • Compaction
    • Managers and Services
Getting Started with Cassandra
    • Installing Cassandra
    • Basic cqlsh commands
    • Understanding the environment
    • Creating a keyspace and table in cqlsh
    • Read/Write data using cqlsh
Cassandra Query Language
    • Cassandra building blocks: clusters, keyspaces, tables, columns
    • CQL Data Types
Lab Exercises
    • Setting up a single node Cassandra DB
    • Bulk loading data into Cassandra
    • Working with cqlsh commands
    • Build a materialized view and look up on keys
    • Build indexes on tables
    • Working with DevCenter
Advanced Configuration
    • Cassandra cluster manager
    • Seed nodes
    • Partitioners
    • Snitches
    • Node configuration
    • Resizing cluster
    • Dynamic ring participation
    • Replication strategy
Reading Data
    • Read consistency level
    • Read path
    • Read repair
    • Functions and aggregation
    • Paging
    • Speculative retry
Lab Exercises
    • Setting up a multi-node Cassandra DB
    • Setting up a multi-data-center Cassandra DB
    • Take full snapshot and restore from snapshot
    • Enable incremental backup and restore from incremental backup
Data Modeling
    • Data modeling in Cassandra
    • Application Queries
    • Logical data modeling to physical data modeling
    • Evaluating and refining partitions
Lab Exercises
    • Build a data model for a toy use case (a hotel booking site using Cassandra)
    • Optimizing the logical and physical data models
    • Examine Logs
    • Monitoring using JMX MBeans
    • Monitoring with nodetool
Performance Tuning
    • Setting performance goals
    • Analyzing performance issues
    • Tune configuration
Lab Exercises
    • Monitoring Cassandra using nodetool and JMX
    • Monitoring Cassandra logs
    • Use tracing to localize performance issues
    • Stress testing Cassandra cluster
Cassandra Java SDK
    • Build web-scale applications using Cassandra as an operational database
    • Integrate application using Cassandra Java SDK
    • API endpoints
    • Configure Cassandra connections
    • Use prepared statements
    • Session and connection pooling
    • Debugging and monitoring
Lab Exercises
    • Build a Spring Boot application (Java) for a REST API service using Cassandra as the database backend
Introduction to Apache Spark for Analytics and ETL
    • Introduction to Apache Spark
    • Quick overview of Scala/Python programming required for Spark
    • Spark Architecture
    • Spark Data Structures – RDD, DataFrame and Datasets
    • In-memory computation in Spark
    • Spark deployment options
    • ETL and Analytical workloads using Spark
    • SQL Join queries on Cassandra tables using Spark
    • Processing graph data using Spark
    • Manage data in Cassandra using Spark
Lab Exercises
    • Running complex join queries using Spark against Cassandra tables
    • Migrate data from RDBMS to Cassandra using Spark and vice versa
    • Build data virtualization
    • [Optional] Stream data in real time from RDBMS to Cassandra using Kafka
Cassandra Java SDK
    • Working with Java SDK for Cassandra
    • ETL using Java application
Lab Exercises
    • Build a data loader application using Java SDK

Participants should have basic programming knowledge of Java, Scala, or Python.

Course Information


4 Days

Mode of Delivery

Instructor led/Virtual


