Data Engineering

Machine Learning Path Recommendations

This is an incomplete, ever-changing curated list of content to help people find their way into the worlds of Data Science and Machine Learning. If you have

Differences between Kafka and Flume

Kafka is often compared with Flume because both technologies can be used in the data ingestion phase of a data pipeline. In this article,

Introduction to Delta Lake

Data lakes built on the Hadoop framework lacked a very basic capability: ACID compliance. Hive tried to overcome some of the limitations

Real-Time Clickstream Analysis using KsqlDB

Clickstream data plays an important role in analyzing customer behavior. It also helps organizations shape future business strategies. So, let’s discuss real-time clickstream analysis

Big Data Processing using Google Dataproc

At present, humans produce about 2.5 quintillion bytes (2,500 petabytes) of data every day (Source: Social Media Today). Processing this much

Key Features of Apache Spark 3.x

Apache Spark is a powerful data processing tool for meeting the challenges of Big Data. It became a game changer once it became open-source
