Optimizing Data Lakehouses With Starburst

Best Practices for High-Performance Data Lakehouse Architectures

Duration

2 Days

Level

Intermediate to Advanced

Design and tailor this course

According to your team's needs

Overview

This 2-day course combines instructor-led discussions, demonstrations, and hands-on exercises designed to build a working knowledge of the Starburst query engine. Participants will gain a deeper understanding of Starburst architecture, with a focus on best practices for data-lake-based schemas, including table formats and partitioning, file formats and sizes, and other optimization techniques.

Audience

This course is designed for:

  • Data engineers
  • Data architects
  • Experienced data analysts and data scientists

Prerequisites

Intermediate experience with SQL is assumed.

Curriculum

  • Overview & architecture
  • Web UI
  • Connectors & catalogs
  • Client tool integrations
  • Separation of storage & compute
  • Schema on read
  • Limiting data exchanges
  • File format options
  • Small files problem
  • Partitioning & bucketing
  • Moving beyond Hive
  • Compare/contrast alternatives
  • Table format architecture
  • Creating tables
  • Insert, update & delete
  • CDC with merge
  • Schema & partition evolution
  • Snapshots & compaction
  • Benefits of statistics
  • Query plan analysis
