Getting Started with Starburst for Analysts
Duration
1 Day (8 hrs)
Level
Basic Level
New to Starburst? Getting Started with Starburst for Analysts features live, instructor-led discussion, demonstrations, and interactive hands-on exercises that provide the knowledge required to harness the benefits of running queries using Starburst. During class, you will be introduced to the foundational concepts and technology that allow Starburst to access multiple, varied data sources in a federated manner. With this insight, you will use scalar and aggregate functions to query data using Starburst.
Upon completion of this course, you will have learned how to:
● Differentiate components in a Starburst cluster
● Explain the differences between connectors, catalogs, schemas and tables/views
● Execute federated queries by joining multiple data sources
● Leverage Starburst to query multiple formats of file-based data
● Run a number of SQL functions relevant to customer use cases
● Take advantage of SQL nuances that affect functionality and performance
● Construct analytics-oriented queries using rollup and window functions
This course is designed for experienced data analysts, business intelligence specialists, data engineers, and data scientists.
- Starburst Overview
- Starburst Web UI
- Data Sources
- Client Tools Integration
- Starburst Architecture
- Federated Querying
- Using Approximations
- Building Data Rollups and Cubes
- Join Considerations
- Transforms and Aggregates
- Limiting Results
- Error Handling
- Decisioning
- Windowing Basics
- RANK and Row Numbering
- LAG and LEAD Functions
- FIRST/LAST/NTH_VALUE
- Sliding Windows using RANGE or ROWS
- Navigate the hands-on lab interface
- Understand how a query is executed on the cluster
- Use Case: Query file-based data in a data lake
- SQL Functions: Explore Transforms and Aggregates
- Use SQL Functions to Limit Data Transfer
- Advanced SQL: Add Conditional Logic to a Query
- Best practice: Count approximate distinct values
- Best practice: Aggregate on multiple combinations of columns more efficiently with GROUPING SETS, CUBE, and ROLLUP
- Identify whether a query will use predicate pushdown
- Best practice: Use JOIN ON instead of JOIN USING
- Advanced SQL: Explore Window Functions
- Define the rows included in a window function using RANGE or ROWS
A minimum of intermediate experience with SQL is assumed.