High Performance DE Substack

Iceberg Tables Via DuckDB and Polars

Can I really ditch Spark to build my Iceberg Tables?

High Performance DE Newsletter
Sep 05, 2024

I’ve gotten some good practice lately creating and working with Iceberg tables, but there’s been a glaring problem: I’ve only been able to do so successfully with Spark. I’m not saying Spark is bad here, but for smaller workloads, there surely must be a simpler way to do this with less overhead, right?

Enter the PyIceberg Python package. This package …

© 2025 Matt Martin