High Performance DE Substack

High Performance DE Substack

PyArrow Intro

Standing on the Shoulders of Giants

High Performance DE Newsletter's avatar
High Performance DE Newsletter
Feb 17, 2025
∙ Paid
9
2
Share

Apache Arrow is the defacto in-memory columnar data processing format. Its close cousin (parquet) is the defacto file standard for columnar data storage. What makes Arrow unique and sets it apart from older storage formats is that it’s highly optimized to crunch through millions, if not billions, of rows very quickly, especially when you need to do thin…

Keep reading with a 7-day free trial

Subscribe to High Performance DE Substack to keep reading this post and get 7 days of free access to the full post archives.

Already a paid subscriber? Sign in
© 2025 Matt Martin
Privacy ∙ Terms ∙ Collection notice
Start your SubstackGet the app
Substack is the home for great culture