Apache Arrow is the de facto standard format for in-memory columnar data processing. Its close cousin, Parquet, is the de facto file standard for columnar data storage. What sets Arrow apart from older storage formats is that it’s highly optimized to crunch through millions, if not billions, of rows very quickly, especially when you need to do thin…


