The Parquet tabular data storage format has become one of the most widely used, particularly in "big data" contexts, where it is arguably the only binary format to have successfully supplanted CSV. Despite this, there are relatively few implementations of Parquet, which has historically presented challenges for Julia. I will give a brief overview of Parquet2.jl, a pure Julia Parquet implementation, including a comparison to other tools and formats and a look at what is still needed to reach parity with pyarrow.
We will touch on the following:
pyarrow
implementation?