#106 datashard: Add gzip compression as default for parquet files
Description
The DataShard library (../datashard/) should use gzip compression by default when writing parquet files. This should reduce storage costs and improve I/O performance for the task_logs and workflow_logs tables. Location: ~/develop/datashard/
Use the Python `lz4` package (`pip install lz4`) for this, and make it one of datashard's main dependencies. The implementation should be transparent: users should not notice any change. Backward compatibility is not needed, so the existing data can be deleted and the change tested on fresh data. (Snapshots and metadata will point to the new data, so only time travel will break; the old data can be deleted in any case.) Also fix the tests in ~/develop/datashard/ to support this. Again, it should just work, since compression is handled internally during read and write.