DataStore I/O Operations
DataStore supports reading from and writing to various file formats and data sources.
Reading Data
CSV Files
Examples:
Parquet Files
Recommended for large datasets: Parquet is a columnar format that typically compresses better than CSV.
Examples:
JSON Files
Examples:
Excel Files
Examples:
SQL Databases
Examples:
Other Formats
Writing Data
to_csv
Export to CSV format.
Examples:
to_parquet
Export to Parquet format (recommended for large data).
Examples:
to_json
Export to JSON format.
Examples:
to_excel
Export to Excel format.
Examples:
to_sql
Export to a SQL database, or generate a SQL string.
Examples:
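As a library-independent illustration of the two behaviors named above (writing rows into a database, and producing the SQL text instead), the sketch below uses only Python's standard `sqlite3` module. It does not show DataStore's own `to_sql` signature. The helper names `rows_to_insert_sql` and `write_rows`, the table name `events`, and the sample rows are all hypothetical.

```python
import sqlite3


def rows_to_insert_sql(table, columns):
    """Build a parameterized INSERT statement (the 'SQL string' case).

    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    return f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})'


def write_rows(conn, table, columns, rows):
    """Create the table if needed and insert all rows (the 'database' case)."""
    col_defs = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
    conn.executemany(rows_to_insert_sql(table, columns), rows)
    conn.commit()


# Hypothetical sample data written to an in-memory SQLite database.
columns = ["id", "name"]
rows = [(1, "alice"), (2, "bob")]
conn = sqlite3.connect(":memory:")
write_rows(conn, "events", columns, rows)
stored = conn.execute('SELECT id, name FROM "events" ORDER BY id').fetchall()
```

Row values are passed as parameters rather than formatted into the statement, which keeps quoting and escaping in the driver's hands.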
Other Export Methods
File Format Comparison
| Format | Read Speed | Write Speed | File Size | Schema | Best For |
|---|---|---|---|---|---|
| Parquet | Fast | Fast | Small | Yes | Large datasets, analytics |
| CSV | Medium | Fast | Large | No | Compatibility, simple data |
| JSON | Slow | Medium | Large | Partial | APIs, nested data |
| Excel | Slow | Slow | Medium | Partial | Sharing with non-tech users |
| Feather | Very Fast | Very Fast | Medium | Yes | Inter-process, pandas |
Recommendations
- For analytics workloads: Use Parquet
  - Columnar format allows reading only needed columns
  - Excellent compression
  - Preserves data types
- For data exchange: Use CSV or JSON
  - Universal compatibility
  - Human-readable
- For pandas interop: Use Feather or Arrow
  - Fastest serialization
  - Type preservation
Compression Support
Reading Compressed Files
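A minimal sketch of what reading a compressed file involves, using only Python's standard `gzip` and `csv` modules rather than DataStore's own reader. The function name `read_gzip_csv`, the file name, and the sample rows are assumptions made for illustration.

```python
import csv
import gzip
import os
import tempfile


def read_gzip_csv(path):
    """Read a gzip-compressed CSV into a list of dicts, decompressing on the fly."""
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Create a small compressed file to read back (sample data is made up).
path = os.path.join(tempfile.mkdtemp(), "sample.csv.gz")
with gzip.open(path, mode="wt", newline="", encoding="utf-8") as f:
    f.write("id,name\n1,alice\n2,bob\n")

records = read_gzip_csv(path)
```

The file is decompressed as a stream while it is parsed, so no uncompressed copy is written to disk.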
Writing Compressed Files
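One common convention is to choose the codec from the output file's suffix. The sketch below illustrates that idea with the standard library only. It is not DataStore's writer API, and the `OPENERS` mapping, the `open_for_write` helper, and the file name are hypothetical.

```python
import bz2
import gzip
import os
import tempfile

# Hypothetical mapping from file suffix to a standard-library opener.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open}


def open_for_write(path):
    """Open path for text writing, compressing when the suffix is recognized."""
    suffix = os.path.splitext(path)[1]
    opener = OPENERS.get(suffix, open)  # fall back to an uncompressed file
    return opener(path, mode="wt", encoding="utf-8", newline="")


csv_text = "id,name\n1,alice\n2,bob\n"
out_dir = tempfile.mkdtemp()
gz_path = os.path.join(out_dir, "export.csv.gz")
with open_for_write(gz_path) as f:
    f.write(csv_text)

# Read the file back to confirm the round trip.
with gzip.open(gz_path, mode="rt", encoding="utf-8", newline="") as f:
    restored = f.read()
```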
Compression Options
| Compression | Speed | Ratio | Use Case |
|---|---|---|---|
| snappy | Very Fast | Low | Default for Parquet |
| lz4 | Very Fast | Low | Speed priority |
| gzip | Medium | High | Compatibility |
| zstd | Fast | Very High | Best balance |
| bz2 | Slow | Very High | Maximum compression |
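The ratio column can be checked on your own data. The sketch below measures only gzip and bz2, because those two ship with Python's standard library; snappy, lz4, and zstd need third-party packages and are left out. The sample payload is synthetic, and real ratios depend heavily on the data.

```python
import bz2
import gzip


def compressed_sizes(data: bytes) -> dict:
    """Return the byte size of the raw payload and of each compressed form."""
    return {
        "raw": len(data),
        "gzip": len(gzip.compress(data)),
        "bz2": len(bz2.compress(data)),
    }


# Synthetic, repetitive CSV-like payload (an assumption for illustration).
sample = b"".join(b"%d,user_%d,active\n" % (i, i % 50) for i in range(20000))
sizes = compressed_sizes(sample)

for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / sizes['raw']:.1%} of raw)")
```

Timing the same calls, for example with `time.perf_counter`, gives a rough view of the speed column for these two codecs.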
Streaming I/O
For very large files that don't fit in memory:
Chunked Reading
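Chunked reading processes a fixed number of rows at a time, so memory use is bounded by the chunk size, not the file size. The following is a standard-library sketch of that pattern, not DataStore's own chunking interface. The generator name `iter_csv_chunks` and the in-memory sample are hypothetical.

```python
import csv
import io
from itertools import islice


def iter_csv_chunks(file_obj, chunk_size):
    """Yield lists of at most chunk_size rows (as dicts) from a CSV file object."""
    reader = csv.DictReader(file_obj)
    while True:
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            return
        yield chunk


# An in-memory stand-in for a large file: 10 data rows under a header.
source = io.StringIO("id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(10)))

total = 0
chunk_lengths = []
for chunk in iter_csv_chunks(source, chunk_size=4):
    chunk_lengths.append(len(chunk))
    # Aggregate per chunk so only one chunk is held in memory at a time.
    total += sum(int(row["value"]) for row in chunk)
```

With a real file, pass an open file handle in place of the `io.StringIO` object. The aggregation step can be swapped for any per-chunk processing.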
Using ClickHouse Streaming
Remote Data Sources
HTTP/HTTPS
S3
GCS, Azure, HDFS
See Factory Methods for cloud storage options.