Build ClickHouse with DEFLATE_QPL
- Make sure your host machine meets the QPL required prerequisites
- deflate_qpl is enabled by default during the cmake build. In case you accidentally changed it, please double-check the build flag: ENABLE_QPL=1
- For generic requirements, please refer to the ClickHouse generic build instructions
Run Benchmark with DEFLATE_QPL
Files list
The folder benchmark_sample under qpl-cmake gives an example of running the benchmark with Python scripts:
client_scripts contains Python scripts for running a typical benchmark, for example:
- client_stressing_test.py: The Python script for the query stress test with [1~4] server instances.
- queries_ssb.sql: The file that lists all queries for the Star Schema Benchmark.
- allin1_ssb.sh: This shell script executes the whole benchmark workflow automatically.
database_files stores the database files for each codec (lz4/deflate/zstd).
Run benchmark automatically for Star Schema:
After the run completes, check all the results in this folder: ./output/
If the automatic run fails, run the benchmark manually as described in the sections below.
Definition
[CLICKHOUSE_EXE] means the path to the clickhouse executable.
Environment
- CPU: Sapphire Rapids
- OS requirements: refer to System Requirements for QPL
- IAA setup: refer to Accelerator Configuration
- Install python modules:
[Self-check for IAA]
The expected output looks like this:
If there is no output, IAA is not ready to work. Please check the IAA setup again.
Generate raw data
Use dbgen to generate 100 million rows of data with the parameter: -s 20
Files matching *.tbl are expected to be written under ./benchmark_sample/rawdata_dir/ssb-dbgen:
Database setup
Set up database with LZ4 codec
Here you should see the message Connected to ClickHouse server on the console, which means the client has successfully set up a connection with the server.
Complete the three steps below, as described in Star Schema Benchmark:
- Creating tables in ClickHouse
- Inserting data. Use ./benchmark_sample/rawdata_dir/ssb-dbgen/*.tbl as input data.
- Converting "star schema" to de-normalized "flat schema"
Set up database with IAA Deflate codec
Complete the same three steps as for LZ4 above.
Set up database with ZSTD codec
Complete the same three steps as for LZ4 above.
[self-check] For each codec (lz4/zstd/deflate), execute the query below to make sure the databases were created successfully:
You are expected to see the output below:
[Self-check for IAA Deflate codec]
The first time you execute an insertion or query from the client, the clickhouse server console is expected to print this log:
If you never see this, but see another log as below:
That means the IAA devices are not ready; you need to check the IAA setup again.
Benchmark with single instance
- Before starting the benchmark, please disable C6 and set the CPU frequency governor to performance.
- To eliminate the impact of memory bound across sockets, we use numactl to bind the server to one socket and the client to another socket.
- Single instance means a single server connected to a single client.
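The governor requirement in the list above can be verified before a run. The following is a minimal sketch, assuming a Linux host that exposes the standard cpufreq sysfs files (/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor). The function name is hypothetical and is not part of the benchmark scripts. It only reads the current setting; it does not change the governor and does not inspect C-states.

```python
from pathlib import Path


def cpus_not_in_performance(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU whose governor is not 'performance'."""
    offenders = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(cpu_root).glob(pattern)):
        governor = gov_file.read_text().strip()
        if governor != "performance":
            # gov_file is <root>/cpuN/cpufreq/scaling_governor, so two levels up is cpuN
            offenders[gov_file.parent.parent.name] = governor
    return offenders


if __name__ == "__main__":
    bad = cpus_not_in_performance()
    if bad:
        print("CPUs not using the performance governor:", bad)
    else:
        print("No CPU reports a governor other than performance.")
```

An empty result can also mean that the host exposes no cpufreq files at all, so it is worth confirming that the directory exists on the machine under test.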
Now run the benchmark for LZ4, Deflate, and ZSTD respectively:
LZ4:
IAA deflate:
ZSTD:
Now three logs should be output as expected:
How to check performance metrics:
We focus on QPS. Search the logs for the keyword QPS_Final and collect the statistics.
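Collecting the QPS_Final values by hand is error-prone when there are several logs. The sketch below shows one way to gather them in Python. The exact format of the log line is not shown here, so the regular expression is an assumption: it takes the first number that follows the keyword on the same line and may need adjusting. The function name and the log file names in the usage comment are placeholders.

```python
import re
import sys

KEYWORD = "QPS_Final"
# Assumption: the QPS value is the first number after the keyword on the same line.
NUMBER_AFTER_KEYWORD = re.compile(re.escape(KEYWORD) + r"\D*?(\d+(?:\.\d+)?)")


def collect_qps(lines):
    """Return the numeric values found on lines that contain the keyword."""
    values = []
    for line in lines:
        match = NUMBER_AFTER_KEYWORD.search(line)
        if match:
            values.append(float(match.group(1)))
    return values


if __name__ == "__main__":
    # Usage (file names are placeholders): python collect_qps.py first.log second.log
    for path in sys.argv[1:]:
        with open(path, errors="replace") as log:
            print(path, collect_qps(log))
```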
Benchmark with multi-instances
- To reduce the impact of memory bound when too many threads are used, we recommend running the benchmark with multiple instances.
- Multi-instance means multiple (2 or 4) servers, each connected to its own client.
- The cores of one socket need to be divided equally and assigned to the servers respectively.
- For multi-instances, you must create a new folder for each codec and insert the dataset by following steps similar to the single-instance case.
There are 2 differences:
- On the client side, you need to launch clickhouse with the assigned port during table creation and data insertion.
- On the server side, you need to launch clickhouse with the specific xml config file in which the port has been assigned. All customized xml config files for multi-instances are provided under ./server_config.
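The equal division of one socket's cores among the server instances can be sketched as follows. This is an illustration only: the helper name is hypothetical, and it assumes the cores of the socket are numbered contiguously starting at first_core. Hyper-threading sibling cores, which are usually numbered in a separate range, are not handled. The resulting ranges are in the form accepted by the -C (--physcpubind) option of numactl.

```python
def split_cores(cores_per_socket, instances, first_core=0):
    """Split one socket's cores into equal, contiguous ranges, one per server instance."""
    if instances <= 0 or cores_per_socket % instances != 0:
        raise ValueError("cores_per_socket must be divisible by the number of instances")
    share = cores_per_socket // instances
    ranges = []
    for i in range(instances):
        start = first_core + i * share
        ranges.append(f"{start}-{start + share - 1}")
    return ranges


if __name__ == "__main__":
    # Example values: 60 cores per socket, 2 instances
    for idx, core_range in enumerate(split_cores(60, 2), start=1):
        print(f"instance {idx}: cores {core_range}")
```

With 60 cores and 2 instances this yields the ranges 0-29 and 30-59.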
Here we assume there are 60 cores per socket and take 2 instances as an example.
Launch server for first instance
LZ4:
ZSTD:
IAA Deflate:
[Launch server for second instance]
LZ4:
ZSTD:
IAA Deflate:
Creating tables && Inserting data for second instance
Creating tables:
Inserting data:
- [TBL_FILE_NAME] represents the name of a file matching the pattern *.tbl under ./benchmark_sample/rawdata_dir/ssb-dbgen.
- --port=9001 stands for the port assigned to the server instance, which is also defined in config_lz4_s2.xml/config_zstd_s2.xml/config_deflate_s2.xml. For even more instances, replace it with 9002 or 9003, which stand for the s3 and s4 instances respectively. If you don't assign it, the port defaults to 9000, which is already used by the first instance.
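The port numbering described above (9000 for the first instance, then 9001, 9002, 9003 for s2, s3, s4) follows a simple rule that can be written down as a small helper. This is only a sketch; the function names are hypothetical and are not part of client_stressing_test.py.

```python
BASE_PORT = 9000  # default port, used by the first instance


def client_port(instance):
    """Port for the given 1-based instance number (1 to 4)."""
    if not 1 <= instance <= 4:
        raise ValueError("instance must be between 1 and 4")
    return BASE_PORT + instance - 1


def port_argument(instance):
    """Client --port argument for the instance (for instance 1 this equals the default)."""
    return f"--port={client_port(instance)}"


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"instance {n}: {port_argument(n)}")
```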
Benchmarking with 2 instances
LZ4:
ZSTD:
IAA deflate:
Here the last argument of client_stressing_test.py, 2, stands for the number of instances. For more instances, replace it with 3 or 4. This script supports up to 4 instances.
Now three logs should be output as expected:
How to check performance metrics:
We focus on QPS. Search the logs for the keyword QPS_Final and collect the statistics.
The benchmark setup for 4 instances is similar to the 2-instance setup above. We recommend using the 2-instance benchmark data as the final report for review.
Tips
Each time before launching a new clickhouse server, make sure no clickhouse process is still running in the background. Check for old processes and kill them:
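The same check can be scripted. The sketch below is one possible approach in Python, assuming a Linux host where `ps -eo pid,args` is available. All function names are hypothetical. The match is a plain substring test on the command line, so it also reports clickhouse clients and any script whose path contains "clickhouse"; review the list before killing anything.

```python
import os
import signal
import subprocess


def find_pids(ps_output, needle="clickhouse", skip=()):
    """Parse 'PID ARGS' lines and return the PIDs whose command line contains needle."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # header line or unexpected format
        pid, args = int(parts[0]), parts[1]
        if needle in args and pid not in skip:
            pids.append(pid)
    return pids


def stale_clickhouse_pids():
    """List running processes with ps and return matching PIDs, excluding this script."""
    listing = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    return find_pids(listing, skip={os.getpid()})


def kill_pids(pids):
    """Forcefully terminate the given processes (equivalent to kill -9)."""
    for pid in pids:
        os.kill(pid, signal.SIGKILL)
```

A typical use is to print the result of stale_clickhouse_pids() first and call kill_pids() only after confirming the list.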
By comparing the query list in ./client_scripts/queries_ssb.sql with the official Star Schema Benchmark, you will find that 3 queries are not included: Q1.2/Q1.3/Q3.4. This is because CPU utilization is very low (< 10%) for these queries, so they cannot demonstrate performance differences.