Welcome to Our House!

Come on in and stay a while!! We would love to introduce ourselves and say "Hello".

No thanks

You must have cookies enabled to view this lesson.

Getting Started with ClickHouse


Duration: 10 minutes

Audience: Anyone brand new to ClickHouse

Issues/Questions: Contact us at [email protected]

Overview: In this lesson, you will get ClickHouse up and running on your local machine, create a new database and table, and insert some data into that table.

Let’s get started!


Prerequisites:

  • You will need a Linux or Mac OS X machine
  • Optionally, you can use Docker
  • Windows users will need to either:
    • run ClickHouse within the WSL (Windows Subsystem for Linux)
    • install Docker on Windows
    • run Linux in a virtual machine using something like VirtualBox
    • create a Linux instance using your favorite cloud provider

1. Installing ClickHouse

There are several ways to install ClickHouse, including DEB and RPM packages. We also provide several pre-built binaries that you can simply download and run. Click on Show instructions to get started…

Show instructions
  1. In this lesson, you will download a pre-built binary of ClickHouse. Start by opening a terminal and creating a new folder for ClickHouse to run in:

    mkdir clickhouse
    cd clickhouse
    
  2. Find your OS in the following table, then copy-and-paste the corresponding command to download a pre-built ClickHouse binary and make it executable:

    Operating SystemRun this command:
    MacOS x86_64curl -O 'https://builds.clickhouse.com/master/macos/clickhouse' && chmod a+x ./clickhouse
    MacOS Aarch64curl -O 'https://builds.clickhouse.com/master/macos-aarch64/clickhouse' && chmod a+x ./clickhouse
    FreeBSD x86_64curl -O 'https://builds.clickhouse.com/master/freebsd/clickhouse' && chmod a+x ./clickhouse
    Linux x86_64curl -O 'https://builds.clickhouse.com/master/amd64/clickhouse' && chmod a+x ./clickhouse
    Linux AArch64curl -O 'https://builds.clickhouse.com/master/aarch64/clickhouse' && chmod a+x ./clickhouse
    Dockerdocker run -d --name my-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server

You will start ClickHouse in the next step, unless you used the Docker command - in which case the ClickHouse server is already running inside the container.


2. Starting the ClickHouse Server

The clickhouse binary is used to run both the server and the client. Let’s start up the server…

Show instructions
  1. Run the following command to start the ClickHouse server:

    ./clickhouse server
    
  2. It will only take a few seconds for ClickHouse to start - look for the following message in the console output:

<Information> Application: Ready for connections.

  1. The first time you run the ClickHouse server, it creates a collection of folders and files. Open a new terminal window, change directories to your clickhouse folder, and view its contents:

    cd clickhouse
    ls -la
    

    Your folder list should look similar to the following:

    drwxr-x---  2 user user         64 <date> backups
    -rwxr-xr-x  1 user user  263775639 <date>  clickhouse
    drwxr-x---  4 user user        128 <date>  data
    drwxr-x---  2 user user         64 <date>  dictionaries_lib
    drwxr-x---  2 user user         64 <date>  flags
    drwxr-x---  2 user user         64 <date>  format_schemas
    drwxr-x---  8 user user        256 <date>  metadata
    drwxr-x---  2 user user         64 <date>  metadata_dropped
    drwxr-x---  3 user user         96 <date>  preprocessed_configs
    -rw-r-----  1 user user         59 <date>  status
    drwxr-x---  4 user user        128 <date>  store
    drwxr-x---  2 user user         64 <date>  tmp
    drwxr-x---  2 user user         64 <date>  user_defined
    drwxr-x---  2 user user         64 <date>  user_files
    drwxr-x---  2 user user         64 <date>  user_scripts
    -rw-r-----  1 user user         36 <date>  uuid
    

    That’s it! You now have the ClickHouse server up and running. Let’s see how to interact with it…


3. The ClickHouse Play UI

The ClickHouse server includes a Play UI for executing SQL commands. In this step you will see how to access the Play UI and run SQL commands.

Show instructions
  1. Go to http://localhost:8123/play to access the Play UI:

  1. You can simply type in SQL commands and click the Run button to execute them. For example, run the following command to view the pre-defined databases:

    SHOW DATABASES
    

    You should see 4 databases:

The default database is initially empty and is used for executing commands that do not specify a database. The system database is where ClickHouse stores the details about your ClickHouse deployment.

  1. Run the following command to view the system table names:

    SHOW TABLES IN system
    

    Notice there are over 60 tables.

ClickHouse supports an extended SQL-like language that you will quickly become comfortable with. For example, commands like CREATE and SELECT are essentially identical to ANSI SQL.


4. Defining a database and table

Defining a database will look very familiar. Defining a table involves some ClickHouse-specific details.

Show instructions
  1. You use the CREATE DATABASE command to create a new database. Run the following command in the Play UI to define a new database named gettingstarted:

    CREATE DATABASE gettingstarted
    
  2. You should see gettingstarted now in the list of databases:

    SHOW DATABASES
    
  3. Creating a table is a bit different in that ClickHouse has its own data types, and every table must specify an Engine property that determines the type of table to be created.

    Run the following command to define a new MergeTree table named clickstream in the gettingstarted database:

    CREATE TABLE gettingstarted.clickstream (
        customer_id String, 
        time_stamp Date, 
        click_event_type String,
        country_code FixedString(2), 	
        source_id UInt64
    ) 
    ENGINE = MergeTree()
    ORDER BY (time_stamp)
    

The table engine determines how and where the data is stored, which queries are supported, support for concurrency, and other details that you will need to gradually understand as you work with ClickHouse. For now, we will use the MergeTree engine - a good starting point when you are not sure which engine you need.

  1. Verify clickstream was created successfully:

    DESCRIBE gettingstarted.clickstream
    

    Your table should look like the following:

Visit the docs for more details on the various ClickHouse data types, but here are a few notes about the data types in your new table:

  • The String type replaces VARCHAR, BLOB, CLOB and other string-like data types from other databases
  • UInt64 is a 64-bit unsigned integer
  • Date is one of several ways to store dates in ClickHouse
  • If you know the precise length of all strings in a column, then use the FixedString(n) data type

5. Inserting data into ClickHouse

Now that you have a table ready to go, let’s insert some data…

Show instructions
  1. A “typical” SQL INSERT can be used:

    INSERT INTO gettingstarted.clickstream 
    VALUES ('customer1', '2021-10-02', 'add_to_cart', 'US', 568239 ) 
    
  2. You can also specify column names (always a best practice):

    INSERT INTO gettingstarted.clickstream (customer_id, time_stamp, click_event_type) 
    VALUES ('customer2', '2021-10-30', 'remove_from_cart' ) 
    
  3. There is even an EXCEPT option for excluding columns:

    INSERT INTO gettingstarted.clickstream (* EXCEPT(country_code)) 
    VALUES ('customer3', '2021-11-07', 'checkout', 307493 ) 
    
  4. You should now have three rows in your table:

    SELECT * FROM gettingstarted.clickstream
    

  1. You probably have data in different formats that needs to be inserted into ClickHouse. For details on all the various supported input formats, check out the formats in the ClickHouse documentation.

  2. Play around with the Play UI and run a few SELECT commands to see how similar ClickHouse commands are with SQL. Here are a few examples to try out:

    SELECT * FROM gettingstarted.clickstream WHERE country_code = 'US'
    
    SELECT source_id FROM gettingstarted.clickstream WHERE time_stamp >= '2021-11-01'
    
    SELECT * FROM gettingstarted.clickstream WHERE time_stamp >= toDate(1633193802)
    

  • The toDate function converts an epoch timestamp into a Date object

6. The ClickHouse Client

The clickhouse binary includes a client app that connects to a ClickHouse server and allows you to execute SQL commands from a command line or shell script. The client is easy to start…

Show instructions
  1. If you are not using Docker, then run the following command from within the clickhouse folder where you downloaded the binary:

    ./clickhouse client 
    

    If you are using Docker, execute the following command to download and run the ClickHouse client image:

    docker run -it --rm --link my-clickhouse-server:clickhouse-server clickhouse/clickhouse-client --host clickhouse-server
    
  2. You should see the prompt for the ClickHouse client:

    $ ./clickhouse client
    ClickHouse client version 21.11.1.8277 (official build).
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 21.11.1 revision 54449.
    
    my-host :)
    
  3. Now you can run queries as expected with any database client:

    SELECT * FROM gettingstarted.clickstream WHERE click_event_type = 'checkout'
    
  4. To exit the client, enter exit:

    my-host :) exit
    Bye.
    

If you have SQL commands stored in a file, you can pipe the file into the ClickHouse client:

cat my.sql | ./clickhouse client -mn

Congratulations!! You have now started your ClickHouse journey…


What’s next: Welcome to the ClickHouse neighborhood! Check out the following lessons to continue your journey:


We hope you found this learning module helpful and informative! Please feel free to contact us at [email protected] with any issues, questions, feedback, or ideas for future learning modules.