Skip to main content

Connect Superset to ClickHouse

Apache Superset is an open-source data exploration and visualization platform written in Python. Superset connects to ClickHouse using a Python driver with a SQLAlchemy dialect. Let's see how it works...

1. Install the Drivers

  1. Superset uses the clickhouse-sqlalchemy driver, which requires the clickhouse-driver to connect to ClickHouse. The details of clickhouse-driver are at https://pypi.org/project/clickhouse-driver/ and can be installed with the following command:

    pip install clickhouse-driver 
  2. Now install the ClickHouse SQLAlchemy driver:

    pip install clickhouse-sqlalchemy
  3. Start (or restart) Superset.

2. Connect Superset to ClickHouse

  1. Within Superset, select Data from the top menu and then Databases from the drop-down menu. Add a new database by clicking the + Database button:

    Add a new database
  2. In the first step, select ClickHouse as the type of database:

    Select ClickHouse
  3. In the second step, enter a display name for your database and the connection URI. The DISPLAY NAME can be any name you prefer. The SQLALCHEMY URI is the important setting - it has the following format:

    clickhouse+native://username:[email protected]/database_name

    In the example below, ClickHouse is running on localhost with the default user and no password. The name of the database is covid19db. Use the TEST CONNECTION button to verify that Superset is connecting to your ClickHouse database properly:

    Test the connection
  4. Click the CONNECT button to complete the setup wizard, and you should see your database in the list of databases.

3. Add a Dataset

  1. To define new charts (visualizations) in Superset, you need to define a dataset. From the top menu in Superset, select Data, then Datasets from the drop-down menu.

  2. Click the button for adding a dataset. Select your new database as the datasource and you should see the tables defined in your database. For example, the covid19db database has a table named daily_totals:

    New dataset
  1. Click the ADD button at the bottom of the dialog window and your table appears in the list of datasets. You are ready to build a dashboard and analyze your ClickHouse data!

4. Creating charts and a dashboard in Superset

If you are familiar with Superset, then you will feel right at home with this next section. If you are new to Superset, well...it's like a lot of the other cool visualization tools out there in the world - it doesn't take long to get started, but the details and nuances get learned over time as you use the tool.

  1. You start with a dashboard. From the top menu in Superset, select Dashboards. Click the button in the upper-right to add a new dashboard. The following dashboard is named Covid-19 Dashboard:

    New dashboard
  2. To create a new chart, select Charts from the top menu and click the button to add a new chart. You will be shown a lot of options. The following example shows a Big Number chart using the daily_totals dataset from the CHOOSE A DATASET drop-down:

    New chart
  3. You need to add a metric to a Big Number. The column named DATA and the section named Query with a METRIC field show a red warning because they are not defined yet. To add a metric, click Add metric and a small dialog window appears:

    Add a metric
  4. The following example uses the SUM metric, found on the the SIMPLE tab. It sums the values of the new_cases column:

    The SUM metric
  5. To view the actual number, click the RUN QUERY button:

    Run the query
  6. Click the SAVE button to save the chart, then select Covid-19 Dashboard under the ADD TO DASHBOARD drop-down, then SAVE & GO TO DASHBOARD saves the chart and adds it to the dashboard:

    Add Chart to Dashboard
  7. That's it. Building dashboards in Superset based on data in ClickHouse opens up a whole world of blazing fast data analytics!

    New Dashboard