Welcome to Our House!

Come on in and stay a while!! We would love to introduce ourselves and say "Hello".

No thanks

You must have cookies enabled to view this lesson.

Analyzing Covid-19 Data with Grafana and ClickHouse


Duration: 20 minutes

Audience: Anyone interested in learning how to connect Grafana to ClickHouse for building visualizations and dashboards

Issues/Questions: Contact us at [email protected]

Overview: In this lesson, you will learn how to analyze data in ClickHouse using Grafana. The data is already indexed for you, so the focus will be on how to configure Grafana to connect to ClickHouse.

Let’s get started!


Prerequisites

  • You will need Docker installed and running so that you can start up a Docker Compose file

1. Start ClickHouse and Grafana

You are going to run a preconfigured ClickHouse server in a Docker container that will already have the Covid-19 data inserted into a table, and in a different Docker container you are going to run an out-of-the-box Grafana instance.

Show instructions
  1. Let’s start by creating a local folder to work in (feel free to name the folder anything you like):

    mkdir ~/clickhouse-covid19
    cd ~/clickhouse-covid19
    
  2. Create a new file named docker-compose.yml, and copy-and-paste the following into it:

    version: '3.7'
    services:
      clickhouse-covid19:
        image: learnclickhouse/public-repo:clickhouse-covid19-21.9
        container_name: clickhouse-covid19
        hostname: clickhouse-covid19
        ports:
          - "9000:9000"
          - "8123:8123"
          - "9009:9009"
        restart: always
        ulimits:
          memlock:
            soft: -1
            hard: -1
          nofile:
            soft: 262144
            hard: 262144
        deploy:
          resources:
            limits:
              memory: 2g
    
      grafana:
        image: grafana/grafana:8.1.5-ubuntu
        container_name: grafana
        hostname: grafana
        ports:
          - "3000:3000"
        environment:
          - GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=vertamedia-clickhouse-datasource
        restart: always
        deploy:
          resources:
            limits:
              memory: 2g
    

As you can see, the grafana image is just being pulled from Grafana’s Docker Hub, so we are starting with a brand new instance of Grafana. On the other hand, clickhouse-covid19-21.9 is a special image built just for this lesson that inserts the Covid-19 data into a table on startup.

  1. From a terminal, run the following command from the folder where you created docker-compose.yml:

    docker-compose up
    
  2. It will take a minute - the entrypoint script has a delay to ensure that ClickHouse starts up before the database is created and populated. After a minute, open the Play UI and run the following command:

    SELECT count(*) FROM covid19db.daily_totals
    

You should see 118,603 rows:


2. View the Covid-19 data

Now that you have ClickHouse running, let’s take a look at the Covid-19 data that was inserted.

Show instructions
  1. The covid19db database has a single table named daily_totals. From the Play UI, run the following command to view the schema of daily_totals:

    DESCRIBE covid19db.daily_totals
    
  2. Each row contains daily Covid-19 numbers from various countries. For example, the following query to view some rows:

    SELECT * FROM covid19db.daily_totals LIMIT 100
    
  3. The following query shows the number of new cases by day in the United States, with the highest totals first:

    SELECT new_cases, date 
    FROM covid19db.daily_totals 
    WHERE location = 'United States' 
    ORDER BY new_cases DESC
    

Feel free to run some queries to get a better sense of the data - it has a lot of columns.


3. Install the Grafana Plugin for ClickHouse

Before Grafana can talk to ClickHouse, you need to install the appropriate Grafana plugin…

Show instructions
  1. Login to Grafana at http://localhost:3000/. The username and password are both admin. You will be prompted to change the password - but notice there is a link to skip that step if desired.

  1. Select the Configuration menu (the gear icon in the left column) and select Data sources. Select the Add data source button and search for ClickHouse - notice the list is empty:

  1. You need to install the Grafana plugin for ClickHouse, which can be done from the command line. Run the following command to connect to the grafana Docker container:

    docker exec -it grafana /bin/bash
    
  2. At the prompt, install the plugin with the following command:

    grafana-cli plugins install vertamedia-clickhouse-datasource
    
  3. Notice you need to restart Grafana before the plugin is available. Start by typing in exit to get out of the Docker container:

    exit
    
  4. Now run the following command to restart the Grafana container:

    docker restart grafana
    
  5. To verify the plugin is working, go back to Grafana and reload the page for defining data sources. This time when you select the Add data source button - you should see ClickHouse in the list:

In the next step, you will define a new data source for ClickHouse.


4. Configure a ClickHouse data source in Grafana

Now that you have the ClickHouse plugin installed, let’s define a data source in Grafana that connects to the covid19db database.

Show instructions
  1. From the Add data source page in Grafana, click the Select button next to ClickHouse. The following dialog appears:

  1. Enter the following values:
  • Name: my-clickhouse-ds
  • URL: http://clickhouse-covid19:8123

and make sure the Default option is selected.

  1. Scroll down and click the Save and test button. You should see a Data source is working message:

  1. Click the Back button and your new data source should appear on the list:

You are now ready to build a dashboard!


5. Build a Dashboard

Now that you have a data source configured, let’s build a dashboard…

Show instructions
  1. From the menu, click on the Dashboards menu and select the Manage icon. Then select the New Dashboard button:

  1. Dashboards are initially empty. Click the Add an empty panel button to create a new panel.

  2. Using the time picker, change the time interval to the last 2 years:

  1. Next you will select a database table. Make sure the Data source is my-clickhouse-ds, then click the Edit icon (the one that looks like a pencil):

  1. Modify the query as follows:
  • select covid19db for the database
  • select daily_totals for the table
  • select Column:DateTime64 for the data type and time_stamp as the field for the time selector

  1. Click the Go to Query button to view the query. Notice the metric is a simple count(), and the line chart should populate with the number of daily events. Change the name of the panel to Number of Events then click the Apply button:

  1. The panel will appear on your new dashboard:

  1. Add another panel, but this time change the query to show the number of new cases every day:

  1. Name the panel Daily New Cases and click Apply to add it to your dashboard:

  1. You can add multiple query results on the same panel. From the dashboard, click on the name Daily New Cases and select Edit from the drop-down menu to return back to the Edit panel page.

  2. Click the + Query button below the first query:

  1. Instead of count(), compute the SUM(new_tests_smoothed) for query B:

  1. Change the name to Daily New Cases vs. Tests and click Apply to view the updated panel:

  1. By the way, you can save your dashboard by clicking the Save dashboard icon in the top-right toolbar - you will be prompted for a name as well.

  2. Feel free to add some more panels and experiment with other visualizations like pie charts, bar charts, heat maps and all the other fun features of Grafana.

Congratulations on connecting Grafana to ClickHouse!! You have opened up a whole new world of blazing fast data analytics.


What’s next: Check out the following lessons to continue your journey:


We hope you found this learning module helpful and informative! Please feel free to contact us at [email protected] with any issues, questions, feedback, or ideas for future learning modules.