Welcome to Our House!

Come on in and stay a while!! We would love to introduce ourselves and say "Hello".

No thanks

You must have cookies enabled to view this lesson.

Connecting Metabase to ClickHouse


Duration: 20 minutes

Audience: Metabase users interested in connecting Metabase to ClickHouse for answering questions

Issues/Questions: Contact us at [email protected]

Overview: In this lesson, you will analyze data that has details about Spotify usage (the artists and how many daily streams of their songs occurred). The data is already indexed for you, so the focus will be on configuring Metabase to connect to ClickHouse.

According to their website, Metabase brings your data to life in beautiful visualizations with an intuitive interface that makes data exploration feel like second nature — opening data up for everyone, not just analysts and developers.

After setting it all up, I have to agree! Metabase approaches visualizations from the concept of answering questions, instead of building charts and graphs. It is fairly intuitive, no SQL is needed, and the UI feels smooth and responsive.

Let’s get started!


Prerequisites

  • You will be running a couple of Docker images, so you will need Docker installed if you want to follow along.

1. Startup ClickHouse

We have built a Docker image that already has ClickHouse installed, along with a table that contains the Spotify data:

Show instructions
  1. Run the appropriate command for your environment to startup the Docker container:
  • On Linux, you will need to use the --network host option:

    docker run -it  --name clickhouse-spotify --network host -p 9000:9000 -p 9009:9009 -p 8123:8123 --platform linux/amd64 --ulimit nofile=262144:262144 learnclickhouse/public-repo:clickhouse-spotify-21.10
    

    On all other environments, use this command:

    docker run -it  --name clickhouse-spotify -p 9000:9000 -p 9009:9009 -p 8123:8123 --platform linux/amd64 --ulimit nofile=262144:262144 learnclickhouse/public-repo:clickhouse-spotify-21.10
    
  1. Wait about 30 seconds for the clickhouse-spotify container to startup and also for the data to get inserted into the spotify database.

  2. Point your web browser to http://localhost:8123/play. You should see the embedded ClickHouse Play UI:

  1. Let’s run a few queries to understand what the dataset looks like. Copy-and-paste the following query into the UI, then click the Run button (or press Ctrl/Cmd+Enter):

    SHOW TABLES IN spotify
    

    You should see a table named songs.

  2. The following command shows the schema of songs:

    DESCRIBE spotify.songs
    
  3. View some of the data in the tables. This query displays 100 days' worth of streaming data:

    SELECT * FROM spotify.songs LIMIT 100
    
  4. To see the most popular songs, sort by the Streams column:

    SELECT * FROM spotify.songs 
    ORDER BY Streams DESC
    

Top streams for Spotify

It looks like 2017 was a good year Kendrick Lamar, Post Malone, Taylor Swift and Drake.


2. Download the ClickHouse plugin for Metabase

Now that you have Metabase running, it requires a plugin to connect to ClickHouse.

Show instructions
  1. The plugin is a JAR file that needs to end up in the plugins folder of your Metabase deployment. For this tutorial, you will download the driver and mount it to the Docker container. Create a new folder to work in:

    mkdir -p ~/metabase/plugins
    chown 777 ~/metabase/plugins
    
  2. For this tutorial we used version 0.7.5 from the Metabase ClickHouse Driver repository on GitHub. Download the JAR file using this URL:

    https://github.com/enqueue/metabase-clickhouse-driver/releases/tag/0.7.5
    
  3. Save it in the ~/metabase/plugins folder.

That’s it! You will use the plugin in the next step…


3. Startup Metabase

Metabase has a nice Docker image for running Metabase locally.

Show instructions

It appears that the Docker image from Metabase does not run on the new M1 Macs, so if you are following along you will need to use a different platform, or run a local Metabase instance outside of Docker.

  1. Use the mount flag to specify the location of the plugins folder. You must use the full path to your new plugins folder. In the following command, replace username with the correct path to your plugins folder.
  • On Linux, the command will look like:

    docker run --network host -d -p 3000:3000 \
      --mount type=bind,source=/home/username/metabase/plugins,destination=/plugins \
      --name metabase metabase/metabase
    
  • On Mac the command is:

    docker run -d -p 3000:3000 \
      --mount type=bind,source=/Users/username/metabase/plugins,destination=/plugins \
      --name metabase metabase/metabase
    
  1. It takes a minute for Metabase to startup, but if you run the following command you can watch the progress:

    docker logs -f metabase
    
  2. Once Metabase is running, access it at http://localhost:3000. You will see the welcome screen:

Welcome to Metabase

  1. You will be asked a bunch of questions - so work your way through the initial setup wizard. When prompted to select a database, you need to click on “I’ll add my data later":

Add Database Later

  1. Complete the setup to work your way into the app:

Metabase Main Screen


4. Connect Metabase to ClickHouse

Let’s see how to connect Metabase to the Spotify database in ClickHouse.

Show instructions
  1. Click on the gear icon in the top-right corner and select Admin to visit your Metabase Admin page.

  2. Click on Add a database:

Add a database

  1. If your driver installation worked, you will see ClickHouse in the Database type. Select ClickHouse and add the following settings (leave the password field blank):

Connection settings

  1. Select the Save button at the bottom and you should see a confirmation that the database was added:

Database is added

  1. Click the Explore this data button and Metabase will scan your database and build some visualizations based on the content. When it’s ready, click on A look at your Songs table:

Metabot

  1. Notice the default dashboard pulled some of the more obvious metrics from the songs table:

Default dashboard

  1. Save the auto-generated dashboard by clicking the Save this button. The dashboard will be saved as A look at your Songs table.

5. Build a Dashboard in Metabase

Let’s build a dashboard of our own.

Show instructions
  1. Click the large + icon in the top-right corner and select New dashboard. Name it Spotify Dashboard:

New dashboard

  1. Metabase cleverly refers to their visualizations as questions. To add a visualization, select the Ask a question button in the top-right area of the toolbar:

Ask a question

  1. Let’s display a simple table that lists the artists with the most number of streams. Select Simple question and you will be prompted for a data source. Select My ClickHouse Database, then select the Songs table:

The songs table

  1. You will see a table view of some of the records. We need to group by artist to answer our question, so click the Summarize button:

Click the Summarize button

  1. Change the metric to Sum and select the Streams field. (You will see a sum of all streams!) Select Artist to Group by and a table view of all streams by artist appears:

Group by artist

  1. Notice the ordering is currently by artist. Click on the top of the Sum of Streams column and change it to sort descending, then click the Save button:

Save button

  1. Give your question a name:

Save question

  1. When prompted, save the question to your Spotify Dashboard and you will be taken to that dashboard:

Spotify dashboard

  1. Following the same steps, create a line chart that answers “What are the average number of streams per day?":

Average streams

  1. Create a simple metric for the total number of streams all-time:

Total streams

  1. Let’s figure out how Taylor Swift did for the month of October, 2017. Ask a new question, but select Custom question this time. Select the Songs table from your ClickHouse database.

  2. You will see the following screen. Add two filters: Artist is Taylor Swift, and Date field is between 10/01/2017 and 10/31/2017. Click the Summarize button and change the metric to be the sum of the Streams field, then group the results by day:

Taylor Swift

  1. Select the Visualize button to view the resulting line chart:

Taylor Swift

  1. Save and add the line chart to your Spotify dashboard.

  2. Let’s answer more question: which artists have had the most streams on any given day in 2017? This can be answered with an ANY query. Start by selecting Ask a question, then select Native query this time.

  3. Select My ClickHouse Database.

  4. Copy-and-paste the following query into the text area:

    SELECT DISTINCT Artist FROM spotify.songs 
    WHERE Streams = ANY (
        SELECT max(Streams) FROM spotify.songs GROUP BY Date
    ) 
    

Taylor Swift

  1. Click the Play button (see the screenshot above) and the results will appear in a table.

  2. Save it to your Spotify dashboard:

Taylor Swift

  1. Now that have a feel for both Metabase and the Spotify dataset, feel free to play around and answer as many questions as you can.

Congratulations on connecting Metabase to ClickHouse!! We hope you enjoyed this lesson and found it useful and informative.


What’s next: Check out the following lessons to continue your journey:


We hope you found this learning module helpful and informative! Please feel free to contact us at [email protected] with any issues, questions, feedback, or ideas for future learning modules.