Welcome to Our House!
Come on in and stay a while!! We would love to introduce ourselves and say "Hello".
Connecting Metabase to ClickHouse
Duration: 20 minutes
Audience: Metabase users interested in connecting Metabase to ClickHouse for answering questions
Issues/Questions: Contact us at [email protected]
Overview: In this lesson, you will analyze data that has details about Spotify usage (the artists and how many daily streams of their songs occurred). The data is already indexed for you, so the focus will be on configuring Metabase to connect to ClickHouse.
According to their website, Metabase brings your data to life in beautiful visualizations with an intuitive interface that makes data exploration feel like second nature — opening data up for everyone, not just analysts and developers.
After setting it all up, I have to agree! Metabase approaches visualizations from the concept of answering questions, instead of building charts and graphs. It is fairly intuitive, no SQL is needed, and the UI feels smooth and responsive.
Let’s get started!
- You will be running a couple of Docker images, so you will need Docker installed if you want to follow along.
1. Startup ClickHouse
We have built a Docker image that already has ClickHouse installed, along with a table that contains the Spotify data:
- Run the appropriate command for your environment to startup the Docker container:
On Linux, you will need to use the
docker run -it --name clickhouse-spotify --network host -p 9000:9000 -p 9009:9009 -p 8123:8123 --platform linux/amd64 --ulimit nofile=262144:262144 learnclickhouse/public-repo:clickhouse-spotify-21.10
On all other environments, use this command:
docker run -it --name clickhouse-spotify -p 9000:9000 -p 9009:9009 -p 8123:8123 --platform linux/amd64 --ulimit nofile=262144:262144 learnclickhouse/public-repo:clickhouse-spotify-21.10
Wait about 30 seconds for the clickhouse-spotify container to startup and also for the data to get inserted into the spotify database.
Point your web browser to http://localhost:8123/play. You should see the embedded ClickHouse Play UI:
Let’s run a few queries to understand what the dataset looks like. Copy-and-paste the following query into the UI, then click the Run button (or press Ctrl/Cmd+Enter):
SHOW TABLES IN spotify
You should see a table named songs.
The following command shows the schema of songs:
View some of the data in the tables. This query displays 100 days' worth of streaming data:
SELECT * FROM spotify.songs LIMIT 100
To see the most popular songs, sort by the Streams column:
SELECT * FROM spotify.songs ORDER BY Streams DESC
It looks like 2017 was a good year Kendrick Lamar, Post Malone, Taylor Swift and Drake.
2. Download the ClickHouse plugin for Metabase
Now that you have Metabase running, it requires a plugin to connect to ClickHouse.
The plugin is a JAR file that needs to end up in the plugins folder of your Metabase deployment. For this tutorial, you will download the driver and mount it to the Docker container. Create a new folder to work in:
mkdir -p ~/metabase/plugins chown 777 ~/metabase/plugins
For this tutorial we used version 0.7.5 from the Metabase ClickHouse Driver repository on GitHub. Download the JAR file using this URL:
Save it in the ~/metabase/plugins folder.
That’s it! You will use the plugin in the next step…
3. Startup Metabase
Metabase has a nice Docker image for running Metabase locally.
It appears that the Docker image from Metabase does not run on the new M1 Macs, so if you are following along you will need to use a different platform, or run a local Metabase instance outside of Docker.
- Use the mount flag to specify the location of the plugins folder. You must use the full path to your new plugins folder. In the following command, replace username with the correct path to your plugins folder.
On Linux, the command will look like:
docker run --network host -d -p 3000:3000 \ --mount type=bind,source=/home/username/metabase/plugins,destination=/plugins \ --name metabase metabase/metabase
On Mac the command is:
docker run -d -p 3000:3000 \ --mount type=bind,source=/Users/username/metabase/plugins,destination=/plugins \ --name metabase metabase/metabase
It takes a minute for Metabase to startup, but if you run the following command you can watch the progress:
docker logs -f metabase
Once Metabase is running, access it at http://localhost:3000. You will see the welcome screen:
- You will be asked a bunch of questions - so work your way through the initial setup wizard. When prompted to select a database, you need to click on “I’ll add my data later":
- Complete the setup to work your way into the app:
4. Connect Metabase to ClickHouse
Let’s see how to connect Metabase to the Spotify database in ClickHouse.
Click on the gear icon in the top-right corner and select Admin to visit your Metabase Admin page.
Click on Add a database:
- If your driver installation worked, you will see ClickHouse in the Database type. Select ClickHouse and add the following settings (leave the password field blank):
- Select the Save button at the bottom and you should see a confirmation that the database was added:
- Click the Explore this data button and Metabase will scan your database and build some visualizations based on the content. When it’s ready, click on A look at your Songs table:
- Notice the default dashboard pulled some of the more obvious metrics from the songs table:
- Save the auto-generated dashboard by clicking the Save this button. The dashboard will be saved as A look at your Songs table.
5. Build a Dashboard in Metabase
Let’s build a dashboard of our own.
- Click the large + icon in the top-right corner and select New dashboard. Name it Spotify Dashboard:
- Metabase cleverly refers to their visualizations as questions. To add a visualization, select the Ask a question button in the top-right area of the toolbar:
- Let’s display a simple table that lists the artists with the most number of streams. Select Simple question and you will be prompted for a data source. Select My ClickHouse Database, then select the Songs table:
- You will see a table view of some of the records. We need to group by artist to answer our question, so click the Summarize button:
- Change the metric to Sum and select the Streams field. (You will see a sum of all streams!) Select Artist to Group by and a table view of all streams by artist appears:
- Notice the ordering is currently by artist. Click on the top of the Sum of Streams column and change it to sort descending, then click the Save button:
- Give your question a name:
- When prompted, save the question to your Spotify Dashboard and you will be taken to that dashboard:
- Following the same steps, create a line chart that answers “What are the average number of streams per day?":
- Create a simple metric for the total number of streams all-time:
Let’s figure out how Taylor Swift did for the month of October, 2017. Ask a new question, but select Custom question this time. Select the Songs table from your ClickHouse database.
You will see the following screen. Add two filters: Artist is Taylor Swift, and Date field is between 10/01/2017 and 10/31/2017. Click the Summarize button and change the metric to be the sum of the Streams field, then group the results by day:
- Select the Visualize button to view the resulting line chart:
Save and add the line chart to your Spotify dashboard.
Let’s answer more question: which artists have had the most streams on any given day in 2017? This can be answered with an ANY query. Start by selecting Ask a question, then select Native query this time.
Select My ClickHouse Database.
Copy-and-paste the following query into the text area:
SELECT DISTINCT Artist FROM spotify.songs WHERE Streams = ANY ( SELECT max(Streams) FROM spotify.songs GROUP BY Date )
Click the Play button (see the screenshot above) and the results will appear in a table.
Save it to your Spotify dashboard:
- Now that have a feel for both Metabase and the Spotify dataset, feel free to play around and answer as many questions as you can.
Congratulations on connecting Metabase to ClickHouse!! We hope you enjoyed this lesson and found it useful and informative.
What’s next: Check out the following lessons to continue your journey:
- The Ingest Nginx Logs into ClickHouse using Vector lesson demonstrates how to stream a log file into ClickHouse
- Check out What’s New in ClickHouse 21.10
- View all of our lessons on the Learn ClickHouse home page