Filling gaps in time-series data

When working with time-series data, there can be gaps in the data due to missing data or inactivity. Typically, we don't want those gaps to exist when we query the data. In this case, the WITH FILL clause can come in handy. This guide discusses how to use WITH FILL to fill gaps in your time-series data.

Setup

Imagine we've got the following table that stores metadata on images generated by a GenAI image service:

Let's import some records:

Querying by bucket

We're going to explore the images created between 00:24:03 and 00:24:04 on the 24th March 2023, so let's create some parameters for those points in time:

Next, we'll write a query that groups the data into 100ms buckets and returns the count of images created in that bucket:

The result set only includes the buckets where an image was created, but for time-series analysis, we might want to return each 100ms bucket, even if it doesn't have any entries.

WITH FILL

We can use the WITH FILL clause to fill in these gaps. We'll also specify the STEP, which is the size of the gaps to fill. This defaults to 1 second for DateTime types, but we'd like to fill gaps of 100ms in length, so let's an interval of 100ms as our step value:

We can see that the gaps have been filled with 0 values in the count column.

WITH FILL...FROM

There is, however, still a gap at the beginning of the time range, which we can fix by specifying FROM:

We can see from the results that the buckets from 00:24:03.000 to 00:24:03.500 all now appear.

WITH FILL...TO

We're still missing some buckets from the end of the time range though, which we can fill by providing a TO value. TO is not inclusive, so we'll add a small amount to the end time to make sure that it's included:

The gaps have all now been filled and we have entries for every 100 ms from 00:24:03.000 to 00:24:05.000.

Cumulative count

Let's say we now want to keep a cumulative count of the number of images created across the buckets. We can do this by adding a cumulative column, as shown below:

The values in the cumulative column aren't working how we'd like them to.

WITH FILL...INTERPOLATE

Any rows that have 0 in the count column also have 0 in the cumulative column, whereas we'd rather it use the previous value in the cumulative column. We can do this by using the INTERPOLATE clause, as shown below:

That looks much better. And now to finish it off, let's add a bar chart using the bar function, not forgetting to add our new column to the INTERPOLATE clause.

Setup​

Querying by bucket​

WITH FILL​

WITH FILL...FROM​

WITH FILL...TO​

Cumulative count​

WITH FILL...INTERPOLATE​