
The SparseGrams function

Mark Needham

Learn about ClickHouse's sparseGrams() function and how it improves on traditional n-grams for building search solutions. This tutorial walks through the concept step by step, explaining how sparse grams use weighted substrings to filter out common patterns that would otherwise return too many search results. We'll explore the algorithm with practical examples and show you how to use it in ClickHouse.

What You'll Learn:

  • The limitations of traditional n-grams for search indexing at scale
  • How GitHub's sparse grams algorithm solves the "too many results" problem
  • Step-by-step walkthrough of the sparse grams weighting system
  • How to use ClickHouse's sparseGrams() function with practical examples
  • Understanding the crc32 hash function for weight calculations
  • Comparing n-grams vs sparse grams output side-by-side
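
As a rough illustration of the n-grams vs sparse grams comparison, here is a minimal Python sketch. It assumes the commonly described rule: each (n-1)-gram gets a weight from a hash (crc32 here), and a substring is kept when the weights at its two edges are strictly greater than every weight inside it. The function names `ngrams` and `sparse_grams` and the pluggable `weight` parameter are hypothetical, chosen for this example, and the output is not guaranteed to match ClickHouse's sparseGrams() exactly.

```python
import zlib


def ngrams(s, n=3):
    """Traditional n-grams: every contiguous substring of length n."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def crc32_weight(gram):
    """Assumed weighting: crc32 of the UTF-8 bytes of the gram."""
    return zlib.crc32(gram.encode("utf-8"))


def sparse_grams(s, min_len=3, weight=crc32_weight):
    """Sketch of sparse grams (hypothetical helper, not ClickHouse's code).

    Each (min_len - 1)-gram gets a weight. A substring is emitted when the
    weights of its first and last (min_len - 1)-grams are strictly greater
    than the weights of all (min_len - 1)-grams strictly between them.
    """
    k = min_len - 1
    weights = [weight(s[i:i + k]) for i in range(len(s) - k + 1)]
    result = []
    for i in range(len(weights)):
        for j in range(i + 1, len(weights)):
            edge = min(weights[i], weights[j])
            # With no inner grams (j == i + 1) this is always true, so every
            # plain n-gram of length min_len is included.
            if all(w < edge for w in weights[i + 1:j]):
                result.append(s[i:j + k])
    return result


if __name__ == "__main__":
    text = "hello world"
    print("n-grams:     ", ngrams(text, 3))
    print("sparse grams:", sparse_grams(text, 3))
```

Because the edge weights must dominate everything inside, longer substrings survive only around rare (high-weight) grams, which is what makes them more selective than fixed-length n-grams. To compare against ClickHouse itself, run sparseGrams() on the same string and check where the results differ from this sketch.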

➡️ SparseGrams docs
