The SparseGrams function

Mark Needham

Learn about ClickHouse's sparse grams function and how it improves on traditional n-grams for building search solutions. This tutorial walks through the concept step by step, explaining how sparse grams use weights to pick out distinctive substrings, filtering out the common patterns that would otherwise match too many search results. We'll explore the algorithm with practical examples and show you how to use it in ClickHouse.

What You'll Learn:

  • The limitations of traditional n-grams for search indexing at scale
  • How GitHub's sparse grams algorithm solves the "too many results" problem
  • Step-by-step walkthrough of the sparse grams weighting system
  • How to use ClickHouse's sparseGrams() function with practical examples
  • Understanding the crc32 hash function for weight calculations
  • Comparing n-grams vs sparse grams output side-by-side
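To make the weighting idea concrete before watching, here is a minimal Python sketch, not ClickHouse's implementation. It assumes the rule as commonly described: every bigram gets a hash weight, and a substring is kept when the bigrams at its two ends both weigh strictly more than every bigram between them. The function names (ngrams, bigram_weights, sparse_grams) are made up for illustration, and zlib.crc32 is only a stand-in hash, so the output may not match what sparseGrams() returns in ClickHouse.

```python
import zlib


def ngrams(text, n=3):
    """Traditional n-grams: every substring of exactly n characters."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def bigram_weights(text):
    """One weight per bigram. zlib.crc32 is an assumed stand-in hash."""
    return [zlib.crc32(text[i:i + 2].encode("utf-8"))
            for i in range(len(text) - 1)]


def sparse_grams(text, weights=None):
    """Sketch of sparse grams over bigram weights.

    The substring spanning bigram `start` to bigram `end` is kept when both
    border weights are strictly greater than every weight in between.
    """
    if weights is None:
        weights = bigram_weights(text)
    grams = []
    for start in range(len(weights)):
        inner_max = -1  # largest weight strictly between start and end
        for end in range(start + 1, len(weights)):
            if inner_max < min(weights[start], weights[end]):
                # bigram `end` covers characters end and end + 1
                grams.append(text[start:end + 2])
            inner_max = max(inner_max, weights[end])
            if inner_max >= weights[start]:
                break  # no longer span can start at this bigram
    return grams


if __name__ == "__main__":
    word = "clickhouse"
    print("n-grams:     ", ngrams(word))
    print("sparse grams:", sparse_grams(word))
```

Adjacent bigrams have nothing between them, so every trigram is always included; the longer substrings appear only where the weights allow it, which is what makes the resulting set "sparse" compared with generating n-grams of every length.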

➡️ SparseGrams docs
