Machine learning and GenAI

ClickHouse is a real-time database built to power machine learning workloads. With ClickHouse, it's easier than ever to apply GenAI to your analytics data.

  • Simplify your data stack by eliminating the need for ML-specific data stores
  • Use lightning-fast aggregations for data preparation, powering model training at petabyte scale
  • Execute fast and efficient vector search with linear and approximate techniques
  • Plug in pre-built models directly from any provider
  • Develop with the ML tools you already love through our extensive suite of integrations

Agentic Data Stack

Unlock Agent-Facing Analytics within the ClickHouse Cloud console or via the native remote MCP server, and observe your agents with Langfuse.

  • AI Assistant
  • AI Agent
  • Remote MCP Server
  • Docs AI

Find out why companies are using ClickHouse to power their AI workloads.

Best-in-class ingestion rates built to handle continuous streams of data so you can rely on the most up-to-date information to fuel accurate predictions and results.

Unparalleled query performance at scale. Query billions of rows in milliseconds. Reduce iteration times and maximize efficiency with your data.

Powerful automatic scaling that's designed to handle unpredictable workloads. Focus on machine learning without worrying about your infrastructure.

Available as an in-process OLAP SQL engine for Python. Leverage the full power of ClickHouse, directly in your Python code with chDB.

Trusted by developers that work with data at scale

Meta, eBay, Microsoft, Spotify, Lyft, HubSpot, LangChain, Instacart, Contentsquare, Cisco, NGINX, Cloudflare, Rokt, Fisker, Trackingplan, Adevinta, IBM, GitLab, Mux, NetApp, ServiceNow, PostHog, Weights & Biases, Deutsche Bank, Vimeo, Klaviyo, Nansen, Poolside, Sony, Block, Vercel, Ramp, Cognitiv, Statsig, DiDi, Prefect, Cursor, Trip.com, Sierra, Shopee, Common Room, Tekion, Whatnot, Fastly, Astronomer, Corsearch, Upollo, Amp, Electrum, SpotOn, MSKCC, Azur Games, Vantage, SolarWinds, Longbridge, Harvey, RBC, Poizon, REA, Anthropic, Character.AI

ClickHouse for ML & AI

ClickHouse is purpose-built to make deriving insights from complex data effortless, no matter how much data you're working with. Whether you're extracting valuable information for model training and evaluation through aggregations, running inference through User Defined Functions, or performing vector search, ClickHouse enables you to maximize data efficiency and unlock the power of AI for any application.

Data sources: raw files, datastores, servers and other devices, apps

ClickHouse is trusted at scale to ingest and process billions of new events per day from a wide range of sources and formats. For continuous streams of data, ClickPipes seamlessly manages your ingestion pipelines so that you don't have to.

Features like User Defined Functions, described in more depth below, can be used to invoke models at insert time. This gives you the ability to pass incoming data to a model, receive the output, and store these results along with your ingested data, all without having to spin up other processes or jobs.

Native table functions make it easy to query data wherever it lives, whether locally or in object stores such as GCS and S3, and to apply transformations via services like HuggingFace.

ClickHouse User Defined Functions give you the flexibility to run Python scripts, or executables written in whichever language you prefer, directly in ClickHouse. These scripts can be triggered at insert or query time, making it easy to invoke pre-built models from providers like OpenAI and HuggingFace, or your own.
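As a rough sketch of what such a script could look like: an executable UDF is generally a program that reads argument rows from standard input and writes one result per row to standard output. The example below assumes tab-separated input with a single text column. The function score_text is a hypothetical stand-in for a real model call (for instance to OpenAI or HuggingFace), and registering the script with ClickHouse is configuration that is not shown here.

```python
import io
import sys


def score_text(text):
    """Hypothetical placeholder for a model call: share of uppercase letters."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch.isupper() for ch in letters) / len(letters)


def process(lines):
    """Yield one formatted score per input line (assumed: one tab-separated column)."""
    for line in lines:
        text = line.rstrip("\n").split("\t")[0]
        yield f"{score_text(text):.3f}"


def main(stream=None):
    """Read rows from the stream (stdin by default) and print one result per row."""
    stream = sys.stdin if stream is None else stream
    for result in process(stream):
        # Flush after every row so the caller is not left waiting on buffered output.
        print(result, flush=True)


# Demo with an in-memory stream; a deployed script would call main() on stdin instead.
main(io.StringIO("HELLO\nabcd\n"))
```

Keeping the scoring logic in its own function means the placeholder can be swapped for a real model client without touching the row-handling loop.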

Our extensive suite of statistical and aggregation functions scales seamlessly over petabytes of data, providing powerful resources for model training and evaluation. With support for high-precision data types and codecs, you don't need to worry about reducing granularity.

With ClickHouse, executing vector searches using linear or approximate techniques is effortless, with out-of-the-box support and blazing speed.
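To make "linear" search concrete: it is an exhaustive scan that computes the distance from the query vector to every stored vector and keeps the closest matches, while approximate techniques trade some accuracy for speed by consulting an index instead. The pure-Python sketch below illustrates the exhaustive variant using cosine distance. The sample data, names, and helper functions are illustrative assumptions, not ClickHouse APIs.

```python
import heapq
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def linear_search(query, rows, k=2):
    """Exhaustively score every (id, vector) row and return the k closest."""
    scored = ((cosine_distance(query, vec), row_id) for row_id, vec in rows)
    return heapq.nsmallest(k, scored)


# Tiny made-up dataset of (id, embedding) pairs.
rows = [
    ("doc_a", [1.0, 0.0]),
    ("doc_b", [0.0, 1.0]),
    ("doc_c", [1.0, 1.0]),
]
nearest = linear_search([1.0, 0.1], rows, k=2)
print([row_id for _, row_id in nearest])  # ['doc_a', 'doc_c']
```

In a database, the same idea would usually be written as a query that orders rows by a distance function and applies a limit; the cost of the scan grows with the number of stored vectors, which is the motivation for approximate indexes.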

ClickHouse is trusted all over the world to power customer-facing applications, where real-time responsiveness is critical.

With ClickHouse, you have everything you need to enrich your customer experiences through machine learning workloads run on your data, all in one place.

Our vibrant and growing ecosystem of integrations makes it easy to leverage your notebooks, visualization tools, and more, directly with ClickHouse.

Create valuable experiences and insights

Whether you're building engaging personalization features, incorporating semantic search into your product, generating summarized insights from raw content automatically, or more, ClickHouse exposes the features you need to build AI-powered functionality with your data.

Unify your data stack

Eliminate the need for specialized data stores used for specific machine learning tasks, such as vector search. With ClickHouse, you can rely on one unified data store to power your analytics, run your machine learning workloads, and manage your ad-hoc querying, all in one place.

Manage data efficiently

ClickHouse's efficient management of resources helps maximize cost-effectiveness. Our column-oriented design delivers best-in-class compression ratios, reducing storage burden and ensuring blazing speed for even the most intensive ML workloads.

Use the tools you love

Leverage ClickHouse directly with your favorite ML tools. Our growing community of integrations includes popular machine learning frameworks, visualization tools, notebooks, and more.

Get started with ClickHouse Cloud for free

We'll get you started on a 30-day trial with $300 in credits to spend at your own pace.